1
Cantrell A, Booth A, Chambers D. A systematic review case study of urgent and emergency care configuration found citation searching of Web of Science and Google Scholar of similar value. Health Info Libr J 2024; 41:166-181. [PMID: 35289476 DOI: 10.1111/hir.12428] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/02/2020] [Revised: 09/29/2021] [Accepted: 10/05/2021] [Indexed: 02/05/2023]
Abstract
BACKGROUND Supplementary search methods, including citation searching, are essential if systematic reviews are to avoid producing biased conclusions. Little evidence exists on how to prioritise databases for citation searching or to establish whether using multiple sources is beneficial. OBJECTIVES A systematic review examining urgent and emergency care reconfiguration was used to investigate the utility of citation searching on Web of Science (WOS) and/or Google Scholar (GS). METHODS This case study investigated numbers of studies, additional studies and unique studies retrieved from both sources. In addition, the time to search, the ease of adding references to reference management software and obtaining abstracts of studies for screening are briefly considered. RESULTS WOS retrieved 62 references after deduplication of the results, 52 being additional references not retrieved during the database searching. GS retrieved 134 unique references with 63 additional references. WOS and GS retrieved the same three additional included studies. WOS was less time intensive to search given the facility to restrict to English language papers and availability of abstracts. CONCLUSIONS In a single systematic review case study, citation searching was required to identify all included studies. Citation searching on WOS is more efficient, where a subscription is available. Both databases identified the same studies but GS required additional time to remove non-English language studies and locate abstracts.
Affiliation(s)
- Anna Cantrell
  - Health Economics and Decision Science Section, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Andrew Booth
  - Health Economics and Decision Science Section, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Duncan Chambers
  - Public Health Section, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
2
Briscoe S, Rogers M. An alternative screening approach for Google Search identifies an accurate and manageable number of results for a systematic review (case study). Health Info Libr J 2024; 41:149-155. [PMID: 34734655 DOI: 10.1111/hir.12409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2021] [Revised: 08/02/2021] [Accepted: 10/13/2021] [Indexed: 11/30/2022]
Abstract
BACKGROUND A challenge when using Google Search to identify studies for a systematic review is managing the high number of results, which can number in the hundreds of thousands or even more. Studies and guidance on web searching suggest limiting the screening process, e.g. to the first 100 results. OBJECTIVES Our objective in this case study is to demonstrate an alternative approach to screening the results retrieved by Google Search, which is based on our experience that the viewable number of results is often far lower than the estimated number calculated by the search engine. METHODS We screened the results of three searches of Google Search using our approach, which involves increasing the number of results displayed per page from 10 to the maximum of 100. We then calculated the viewable number of results and compared this with the estimated number. RESULTS The mean of the estimated number of results for the three searches was 569,454,000. The mean of the viewable number of results was 463 (0.00008% of the mean of the estimated number of results). CONCLUSION Our findings challenge the commonly reported view that the number of results retrieved when using Google Search is too high to screen in full.
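The percentage reported in the RESULTS of the abstract above can be checked with a few lines of arithmetic. This is only a sketch using the two means quoted in the abstract; the variable names are illustrative and do not come from the paper.

```python
# Means quoted in the abstract above (averaged over the three searches).
estimated_mean = 569_454_000  # mean estimated number of results
viewable_mean = 463           # mean viewable number of results

# Share of the estimated results that could actually be viewed, in percent.
viewable_share_pct = viewable_mean / estimated_mean * 100

print(f"{viewable_share_pct:.5f}%")  # prints 0.00008%
```

Rounded to five decimal places, the ratio matches the 0.00008% given in the abstract.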
Affiliation(s)
- Simon Briscoe
  - University of Exeter Medical School, University of Exeter, Exeter, UK
- Morwenna Rogers
  - University of Exeter Medical School, University of Exeter, Exeter, UK
3
Morel T, Nguyen-Soenen J, Thompson W, Fournier JP. Development and validation of search filters to retrieve medication discontinuation articles in Medline and Embase. Health Info Libr J 2024; 41:156-165. [PMID: 38013506 DOI: 10.1111/hir.12516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 08/08/2023] [Accepted: 11/10/2023] [Indexed: 11/29/2023]
Abstract
BACKGROUND Medication discontinuation studies explore the outcomes of stopping a medication compared to continuing it. Comprehensively identifying medication discontinuation articles in bibliographic databases remains challenging due to variability in terminology. OBJECTIVES To develop and validate search filters to retrieve medication discontinuation articles in Medline and Embase. METHODS We identified medication discontinuation articles in a convenience sample of systematic reviews. We used primary articles to create two reference sets for Medline and Embase, respectively. The reference sets were equally divided by randomization into development sets and validation sets. Terms relevant for discontinuation were identified by term frequency analysis in development sets and combined to develop two search filters that maximized relative recalls. The filters were validated against validation sets. Relative recalls were calculated with their 95% confidence intervals (95% CI). RESULTS We included 316 articles for Medline and 407 articles for Embase, from 15 systematic reviews. The Medline optimized search filter combined 7 terms. The Embase optimized search filter combined 8 terms. The relative recalls were respectively 92% (95% CI: 87-96) and 91% (95% CI: 86-94). CONCLUSIONS We developed two search filters for retrieving medication discontinuation articles in Medline and Embase. Further research is needed to estimate precision and specificity of the filters.
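Relative recall, as used in the abstract above, is the share of a reference set that a search filter retrieves. The sketch below shows one way to compute it together with a 95% confidence interval. The counts (145 of 158) are illustrative and not taken from the paper, and the Wilson score interval is an assumption, since the abstract does not say which interval method the authors used.

```python
import math


def relative_recall(retrieved: int, total: int) -> float:
    """Fraction of reference-set articles retrieved by the filter."""
    if total <= 0 or not 0 <= retrieved <= total:
        raise ValueError("need 0 <= retrieved <= total and total > 0")
    return retrieved / total


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical validation set: 158 articles, 145 retrieved by the filter.
recall = relative_recall(145, 158)
low, high = wilson_interval(145, 158)
print(f"relative recall = {recall:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

A different interval method (for example an exact binomial interval) would give slightly different bounds, so the output should not be expected to reproduce the intervals reported in the abstract.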
Affiliation(s)
- Thomas Morel
  - Département de Médecine Générale, Nantes Université, Nantes, France
  - SPHERE-UMR INSERM 1246, Nantes Université, Université de Tours, Nantes, France
- Jérôme Nguyen-Soenen
  - Département de Médecine Générale, Nantes Université, Nantes, France
  - SPHERE-UMR INSERM 1246, Nantes Université, Université de Tours, Nantes, France
- Wade Thompson
  - Department of Anesthesiology, Pharmacology, and Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Jean-Pascal Fournier
  - Département de Médecine Générale, Nantes Université, Nantes, France
  - SPHERE-UMR INSERM 1246, Nantes Université, Université de Tours, Nantes, France
4
Jean-Pierre P, Nouri K. Skin of color representation of skin cancer is minimal on the internet: a Google images search analysis. Arch Dermatol Res 2024; 316:132. [PMID: 38662048 DOI: 10.1007/s00403-024-02894-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 03/20/2024] [Accepted: 04/16/2024] [Indexed: 04/26/2024]
Affiliation(s)
- Philippe Jean-Pierre
  - Phillip Frost Department of Dermatology and Cutaneous Surgery, University of Miami, 1150 NW 14th Street, Suite 500, Miami, FL, 33136, USA
- Keyvan Nouri
  - Phillip Frost Department of Dermatology and Cutaneous Surgery, University of Miami, 1150 NW 14th Street, Suite 500, Miami, FL, 33136, USA
5
Ahmad UF, Mahdee J, Abu Bakar N. Search engine optimisation (SEO) strategy as determinants to enhance the online brand positioning. F1000Res 2024; 11:714. [PMID: 38708191 PMCID: PMC11066527 DOI: 10.12688/f1000research.73382.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/19/2024] [Indexed: 05/07/2024] Open
Abstract
Background Marketers must adapt their online brand positioning strategies as changes to search engine algorithms affect how brands reach potential internet users. Brand owners realise that to remain relevant in the modern market, they need to transition to and focus more on the online market. However, many brand owners have ignored the power of search engine optimisation (SEO) strategy for attracting the online market, which is highly competitive and changes rapidly. A brand can be considered old-fashioned if it does not use SEO as a marketing strategy to penetrate the online marketplace. Various studies have analysed factors that can enhance the persistency of using the SEO strategy; however, gaps remain regarding the relationship of this strategy with online brand positioning. The main aim of this study was to identify the persistency of using the SEO strategy, including the niche point of differentiation, valuable content, targeted keywords and scalable link building, as the determinants that enhance the success of online brand positioning. Methods This study applies a quantitative design using an online survey to gather information from online business entrepreneurs. The survey questionnaire focused on the use of SEO as a new way to strategise online business. Results Based on the results of this study, most online entrepreneurs have somewhat realised the effects of using the SEO strategy to enhance the effectiveness of online brand positioning. Conclusion This research provides insights into the importance of SEO strategy in online business positioning. It is hoped that online entrepreneurs will consider the SEO strategy in the positioning of their brand in the marketplace. Implication This research focused on SEO as a new strategy to enhance brand positioning for online businesses. Future research may expand into other dimensions of business such as customer satisfaction and business performance.
Affiliation(s)
- Umar Faruq Ahmad
  - Corporate Communication Division, Ministry of Home Affairs, Federal Government Administrative Centre, Putrajaya, Wilayah Persekutuan, 62546, Malaysia
- Junainah Mahdee
  - Faculty of Management, Multimedia University, Cyberjaya, Selangor, 63100, Malaysia
6
Dabbous A, Horn M, Croutzet A. Measuring environmental awareness: An analysis using google search data. J Environ Manage 2023; 346:118984. [PMID: 37717397 DOI: 10.1016/j.jenvman.2023.118984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 08/23/2023] [Accepted: 09/09/2023] [Indexed: 09/19/2023]
Abstract
Environmental awareness is usually measured using surveys. This paper aims to offer an alternative measure: an Environmental Awareness Index (EAI) constructed using Google search data provided by Google Trends. The benefits of using Google search data over surveys are that (i) they are less costly to obtain, (ii) they are available at high frequency, and (iii) they cover countries where no surveys are available. To test the validity of the proposed EAI, this study empirically assesses the impact of the computed index on individuals' pro-environmental behaviors using the Special Eurobarometer: Attitudes of European citizens towards the Environment data. Results show that the EAI is positively related to pro-environmental behaviors with a statistical significance at the one percent level. This finding stays robust in pooled OLS as well as in panel regression analysis when GDP, mean years of schooling, and population are included as control variables and when time-fixed effects are introduced. Further, the results confirm that environmental awareness is not stable over time and underline the importance of having a timely measure of environmental awareness at hand. Finally, the findings offer several practical implications for managers and policymakers, who will be able to use a timely measure of environmental awareness, assess and measure the impact of their policies aiming to raise environmental awareness as well as depict the type of behavior influenced by their policies.
7
Sindhoo Z, Sindhoo S, Ghosh A, Kaur S, Soyiri I, Ahmadi K. Online Interest for Electronic Cigarettes Using Google Trends in the UK: A Correlation Analysis. Subst Use Misuse 2023; 58:1791-1797. [PMID: 37671780 DOI: 10.1080/10826084.2023.2247056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023]
Abstract
BACKGROUND Google Trends provides an easily accessible and cost-effective method of providing real-time insight into user interest. OBJECTIVE To address the gap in UK prevalence data for e-cigarettes by analyzing Google Trends to identify correlations with official data from Action on Smoking and Health. The study further evaluates Google Trends' sensitivity to real-time events and the ability of predictive models to forecast future data based on Google Trends. METHODS UK Google Trends data from 2012 to 2021 were analyzed to assess (a) the most popular electronic nicotine device terminology; (b) statistically significant points in time; (c) correlations between Relative Search Volumes and official reports on electronic cigarette use; and (d) whether Google Trends could predict future patterns in data. These were achieved using Locally Weighted Scatterplot Smoothing regression, the Pruned Exact Linear Time method, cross-correlation, and Autoregressive Integrated Moving Average (ARIMA) algorithms, respectively. RESULTS "Vape" was revealed to be the most popular electronic nicotine device terminology, with a correlation coefficient greater than +0.9 when compared to official electronic cigarette consumption data within a one-year timescale (lag 0). Results from ARIMA modeling were varied, with the algorithm's forecasted trend line occasionally lying outside of a 95% prediction interval. CONCLUSION Google Trends may correspond to population-based prevalence of electronic cigarette use. The changing trends coincide with changing policy decisions. Google Trends-based prediction for online interest in electronic cigarettes requires further validation, so it should currently be used in conjunction with other traditional methods of data collection.
Affiliation(s)
- Zainah Sindhoo
  - Lincoln Medical School, Universities of Nottingham and Lincoln, Nottingham, UK
- Abhishek Ghosh
  - Department of Psychiatry, Postgraduate Institute Medical Education and Research, Chandigarh, India
- Simranjit Kaur
  - Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, Punjab, India
- Keivan Ahmadi
  - Department of Primary Care and Public Health, School of Public Health, Faculty of Medicine, Imperial College London, NIHR Applied Research Collaboration Northwest London (ARC NWL), London, UK
8
Birklbauer MJ, Matzinger M, Müller F, Mechtler K, Dorfer V. MS Annika 2.0 Identifies Cross-Linked Peptides in MS2-MS3-Based Workflows at High Sensitivity and Specificity. J Proteome Res 2023; 22:3009-3021. [PMID: 37566781 PMCID: PMC10476269 DOI: 10.1021/acs.jproteome.3c00325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Indexed: 08/13/2023]
Abstract
Cross-linking mass spectrometry has become a powerful tool for the identification of protein-protein interactions and for gaining insight into the structures of proteins. We previously published MS Annika, a cross-linking search engine which can accurately identify cross-linked peptides in MS2 spectra from a variety of different MS-cleavable cross-linkers. In this publication, we present MS Annika 2.0, an updated version implementing a new search algorithm that, in addition to MS2 level, also supports the processing of data from MS2-MS3-based approaches for the identification of peptides from MS3 spectra, and introduces a novel scoring function for peptides identified across multiple MS stages. Detected cross-links are validated by estimating the false discovery rate (FDR) using a target-decoy approach. We evaluated the MS3-search-capabilities of MS Annika 2.0 on five different datasets covering a variety of experimental approaches and compared it to XlinkX and MaXLinker, two other cross-linking search engines. We show that MS Annika detects up to 4 times more true unique cross-links while simultaneously yielding fewer false positive hits and therefore a more accurate FDR estimation than the other two search engines. All mass spectrometry proteomics data along with result files have been deposited to the ProteomeXchange consortium via the PRIDE partner repository with the dataset identifier PXD041955.
Affiliation(s)
- Micha J. Birklbauer
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
- Manuel Matzinger
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
- Fränze Müller
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
- Karl Mechtler
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
  - Institute of Molecular Biotechnology (IMBA), Austrian Academy of Sciences, Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
  - Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Viktoria Dorfer
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
9
Heidt A. Artificial-intelligence search engines wrangle academic literature. Nature 2023; 620:456-457. [PMID: 37550446 DOI: 10.1038/d41586-023-01907-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/09/2023]
10
11
Nowatzky Y, Benner P, Reinert K, Muth T. Mistle: bringing spectral library predictions to metaproteomics with an efficient search index. Bioinformatics 2023; 39:btad376. [PMID: 37294786 PMCID: PMC10313348 DOI: 10.1093/bioinformatics/btad376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 05/11/2023] [Accepted: 06/08/2023] [Indexed: 06/11/2023] Open
Abstract
MOTIVATION Deep learning has moved to the forefront of tandem mass spectrometry-driven proteomics and authentic prediction for peptide fragmentation is more feasible than ever. Still, at this point spectral prediction is mainly used to validate database search results or for confined search spaces. Fully predicted spectral libraries have not yet been efficiently adapted to large search space problems that often occur in metaproteomics or proteogenomics. RESULTS In this study, we showcase a workflow that uses Prosit for spectral library predictions on two common metaproteomes and implement an indexing and search algorithm, Mistle, to efficiently identify experimental mass spectra within the library. Hence, the workflow emulates a classic protein sequence database search with protein digestion but builds a searchable index from spectral predictions as an in-between step. We compare Mistle to popular search engines, both on a spectral and database search level, and provide evidence that this approach is more accurate than a database search using MSFragger. Mistle outperforms other spectral library search engines in terms of run time and proves to be extremely memory efficient with a 4- to 22-fold decrease in RAM usage. This makes Mistle universally applicable to large search spaces, e.g. covering comprehensive sequence databases of diverse microbiomes. AVAILABILITY AND IMPLEMENTATION Mistle is freely available on GitHub at https://github.com/BAMeScience/Mistle.
Affiliation(s)
- Yannek Nowatzky
  - Section S.3 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin 12205, Germany
- Philipp Benner
  - Section S.3 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin 12205, Germany
- Knut Reinert
  - Department of Mathematics and Computer Science, FU Berlin, Berlin 14195, Germany
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
- Thilo Muth
  - Section S.3 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin 12205, Germany
12
Gusenbauer M. Audit AI search tools now, before they skew research. Nature 2023; 617:439. [PMID: 37193815 DOI: 10.1038/d41586-023-01613-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2023]
13
Gusenbauer M. A free online guide to researchers' best search options. Nature 2023; 615:586. [PMID: 36944742 DOI: 10.1038/d41586-023-00845-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/23/2023]
14
Cormican JA, Soh WT, Mishto M, Liepe J. iBench: A ground truth approach for advanced validation of mass spectrometry identification method. Proteomics 2023; 23:e2200271. [PMID: 36189881 PMCID: PMC10078205 DOI: 10.1002/pmic.202200271] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/30/2022] [Revised: 09/23/2022] [Accepted: 09/28/2022] [Indexed: 01/19/2023]
Abstract
The discovery of many noncanonical peptides detectable with sensitive mass spectrometry inside, outside, and on cells shepherded the development of novel methods for their identification, often not supported by a systematic benchmarking with other methods. We here propose iBench, a bioinformatic tool that can construct ground truth proteomics datasets and cognate databases, thereby generating a training court wherein methods, search engines, and proteomics strategies can be tested, and their performances estimated by the same tool. iBench can be coupled to the main database search engines, allows the selection of customized features of mass spectrometry spectra and peptides, provides standard benchmarking outputs, and is open source. The proof-of-concept application to tryptic proteome digestions, immunopeptidomes, and synthetic peptide libraries dissected the impact that noncanonical peptides could have on the identification of canonical peptides by Mascot search with rescoring via Percolator (Mascot+Percolator).
Affiliation(s)
- John A. Cormican
  - Max‐Planck‐Institute for Multidisciplinary Sciences (MPI‐NAT), Göttingen, Germany
- Wai Tuck Soh
  - Max‐Planck‐Institute for Multidisciplinary Sciences (MPI‐NAT), Göttingen, Germany
- Michele Mishto
  - Centre for Inflammation Biology and Cancer Immunology (CIBCI) & Peter Gorer Department of Immunobiology, King's College London, London, UK
  - The Francis Crick Institute, London, UK
- Juliane Liepe
  - Max‐Planck‐Institute for Multidisciplinary Sciences (MPI‐NAT), Göttingen, Germany
15
Kotelnikova E, Frahm KM, Shepelyansky DL, Kunduzova O. Fibrosis Protein-Protein Interactions from Google Matrix Analysis of MetaCore Network. Int J Mol Sci 2021; 23:67. [PMID: 35008491 PMCID: PMC8744902 DOI: 10.3390/ijms23010067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/20/2021] [Revised: 12/15/2021] [Accepted: 12/16/2021] [Indexed: 02/06/2023] Open
Abstract
Protein-protein interactions are a longstanding challenge in cardiac remodeling processes and heart failure. Here, we use the MetaCore network and the Google matrix algorithms for prediction of protein-protein interactions dictating cardiac fibrosis, a primary cause of end-stage heart failure. The developed algorithms allow identification of interactions between key proteins and predict new actors orchestrating fibroblast activation linked to fibrosis in mouse and human tissues. These data hold great promise for uncovering new therapeutic targets to limit myocardial fibrosis.
Affiliation(s)
- Klaus M. Frahm
  - Laboratoire de Physique Théorique, IRSAMC, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
- Dima L. Shepelyansky
  - Laboratoire de Physique Théorique, IRSAMC, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
- Oksana Kunduzova
  - National Institute of Health and Medical Research (INSERM) U1048, CEDEX 4, 31432 Toulouse, France
  - Institute of Metabolic and Cardiovascular Diseases, University of Toulouse, UPS, 31062 Toulouse, France
16
Arumugam A, Samara SS, Shalash RJ, Qadah RM, Farhani AM, Alnajim HM, Alkalih HY. Does Google Fit provide valid energy expenditure measurements of functional tasks compared to those of Fibion accelerometer in healthy individuals? A cross-sectional study. Diabetes Metab Syndr 2021; 15:102301. [PMID: 34592530 DOI: 10.1016/j.dsx.2021.102301] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/07/2021] [Revised: 09/13/2021] [Accepted: 09/21/2021] [Indexed: 11/21/2022]
Abstract
BACKGROUND AND AIMS Smartphone applications (e.g., Google Fit) may be a good alternative tool for accelerometers in estimating energy expenditure of physical activities because they are affordable, easy to use, and freely downloadable on smartphones. We aimed to determine the concurrent validity of the Fibion and Google Fit for measuring energy expenditure of functional tasks in healthy individuals. METHODS In this cross-sectional study, 28 healthy individuals (21.25 ± 1.84 years) performed certain tasks (lying, standing, 6-min walk test, treadmill walking, stair climbing and cycling) for ∼90 min, while wearing a Fibion accelerometer on their thigh and having the Google Fit application in a smartphone placed in their trouser pocket. Concurrent validity between the energy expenditure data of the Google Fit and Fibion was assessed using the Spearman rho correlation coefficient (data were not normally distributed), Bland-Altman plots and linear regression. RESULTS Neither energy expenditure for the whole duration nor for the tasks, except sitting + treadmill walking (r = 0.419, p = 0.027), showed significant correlations between the Google Fit and Fibion measurements. A proportional bias was evident for almost all comparisons. CONCLUSIONS The Google Fit did not provide valid energy expenditure measurements compared to the Fibion for most of the investigated tasks in healthy individuals.
Affiliation(s)
- Ashokan Arumugam
  - Department of Physiotherapy, College of Health Sciences, University of Sharjah, P.O. Box: 27272, Sharjah, United Arab Emirates
  - Neuromusculoskeletal Rehabilitation Research Group, RIMHS - Research Institute of Medical and Health Sciences, University of Sharjah, Sharjah, United Arab Emirates
  - Sustainable Engineering Asset Management Research Group, RISE - Research Institute of Sciences and Engineering, University of Sharjah, P.O. Box: 27272, Sharjah, United Arab Emirates
  - Adjunct Faculty, Department of Physiotherapy, Manipal College of Health Professions, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Sara Sabri Samara
  - Department of Physiotherapy, College of Health Sciences, University of Sharjah, P.O. Box: 27272, Sharjah, United Arab Emirates
- Reime Jamal Shalash
  - Department of Physiotherapy, College of Health Sciences, University of Sharjah, P.O. Box: 27272, Sharjah, United Arab Emirates
- Raneen Mohammed Qadah
  - Department of Physiotherapy, College of Health Sciences, University of Sharjah, P.O. Box: 27272, Sharjah, United Arab Emirates
- Amna Majid Farhani
  - Department of Physiotherapy, College of Health Sciences, University of Sharjah, P.O. Box: 27272, Sharjah, United Arab Emirates
- Hawra Mohammed Alnajim
  - Department of Physiotherapy, College of Health Sciences, University of Sharjah, P.O. Box: 27272, Sharjah, United Arab Emirates
- Hanan Youssef Alkalih
  - Department of Physiotherapy, College of Health Sciences, University of Sharjah, P.O. Box: 27272, Sharjah, United Arab Emirates
17
Burns CS, Nix T, Shapiro RM, Huber JT. MEDLINE search retrieval issues: A longitudinal query analysis of five vendor platforms. PLoS One 2021; 16:e0234221. [PMID: 33956834 PMCID: PMC8101950 DOI: 10.1371/journal.pone.0234221] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/19/2020] [Accepted: 03/28/2021] [Indexed: 11/18/2022] Open
Abstract
This study compared the results of data collected from a longitudinal query analysis of the MEDLINE database hosted on multiple platforms that include PubMed, EBSCOHost, Ovid, ProQuest, and Web of Science. The goal was to identify variations among the search results on the platforms after controlling for search query syntax. We devised twenty-nine cases of search queries, each comprising five semantically equivalent queries, to search against the five MEDLINE database platforms. We ran our queries monthly for a year and collected search result count data to observe changes. We found that search results varied considerably depending on MEDLINE platform. Variations were due to trends in scholarly publication, such as publishing individual papers online first versus complete issues. Other reasons were metadata differences in bibliographic records, differences in the levels of specificity of search fields provided by the platforms, and large fluctuations in monthly search results based on the same query. Database integrity and currency issues were observed as each platform updated its MEDLINE data throughout the year. Specific biomedical bibliographic databases are used to inform clinical decision-making, create systematic reviews, and construct knowledge bases for clinical decision support systems. They serve as essential information retrieval and discovery tools to help identify and collect research data and are used in a broad range of fields and as the basis of multiple research designs. This study should help clinicians, researchers, librarians, informationists, and others understand how these platforms differ and inform future work in their standardization.
Affiliation(s)
- C. Sean Burns
  - School of Information Science, University of Kentucky, Lexington, Kentucky, United States of America
- Tyler Nix
  - Taubman Health Sciences Library, University of Michigan, Ann Arbor, Michigan, United States of America
- Robert M. Shapiro
  - Robert M. Fales Health Sciences Library - SEAHEC Medical Library, South East Area Health Education Center, Wilmington, North Carolina, United States of America
- Jeffrey T. Huber
  - School of Information Science, University of Kentucky, Lexington, Kentucky, United States of America
|
18
|
Vaghela U, Rabinowicz S, Bratsos P, Martin G, Fritzilas E, Markar S, Purkayastha S, Stringer K, Singh H, Llewellyn C, Dutta D, Clarke JM, Howard M, Serban O, Kinross J. Using a Secure, Continually Updating, Web Source Processing Pipeline to Support the Real-Time Data Synthesis and Analysis of Scientific Literature: Development and Validation Study. J Med Internet Res 2021; 23:e25714. [PMID: 33835932 PMCID: PMC8104004 DOI: 10.2196/25714] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/12/2020] [Revised: 12/30/2020] [Accepted: 04/03/2021] [Indexed: 02/07/2023]
Abstract
BACKGROUND The scale and quality of the global scientific response to the COVID-19 pandemic have unquestionably saved lives. However, the COVID-19 pandemic has also triggered an unprecedented "infodemic"; the velocity and volume of data production have overwhelmed many key stakeholders such as clinicians and policy makers, as they have been unable to process structured and unstructured data for evidence-based decision making. Solutions that aim to alleviate this data synthesis-related challenge are unable to capture heterogeneous web data in real time for the production of concomitant answers and are not based on the high-quality information in responses to a free-text query. OBJECTIVE The main objective of this project is to build a generic, real-time, continuously updating curation platform that can support the data synthesis and analysis of a scientific literature framework. Our secondary objective is to validate this platform and the curation methodology for COVID-19-related medical literature by expanding the COVID-19 Open Research Dataset via the addition of new, unstructured data. METHODS To create an infrastructure that addresses our objectives, the PanSurg Collaborative at Imperial College London has developed a unique data pipeline based on a web crawler extraction methodology. This data pipeline uses a novel curation methodology that adopts a human-in-the-loop approach for the characterization of quality, relevance, and key evidence across a range of scientific literature sources. RESULTS REDASA (Realtime Data Synthesis and Analysis) is now one of the world's largest and most up-to-date sources of COVID-19-related evidence; it consists of 104,000 documents. By capturing curators' critical appraisal methodologies through the discrete labeling and rating of information, REDASA rapidly developed a foundational, pooled, data science data set of over 1400 articles in under 2 weeks. 
These articles provide COVID-19-related information and represent around 10% of all papers about COVID-19. CONCLUSIONS This data set can act as ground truth for the future implementation of a live, automated systematic review. The three benefits of REDASA's design are as follows: (1) it adopts a user-friendly, human-in-the-loop methodology by embedding an efficient, user-friendly curation platform into a natural language processing search engine; (2) it provides a curated data set in the JavaScript Object Notation format for experienced academic reviewers' critical appraisal choices and decision-making methodologies; and (3) due to the wide scope and depth of its web crawling method, REDASA has already captured one of the world's largest COVID-19-related data corpora for searches and curation.
Affiliation(s)
- Uddhav Vaghela
  - PanSurg Collaborative, Department of Surgery and Cancer, Imperial College London, London, United Kingdom
- Simon Rabinowicz
  - PanSurg Collaborative, Department of Surgery and Cancer, Imperial College London, London, United Kingdom
- Paris Bratsos
  - PanSurg Collaborative, Department of Surgery and Cancer, Imperial College London, London, United Kingdom
- Guy Martin
  - PanSurg Collaborative, Department of Surgery and Cancer, Imperial College London, London, United Kingdom
- Sheraz Markar
  - PanSurg Collaborative, Department of Surgery and Cancer, Imperial College London, London, United Kingdom
- Sanjay Purkayastha
  - PanSurg Collaborative, Department of Surgery and Cancer, Imperial College London, London, United Kingdom
- Jonathan M Clarke
  - PanSurg Collaborative, Department of Surgery and Cancer, Imperial College London, London, United Kingdom
- Ovidiu Serban
  - Data Science Institute, Imperial College London, London, United Kingdom
- James Kinross
  - PanSurg Collaborative, Department of Surgery and Cancer, Imperial College London, London, United Kingdom
|
19
|
Xue Y, Bao Y, Zhang Z, Zhao W, Xiao J, He S, Zhang G, Li Y, Zhao G, Chen R, Song S, Ma L, Zou D, Tian D, Li C, Zhu J, Gong Z, Chen M, Wang A, Ma Y, Li M, Teng X, Cui Y, Duan G, Zhang M, Jin T, Shi C, Du Z, Zhang Y, Liu C, Li R, Zeng J, Hao L, Jiang S, Chen H, Han D, Xiao J, Zhang Z, Zhao W, Xue Y, Bao Y, Zhang T, Kang W, Yang F, Qu J, Zhang W, Bao Y, Liu GH, Liu L, Zhang Y, Niu G, Zhu T, Feng C, Liu X, Zhang Y, Li Z, Chen R, Li Q, Teng X, Ma L, Hua Z, Tian D, Jiang C, Chen Z, He F, Zhao Y, Jin Y, Zhang Z, Huang L, Song S, Yuan Y, Zhou C, Xu Q, He S, Ye W, Cao R, Wang P, Ling Y, Yan X, Wang Q, Zhang G, Li Z, Liu L, Jiang S, Li Q, Feng C, Du Q, Ma L, Zong W, Kang H, Zhang M, Xiong Z, Li R, Huan W, Ling Y, Zhang S, Xia Q, Cao R, Fan X, Wang Z, Zhang G, Chen X, Chen T, Zhang S, Tang B, Zhu J, Dong L, Zhang Z, Wang Z, Kang H, Wang Y, Ma Y, Wu S, Kang H, Chen M, Li C, Tian D, Tang B, Liu X, Teng X, Song S, Tian D, Liu X, Li C, Teng X, Song S, Zhang Y, Zou D, Zhu T, Chen M, Niu G, Liu C, Xiong Y, Hao L, Niu G, Zou D, Zhu T, Shao X, Hao L, Li Y, Zhou H, Chen X, Zheng Y, Kang Q, Hao D, Zhang L, Luo H, Hao Y, Chen R, Zhang P, He S, Zou D, Zhang M, Xiong Z, Nie Z, Yu S, Li R, Li M, Li R, Bao Y, Xiong Z, Li M, Yang F, Ma Y, Sang J, Li Z, Li R, Tang B, Zhang X, Dong L, Zhou Q, Cui Y, Zhai S, Zhang Y, Wang G, Zhao W, Wang Z, Zhu Q, Li X, Zhu J, Tian D, Kang H, Li C, Zhang S, Song S, Li M, Zhao W, Yan J, Sang J, Zou D, Li C, Wang Z, Zhang Y, Zhu T, Song S, Wang X, Hao L, Liu Y, Wang Z, Luo H, Zhu J, Wu X, Tian D, Li C, Zhao W, Jing HC, Chen M, Zou D, Hao L, Zhao L, Wang J, Li Y, Song T, Zheng Y, Chen R, Zhao Y, He S, Zou D, Mehmood F, Ali S, Ali A, Saleem S, Hussain I, Abbasi AA, Ma L, Zou D, Zou D, Jiang S, Zhang Z, Jiang S, Zhao W, Xiao J, Bao Y, Zhang Z, Zuo Z, Ren J, Zhang X, Xiao Y, Li X, Zhang X, Xiao Y, Li X, Tu Y, Xue Y, Wu W, Ji P, Zhao F, Meng X, Chen M, Peng D, Xue Y, Luo H, Gao F, Zhang X, Xiao Y, Li X, Ning W, Xue Y, Lin S, Xue Y, Liu T, Guo AY, Yuan H, Zhang YE, Tan 
X, Xue Y, Zhang W, Xue Y, Xie Y, Ren J, Wang C, Xue Y, Liu CJ, Guo AY, Yang DC, Tian F, Gao G, Tang D, Xue Y, Yao L, Xue Y, Cui Q, An NA, Li CY, Luo X, Ren J, Zhang X, Xiao Y, Li X. Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2021. Nucleic Acids Res 2021; 49:D18-D28. [PMID: 33175170 PMCID: PMC7779035 DOI: 10.1093/nar/gkaa1022] [Citation(s) in RCA: 135] [Impact Index Per Article: 45.0] [Received: 09/14/2020] [Revised: 10/13/2020] [Accepted: 10/16/2020] [Indexed: 12/20/2022]
Abstract
The National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), provides a suite of database resources to support worldwide research activities in both academia and industry. With the explosive growth of multi-omics data, CNCB-NGDC is continually expanding, updating and enriching its core database resources through big data deposition, integration and translation. In the past year, considerable efforts have been devoted to 2019nCoVR, a newly established resource providing a global landscape of SARS-CoV-2 genomic sequences, variants, and haplotypes, as well as Aging Atlas, BrainBase, GTDB (Glycosyltransferases Database), LncExpDB, and TransCirc (Translation potential for circular RNAs). Meanwhile, a series of resources have been updated and improved, including BioProject, BioSample, GWH (Genome Warehouse), GVM (Genome Variation Map), GEN (Gene Expression Nebulas) as well as several biodiversity and plant resources. Particularly, BIG Search, a scalable, one-stop, cross-database search engine, has been significantly updated by providing easy access to a large number of internal and external biological resources from CNCB-NGDC, our partners, EBI and NCBI. All of these resources along with their services are publicly accessible at https://bigd.big.ac.cn.
|
20
|
Penlington M, Silverman H, Vasudevan A, Pavithran P. Plain Language Summaries of Clinical Trial Results: A Preliminary Study to Assess Availability of Easy-to-Understand Summaries and Approaches to Improving Public Engagement. Pharmaceut Med 2020; 34:401-406. [PMID: 33113147 PMCID: PMC7744300 DOI: 10.1007/s40290-020-00359-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 09/26/2020] [Indexed: 11/30/2022]
Abstract
BACKGROUND Easy-to-understand, stand-alone factual summaries of clinical trial results have the potential to improve public understanding of and engagement with pharmaceutical research. The European Clinical Trial Regulation (EU) No. 536/2014 is a major regulatory initiative that will result in a large number of such plain language summaries (PLSs) posted in the public domain. Today, however, little is known about the extent to which PLSs are written and are available to the general public. OBJECTIVES This preliminary study assessed (i) 20 top pharmaceutical companies' positions on improving transparency and commitment to disclosing trial result summaries in an easy-to-understand format and (ii) the availability of such summaries in the public domain and the ease of locating them via general web searches. METHODS The availability of PLSs in the public domain was estimated based on the number of EudraCT technical result summaries in four disease areas: chronic obstructive pulmonary disease, asthma, meningitis, and influenza. The likelihood of PLSs being easy to find through internet search engine queries by members of the public was assessed using Google. RESULTS All 20 sponsors had committed to improve clinical trial transparency, 17 committed to sharing PLSs with trial participants, and 14 had at least one PLS available in the public domain. A total of 99 clinical studies in these four disease areas had technical summaries posted on EudraCT between 1 January 2017 and 30 June 2020. Of these 99, 14 studies had PLSs in the public domain. A total of 12 of 14 PLSs were directly captured by search engine. However, the sponsor trial identifier or EudraCT number had to be included in the search term to locate them. Generic search terms resulted in large volumes of non-relevant results. CONCLUSION Despite the progressive movement towards clinical trial transparency, easily accessible PLSs on clinical trials are currently scarce. 
The provision of a European mandate and framework for non-technical result summaries by Regulation (EU) 536/2014 will be a major step to bring about positive change.
|
21
|
Asseo K, Fierro F, Slavutsky Y, Frasnelli J, Niv MY. Tracking COVID-19 using taste and smell loss Google searches is not a reliable strategy. Sci Rep 2020; 10:20527. [PMID: 33239650 PMCID: PMC7689487 DOI: 10.1038/s41598-020-77316-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/30/2020] [Accepted: 10/29/2020] [Indexed: 02/06/2023]
Abstract
Web search tools are widely used by the general public to obtain health-related information, and analysis of search data is often suggested for public health monitoring. We analyzed popularity of searches related to smell loss and taste loss, recently listed as symptoms of COVID-19. Searches on sight loss and hearing loss, which are not considered as COVID-19 symptoms, were used as control. Google Trends results per region in Italy or state in the US were compared to COVID-19 incidence in the corresponding geographical areas. The COVID-19 incidence did not correlate with searches for non-symptoms, but in some weeks had high correlation with taste and smell loss searches, which also correlated with each other. Correlation of the sensory symptoms with new COVID-19 cases for each country as a whole was high at some time points, but decreased (Italy) or dramatically fluctuated over time (US). Smell loss searches correlated with the incidence of media reports in the US. Our results show that popularity of symptom searches is not reliable for pandemic monitoring. Awareness of this limitation is important during the COVID-19 pandemic, which continues to spread and to exhibit new clinical manifestations, and for potential future health threats.
Affiliation(s)
- Kim Asseo
  - The Institute of Biochemistry, Food Science and Nutrition, The Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Fabrizio Fierro
  - The Institute of Biochemistry, Food Science and Nutrition, The Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Yuli Slavutsky
  - Department of Statistics and Data Science, The Hebrew University of Jerusalem, Rehovot, Israel
- Johannes Frasnelli
  - Department of Anatomy, University of Québec in Trois-Rivières, Trois-Rivières, QC, Canada
- Masha Y Niv
  - The Institute of Biochemistry, Food Science and Nutrition, The Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
|
22
|
Cheng L, Zhou X, Wang F, Xiao L. A State-Level Analysis of Mortality and Google Searches for Pornography: Insight from Life History Theory. Arch Sex Behav 2020; 49:3005-3011. [PMID: 32601838 DOI: 10.1007/s10508-020-01765-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/26/2018] [Revised: 06/04/2020] [Accepted: 06/06/2020] [Indexed: 06/11/2023]
Abstract
Due to the widespread popularity of pornography, some studies explored which individual factors are associated with the frequency of pornography use. However, knowledge about the relationship between socioecological environment and pornography consumption remains scant. Based on life history theory, the current research investigated the association between state-level mortality and search interest for pornography using Google trends. We observed that, in the U.S., the higher mortality or violent crime rate in a state, the stronger search interest for pornography on Google. The results expand the literature regarding the relationship between socioecological environment and individuals' online sexual behavior at the state level.
Affiliation(s)
- Lei Cheng
  - Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Faculty of Psychology, Beijing Normal University, Beijing, China
  - School of Psychology, Beijing Normal University, Beijing, 100875, China
- Xuan Zhou
  - Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Faculty of Psychology, Beijing Normal University, Beijing, China
  - School of Psychology, Beijing Normal University, Beijing, 100875, China
- Fang Wang
  - Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Faculty of Psychology, Beijing Normal University, Beijing, China
  - School of Psychology, Beijing Normal University, Beijing, 100875, China
- Lijuan Xiao
  - Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Faculty of Psychology, Beijing Normal University, Beijing, China
  - School of Psychology, Beijing Normal University, Beijing, 100875, China
|
23
|
Birnbaum ML, Wen H, Van Meter A, Ernala SK, Rizvi AF, Arenare E, Estrin D, De Choudhury M, Kane JM. Identifying emerging mental illness utilizing search engine activity: A feasibility study. PLoS One 2020; 15:e0240820. [PMID: 33064759 PMCID: PMC7567375 DOI: 10.1371/journal.pone.0240820] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/07/2020] [Accepted: 10/04/2020] [Indexed: 11/18/2022]
Abstract
Mental illness often emerges during the formative years of adolescence and young adult development and interferes with the establishment of healthy educational, vocational, and social foundations. Despite the severity of symptoms and decline in functioning, the time between illness onset and receiving appropriate care can be lengthy. A method by which to objectively identify early signs of emerging psychiatric symptoms could improve early intervention strategies. We analyzed a total of 405,523 search queries from 105 individuals with schizophrenia spectrum disorders (SSD, N = 36), non-psychotic mood disorders (MD, N = 38) and healthy volunteers (HV, N = 31) utilizing one year's worth of data prior to the first psychiatric hospitalization. Across 52 weeks, we found significant differences in the timing (p<0.05) and frequency (p<0.001) of searches between individuals with SSD and MD compared to HV up to a year in advance of the first psychiatric hospitalization. We additionally identified significant linguistic differences in search content among the three groups including use of words related to sadness and perception, use of first and second person pronouns, and use of punctuation (all p<0.05). In the weeks before hospitalization, both participants with SSD and MD displayed significant shifts in search timing (p<0.05), and participants with SSD displayed significant shifts in search content (p<0.05). Our findings demonstrate promise for utilizing personal patterns of online search activity to inform clinical care.
Affiliation(s)
- Michael L. Birnbaum
  - The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States of America
  - The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
  - The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States of America
- Hongyi Wen
  - Cornell Tech, Cornell University, New York, NY, United States of America
- Anna Van Meter
  - The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States of America
  - The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
  - The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States of America
- Sindhu K. Ernala
  - Georgia Institute of Technology, Atlanta, GA, United States of America
- Asra F. Rizvi
  - The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States of America
  - The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
  - The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States of America
- Elizabeth Arenare
  - The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States of America
  - The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
  - The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States of America
- Deborah Estrin
  - Cornell Tech, Cornell University, New York, NY, United States of America
- John M. Kane
  - The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States of America
  - The Feinstein Institute for Medical Research, Manhasset, NY, United States of America
  - The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States of America
|
24
|
Pawar AS, Nagpal S, Pawar N, Lerman LO, Eirin A. General Public's Information-Seeking Patterns of Topics Related to Obesity: Google Trends Analysis. JMIR Public Health Surveill 2020; 6:e20923. [PMID: 32633725 PMCID: PMC7448178 DOI: 10.2196/20923] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/01/2020] [Revised: 06/25/2020] [Accepted: 07/07/2020] [Indexed: 01/20/2023]
Abstract
BACKGROUND Obesity is a major public health challenge, and recent literature sheds light on the concept of "normalization" of obesity. OBJECTIVE We aimed to study the worldwide pattern of web-based information seeking by public on obesity and on its related terms and topics using Google Trends. METHODS We compared the relative frequency of obesity-related search terms and topics between 2004 and 2019 on Google Trends. The mean relative interest scores for these terms over the 4-year quartiles were compared. RESULTS The mean relative interest score of the search term "obesity" consistently decreased with time in all four quartiles (2004-2019), whereas the relative interest scores of the search topics "weight loss" and "abdominal obesity" increased. The topic "weight loss" was popular during the month of January, and its median relative interest score for January was higher than that for other months for the entire study period (P<.001). The relative interest score for the search term "obese" decreased over time, whereas those scores for the terms "body positivity" and "self-love" increased after 2013. CONCLUSIONS Despite a worldwide increase in the prevalence of obesity, its popularity as an internet search term diminished over time. The reason for peaks in months should be explored and applied to the awareness campaigns for better effectiveness. These patterns suggest normalization of obesity in society and a rise of public curiosity about image-related obesity rather than its medical implications and harm.
Affiliation(s)
- Aditya S Pawar
  - Division of Nephrology, Mayo Clinic, Rochester, MN, United States
- Sajan Nagpal
  - Division of Gastroenterology, University of Chicago, Chicago, IL, United States
- Neha Pawar
  - Department of Anesthesiology and Perioperative Medicine, University of Rochester Medical Center, Rochester, NY, United States
- Lilach O Lerman
  - Division of Nephrology, Mayo Clinic, Rochester, MN, United States
- Alfonso Eirin
  - Division of Nephrology, Mayo Clinic, Rochester, MN, United States
|
25
|
Agten A, Van Houtven J, Askenazi M, Burzykowski T, Laukens K, Valkenborg D. Visualizing the agreement of peptide assignments between different search engines. J Mass Spectrom 2020; 55:e4471. [PMID: 31713933 DOI: 10.1002/jms.4471] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/04/2019] [Revised: 10/23/2019] [Accepted: 10/28/2019] [Indexed: 06/10/2023]
Abstract
There is a trend in the analysis of shotgun proteomics data that aims to combine information from multiple search engines to increase the number of peptide annotations in an experiment. Typically, the degree of search engine complementarity and search engine agreement is visually illustrated by means of Venn diagrams that present the findings of a database search on the level of the nonredundant peptide annotations. We argue this practice to be not fit-for-purpose since the diagrams do not take into account and often conceal the information on complementarity and agreement at the level of the spectrum identification. We promote a new type of visualization that provides insight on the peptide sequence agreement at the level of the peptide-spectrum match (PSM) as a measure of consensus between two search engines with nominal outcomes. We applied the visualizations and percentage sequence agreement to an in-house data set of our benchmark organism, Caenorhabditis elegans, and illustrated that when assessing the agreement between search engine, one should disentangle the notion of PSM confidence and PSM identity. The visualizations presented in this manuscript provide a more informative assessment of pairs of search engines and are made available as an R function in the Supporting Information.
Affiliation(s)
- Annelies Agten
  - Interuniversity Institute of Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
- Joris Van Houtven
  - Interuniversity Institute of Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
  - UA-VITO Center for Proteomics, University of Antwerp, Antwerp, Belgium
  - Applied Bio and Molecular Systems, Flemish Institute for Technological Research (VITO), Mol, Belgium
- Tomasz Burzykowski
  - Interuniversity Institute of Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
- Kris Laukens
  - Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium
- Dirk Valkenborg
  - Interuniversity Institute of Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
  - UA-VITO Center for Proteomics, University of Antwerp, Antwerp, Belgium
  - Applied Bio and Molecular Systems, Flemish Institute for Technological Research (VITO), Mol, Belgium
|
26
|
Abstract
INTRODUCTION The Internet is a widely used resource for obtaining medical information. However, the quality of information on online platforms is still debated. Our goal in this quality-controlled WebSurg® and YouTube®-based study was to compare these two online video platforms in terms of the accuracy and quality of information about sleeve gastrectomy videos. METHODS Most viewed (popular) videos returned by YouTube® search engine in response to the keyword "sleeve gastrectomy" were included in the study. The educational accuracy and quality of the videos were evaluated according to known scoring systems. A novel scoring system measured technical quality. The ten most viewed (popular) videos in WebSurg® in response to the keyword "sleeve gastrectomy" were compared with ten YouTube® videos with the highest educational/technical scores. RESULTS Scoring systems measuring the educational accuracy and quality of WebSurg® videos were significantly higher than ten YouTube® videos which have the most top technical scores (p < 0.05), and no significant difference was found in the assessment of ten YouTube® videos that have the highest technical ratings compared with WebSurg® videos (p = 0.481). CONCLUSIONS WebSurg® videos, which were passed through a reviewing process and were mostly prepared by academicians, remained below the expected quality. The main limitation of WebSurg® and YouTube® is the lack of information on preoperative and postoperative processes.
Affiliation(s)
- Murat Ferhat Ferhatoglu
  - Faculty of Medicine, Department of General Surgery, Okan University, Aydinli Yolu Caddesi, Istanbul, Turkey
- Abdulcabbar Kartal
  - Faculty of Medicine, Department of General Surgery, Okan University, Aydinli Yolu Caddesi, Istanbul, Turkey
- Ali İlker Filiz
  - Faculty of Medicine, Department of General Surgery, Okan University, Aydinli Yolu Caddesi, Istanbul, Turkey
- Abut Kebudi
  - Faculty of Medicine, Department of General Surgery, Okan University, Aydinli Yolu Caddesi, Istanbul, Turkey
|
27
|
Verheggen K, Raeder H, Berven FS, Martens L, Barsnes H, Vaudel M. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows. Mass Spectrom Rev 2020; 39:292-306. [PMID: 28902424 DOI: 10.1002/mas.21543] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/06/2016] [Accepted: 07/05/2017] [Indexed: 06/07/2023]
Abstract
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.
Affiliation(s)
- Kenneth Verheggen
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Helge Raeder
  - KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
  - Department of Pediatrics, Haukeland University Hospital, Bergen, Norway
- Frode S Berven
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Harald Barsnes
  - KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Norway
- Marc Vaudel
  - KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
  - Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
|
28
|
Abstract
BACKGROUND Post-database searching is a key step in peptide identification with tandem mass spectrometry (MS/MS): it refines the peptide-spectrum matches (PSMs) generated by database search engines. Although many statistical and machine learning-based methods have been developed to improve the accuracy of peptide identification, large-scale datasets and datasets with an unbalanced distribution of PSMs remain challenging. A more efficient learning strategy is required to improve the accuracy of peptide identification on such datasets. While complex learning models have greater classification power, they may overfit and are computationally expensive on large-scale datasets. Kernel methods map data from the sample space to high-dimensional spaces where data relationships can be simplified for modeling. RESULTS To tackle the computational challenge of using a kernel-based learning model for practical peptide identification problems, we present an online learning algorithm, OLCS-Ranker, which iteratively feeds only one training sample into the learning model at each round; as a result, the memory requirement for computation is significantly reduced. We also propose a cost-sensitive learning model for OLCS-Ranker that assigns a larger loss to decoy PSMs than to target PSMs in the loss function. CONCLUSIONS The new model can reduce its false discovery rate on datasets with an unbalanced distribution of PSMs. Experimental studies show that OLCS-Ranker outperforms other methods in terms of accuracy and stability, especially on datasets with an unbalanced distribution of PSMs. Furthermore, OLCS-Ranker is 15-85 times faster than CRanker.
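The two ideas named in this abstract (one training sample per round, and a larger loss for decoy PSMs) can be illustrated with a small sketch. This is not OLCS-Ranker itself: it uses a plain linear model with a hinge loss instead of a kernel model, and the function name, the cost values, and the toy feature vectors are all assumptions made for illustration.

```python
def train_online(samples, epochs=5, lr=0.1, decoy_cost=2.0, target_cost=1.0):
    """Online, cost-sensitive hinge-loss learner (illustrative only).

    samples: list of (features, label), label +1 for a target PSM and
    -1 for a decoy PSM. The cost values are hypothetical.
    """
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:  # one sample per round: memory use stays constant
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # the hinge loss is active for this sample
                cost = decoy_cost if y < 0 else target_cost
                step = lr * cost * y  # decoy mistakes move the model further
                w = [wi + step * xi for wi, xi in zip(w, x)]
                b += step
    return w, b


def score(w, b, x):
    """Linear score; positive values are treated as accepted PSMs."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Toy PSM feature vectors (made up for the example).
toy = [((2.0, 1.0), 1), ((1.5, 2.0), 1), ((-1.0, -1.5), -1), ((-2.0, -0.5), -1)]
w, b = train_online(toy)
```

Raising `decoy_cost` relative to `target_cost` pushes the decision boundary away from decoys, which is the general mechanism by which a cost-sensitive loss can trade a few lost targets for fewer accepted decoys.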
Affiliation(s)
- Xijun Liang
  - College of Science, China University of Petroleum, Changjiang West Road, Qingdao, 266580 China
- Zhonghang Xia
  - School of Engineering and Applied Science, Western Kentucky University, Bowling Green, 42101 KY USA
- Ling Jian
  - School of Economics and Management, China University of Petroleum, Changjiang West Road, Qingdao, 266580 China
- Yongxiang Wang
  - College of Science, China University of Petroleum, Changjiang West Road, Qingdao, 266580 China
- Xinnan Niu
  - Dept. of Pathology, Microbiology and Immunology, Vanderbilt University School of Medicine, Nashville, 37232 TN USA
- Andrew J. Link
  - Dept. of Pathology, Microbiology and Immunology, Vanderbilt University School of Medicine, Nashville, 37232 TN USA
|
29
|
Chen T, Gentry S, Qiu D, Deng Y, Notley C, Cheng G, Song F. Online Information on Electronic Cigarettes: Comparative Study of Relevant Websites From Baidu and Google Search Engines. J Med Internet Res 2020; 22:e14725. [PMID: 32012069 PMCID: PMC7007591 DOI: 10.2196/14725] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/16/2019] [Revised: 10/16/2019] [Accepted: 12/19/2019] [Indexed: 11/13/2022]
Abstract
BACKGROUND Online information on electronic cigarettes (e-cigarettes) may influence people's perception and use of e-cigarettes. Websites with information on e-cigarettes in the Chinese language have not been systematically assessed. OBJECTIVE The aim of this study was to assess and compare the types and credibility of Web-based information on e-cigarettes identified from Google (in English) and Baidu (in Chinese) search engines. METHODS We used the keywords vaping or e-cigarettes to conduct a search on Google and the equivalent Chinese characters for Baidu. The first 50 unique and relevant websites from each of the two search engines were included in this analysis. The main characteristics of the websites, credibility of the websites, and claims made on the included websites were systematically assessed and compared. RESULTS Compared with websites on Google, more websites on Baidu were owned by manufacturers or retailers (15/50, 30% vs 33/50, 66%; P<.001). None of the Baidu websites, compared to 24% (12/50) of Google websites, were provided by public or health professional institutions. The Baidu websites were more likely to contain e-cigarette advertising (P<.001) and less likely to provide information on health education (P<.001). The overall credibility of the included Baidu websites was lower than that of the Google websites (P<.001). An age restriction warning was shown on all advertising websites from Google (15/15) but only on 10 of the 33 (30%) advertising websites from Baidu (P<.001). Conflicting or unclear health and social claims were common on the included websites. CONCLUSIONS Although conflicting or unclear claims on e-cigarettes were common on websites from both Baidu and Google search engines, there was a lack of online information from public health authorities in China. 
Unbiased information and evidence-based recommendations on e-cigarettes should be provided by public health authorities to help the public make informed decisions regarding the use of e-cigarettes.
Affiliation(s)
- Ting Chen
  - School of Public Health, Hubei Provincial Key Laboratory of Occupational Hazard Identification & Control, Wuhan University of Science & Technology, Wuhan, China
- Sarah Gentry
  - Norwich Medical School, University of East Anglia, Norwich, United Kingdom
- Dechao Qiu
  - School of Public Health, Hubei Provincial Key Laboratory of Occupational Hazard Identification & Control, Wuhan University of Science & Technology, Wuhan, China
- Yan Deng
  - School of Public Health, Hubei Provincial Key Laboratory of Occupational Hazard Identification & Control, Wuhan University of Science & Technology, Wuhan, China
- Caitlin Notley
  - Norwich Medical School, University of East Anglia, Norwich, United Kingdom
- Guangwen Cheng
  - School of Public Health, Hubei Provincial Key Laboratory of Occupational Hazard Identification & Control, Wuhan University of Science & Technology, Wuhan, China
- Fujian Song
  - Norwich Medical School, University of East Anglia, Norwich, United Kingdom
|
30
|
Abstract
The VAST+ algorithm is an efficient, simple, and elegant solution to the problem of comparing the atomic structures of biological assemblies. Given two protein assemblies, it takes as input all the pairwise structural alignments of the component proteins. It then clusters the rotation matrices from the pairwise superpositions, with the clusters corresponding to subsets of the two assemblies that may be aligned and well superposed. It uses the Vector Alignment Search Tool (VAST) protein-protein comparison method for the input structural alignments, but other methods could be used, as well. From a chosen cluster, an "original" alignment for the assembly may be defined by simply combining the relevant input alignments. However, it is often useful to reduce/trim the original alignment, using a Monte Carlo refinement algorithm, which allows biologically relevant conformational differences to be more readily detected and observed. The method is easily extended to include RNA or DNA molecules. VAST+ results may be accessed via the URL https://www.ncbi.nlm.nih.gov/Structure , then entering a PDB accession or terms in the search box, and using the link [VAST+] in the upper right corner of the Structure Summary page.
Affiliation(s)
- Thomas Madej
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Aron Marchler-Bauer
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Christopher Lanczycki
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Dachuan Zhang
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Stephen H Bryant
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
31
|
Gillum S, Williams N, Brink B, Ross E. Clinician Job Searches in the Internet Era: Internet-Based Study. J Med Internet Res 2019; 21:e12638. [PMID: 31278735 PMCID: PMC6640069 DOI: 10.2196/12638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2018] [Revised: 05/20/2019] [Accepted: 05/20/2019] [Indexed: 11/13/2022]
Abstract
Background Traditional methods using print media and commercial firms for clinician recruiting are often limited by cost, slow pace, and suboptimal results. An efficient and fiscally sound approach is needed for searching online to recruit clinicians. Objective The aim of the study was to assess the Web-based methods by which clinicians might be searching for jobs in a broad range of specialties and how academic medical centers can advertise clinical job openings to prominently appear on internet searches that would yield the greatest return on investment. Methods We used a search engine (Google) to identify 8 query terms for each of the specialties and specialists (eg, dermatology and dermatologist) to determine internet job search methodologies for 12 clinical disciplines. Searches were conducted, and the data used for analysis were the first 20 results. Results In total, 176 searches were conducted at varying times over the course of several months, and 3520 results were recorded. The following 4 types of websites appeared in the top 10 search results across all specialties searched, accounting for 52.27% (920/1760) of the results: (1) a single no-cost job aggregator (229/1760, 13.01%); (2) 2 prominent journal-based paid digital job listing services (157/1760, 8.92% and 91/1760, 5.17%, respectively); (3) a fee-based Web-based agency (137/1760, 7.78%) offering candidate profiles; and (4) society-based paid advertisements (totaling 306/1760, 17.38%). These sites accounted for 75.45% (664/880) of results limited to the top 5 results. Repetitive short-term testing yielded similar results with minor changes in the rank order. Conclusions On the basis of our findings, we offer a specific financially prudent internet strategy for both clinicians searching the internet for employment and employers hiring clinicians in academic medical centers.
Affiliation(s)
- Shalu Gillum
  - College of Medicine, University of Central Florida, Orlando, FL, United States
- Natasha Williams
  - College of Medicine, University of Central Florida, Orlando, FL, United States
- Brittany Brink
  - College of Medicine, University of Central Florida, Orlando, FL, United States
- Edward Ross
  - College of Medicine, University of Central Florida, Orlando, FL, United States
|
32
|
Alzu'bi AA, Zhou L, Watzlaf VJM. Genetic Variations and Precision Medicine. Perspect Health Inf Manag 2019; 16:1a. [PMID: 31019429 PMCID: PMC6462879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
The time and costs associated with the sequencing of a human genome have decreased significantly in recent years. Many people have chosen to have their genomes sequenced to receive genomics-based personalized healthcare services. To reach the goal of genomics-based precision medicine, health information management (HIM) professionals need to manage and analyze patients' genomic data. Two important pieces of information from the genome sequence are the risk of genetic diseases and the specific medication or pharmacogenomic results for the individual patient, both of which are linked to a patient's genetic variations. In this review article, we introduce genetic variations, including their data types, relevant databases, and some currently available analysis methods and systems. HIM professionals can choose to use these databases, methods, and systems in the management and analysis of patients' genomic data.
Affiliation(s)
- Amal Adel Alzu'bi
  - The Department of Computer Information Systems at Jordan University of Science and Technology in Irbid, Jordan
- Leming Zhou
  - The Department of Health Information Management at the University of Pittsburgh in Pittsburgh, PA
- Valerie J M Watzlaf
  - The Department of Health Information Management at the University of Pittsburgh in Pittsburgh, PA
|
33
|
Helmers L, Horn F, Biegler F, Oppermann T, Müller KR. Automating the search for a patent's prior art with a full text similarity search. PLoS One 2019; 14:e0212103. [PMID: 30830911 PMCID: PMC6398827 DOI: 10.1371/journal.pone.0212103] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 02/09/2018] [Accepted: 01/28/2019] [Indexed: 11/18/2022]
Abstract
More than ever, technical inventions are the symbol of our society’s advance. Patents guarantee their creators protection against infringement. For an invention to be patentable, its novelty and inventiveness have to be assessed. Therefore, a search for published work that describes inventions similar to a given patent application needs to be performed. Currently, this so-called search for prior art is executed with semi-automatically composed keyword queries, which is not only time consuming, but also prone to errors. In particular, errors may systematically arise from the fact that different keywords for the same technical concepts may exist across disciplines. In this paper, a novel approach is proposed, where the full text of a given patent application is compared to existing patents using machine learning and natural language processing techniques to automatically detect inventions that are similar to the one described in the submitted document. Various state-of-the-art approaches for feature extraction and document comparison are evaluated. In addition, the quality of the current search process is assessed based on ratings of a domain expert. The evaluation results show that our automated approach, besides accelerating the search process, also improves the quality of the search results for prior art.
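A full-text similarity search of the kind described here can be sketched with a generic TF-IDF and cosine-similarity baseline. The abstract does not say which feature extraction method performed best, so this is only one plausible instance of the idea, not the authors' pipeline; the function names and the toy documents are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn whitespace-tokenised documents into sparse TF-IDF dictionaries."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # smoothed inverse document frequency, so no weight is zero
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_prior_art(application, corpus):
    """Return (similarity, corpus index) pairs, most similar first."""
    vectors = tfidf_vectors(corpus + [application])
    query = vectors[-1]
    scores = [(cosine(query, vec), idx) for idx, vec in enumerate(vectors[:-1])]
    return sorted(scores, reverse=True)


# Toy "patents" (invented text, for illustration only).
corpus = [
    "a wheel with rubber tyre for a bicycle",
    "method for brewing coffee with hot water",
    "bicycle wheel rim and tyre assembly",
]
ranking = rank_prior_art("tyre for bicycle wheel", corpus)
```

Because the whole text is compared rather than a hand-built keyword query, documents that share many terms with the application rank highly without an examiner having to choose the keywords first.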
Affiliation(s)
- Lea Helmers
  - Machine Learning Group, Technische Universität Berlin, Berlin, Germany
- Franziska Horn
  - Machine Learning Group, Technische Universität Berlin, Berlin, Germany
  - * E-mail: (FH); (KRM)
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, Berlin, Germany
  - Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
  - Max-Planck-Institut für Informatik, Saarbrücken, Germany
  - * E-mail: (FH); (KRM)
|
34
|
Ćurković M, Košec A. Bubble effect: including internet search engines in systematic reviews introduces selection bias and impedes scientific reproducibility. BMC Med Res Methodol 2018; 18:130. [PMID: 30424741 PMCID: PMC6234590 DOI: 10.1186/s12874-018-0599-2] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 03/03/2018] [Accepted: 10/30/2018] [Indexed: 11/10/2022]
Abstract
BACKGROUND Using internet search engines (such as Google search) in systematic literature reviews is increasingly becoming a ubiquitous part of search methodology. In order to integrate the vast quantity of available knowledge, literature mostly focuses on systematic reviews, considered to be principal sources of scientific evidence at all practical levels. Any possible individual methodological flaws present in these systematic reviews have the potential to become systemic. MAIN TEXT This particular bias, which could be referred to as the (re)search bubble effect, is introduced because of the inherent, personalized nature of internet search engines, which tailor results according to derived user preferences based on unreproducible criteria. In other words, internet search engines adjust to their users' beliefs and attitudes, leading to the creation of a personalized (re)search bubble, including entries that have not been subjected to a rigorous peer review process. The internet search engine algorithms are in a state of constant flux, producing differing results at any given moment, even if the query remains identical. There are many more subtle ways of introducing unwanted variations and synonyms of search queries that are used autonomously, detached from user insight and intent. Even the most well-known and respected systematic literature reviews do not seem immune to the negative implications of the search bubble effect, affecting reproducibility. CONCLUSION Although immensely useful and justified by the need for encompassing the entirety of knowledge, the practice of including internet search engines in systematic literature reviews is fundamentally irreconcilable with recent emphasis on scientific reproducibility and rigor, having a profound impact on the discussion of the limits of scientific epistemology. Scientific research that is not reproducible may still be called science, but it is a kind of science that should be avoided. Our recommendation is to use internet search engines as an additional literature source, primarily in order to validate initial search strategies centered on bibliographic databases.
Affiliation(s)
- Marko Ćurković
  - University Psychiatric Hospital Vrapče, Bolnička cesta 32, Zagreb, Croatia
- Andro Košec
  - Department of Otorhinolaryngology and Head and Neck Surgery, University Hospital Center Sestre milosrdnice, Vinogradska cesta 29, Zagreb, Croatia
|
35
|
Bikbov B, Perico N, Remuzzi G. A comparison of metrics and performance characteristics of different search strategies for article retrieval for a systematic review of the global epidemiology of kidney and urinary diseases. BMC Med Res Methodol 2018; 18:110. [PMID: 30340535 PMCID: PMC6194627 DOI: 10.1186/s12874-018-0569-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/14/2018] [Accepted: 10/04/2018] [Indexed: 11/16/2022]
Abstract
BACKGROUND Conducting a systematic review requires a comprehensive bibliographic search. Comparing different search strategies is essential for choosing those that cover all useful data sources. Our aim was to develop search strategies for article retrieval for a systematic review of the global epidemiology of kidney and urinary diseases, and evaluate their metrics and performance characteristics that could be useful for other systematic epidemiologic reviews. METHODS We described the methodological framework and analysed approaches applied in the previously conducted systematic review intended to obtain published data for global estimates of the kidney and urinary disease burden. We used several search strategies in PubMed and EMBASE, and compared several metrics: number needed to retrieve (NNR), number of extracted data rows, number of covered countries, and when appropriate, sensitivity, specificity, precision, and accuracy. RESULTS The initial search obtained 29,460 records from PubMed, and 4247 from EMBASE. After the revision, the full text of 381 and 14 articles respectively was obtained for data extraction (the percentage of useful records is 1.3% for PubMed, 0.3% for EMBASE). For PubMed we developed two search strategies and compared them with a 'gold standard' formed by merging their results: the free word search strategy (FreeWoSS) was based on the search for keywords in all fields, and the subject headings-based search strategy (SuHeSS) used only MeSH-mapped conditions and country names. SuHeSS excluded almost 15% of useful articles and data rows extracted from them, but had a lower NNR of 40 and higher specificity. FreeWoSS had better sensitivity and was able to cover the vast majority of articles and extracted data rows, but had a higher NNR of 65. CONCLUSIONS The sensitive FreeWoSS strategy provides more data for modelling, while the more specific SuHeSS strategy could be used when resources are limited. EMBASE has limited value for our systematic review.
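The retrieval metrics listed in this abstract can be written out explicitly. The sketch below assumes the usual definitions, with NNR taken as records retrieved per relevant record retrieved (the reciprocal of precision); the abstract does not spell out its formulas, so these are assumptions, and the record identifiers are invented. The final two lines only re-derive the "percentage of useful records" reported in the abstract from its own counts.

```python
def search_metrics(retrieved, relevant, universe):
    """Compare a search strategy against a gold standard.

    retrieved: set of record ids returned by the strategy
    relevant:  set of record ids in the gold standard
    universe:  set of all record ids that could have been retrieved
    """
    tp = len(retrieved & relevant)             # relevant and retrieved
    fp = len(retrieved - relevant)             # retrieved but not relevant
    fn = len(relevant - retrieved)             # relevant but missed
    tn = len(universe - retrieved - relevant)  # correctly left out
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / len(universe),
        "nnr": (tp + fp) / tp,  # number needed to retrieve = 1 / precision
    }


# Toy example with invented record ids.
universe = set(range(100))
relevant = set(range(10))
retrieved = set(range(8)) | set(range(50, 82))
metrics = search_metrics(retrieved, relevant, universe)

# Share of useful records, from the counts given in the abstract.
pubmed_useful_pct = round(100 * 381 / 29460, 1)
embase_useful_pct = round(100 * 14 / 4247, 1)
```

A strategy with high sensitivity but a high NNR costs more screening effort per useful article, which is the trade-off the abstract describes between FreeWoSS and SuHeSS.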
Affiliation(s)
- Boris Bikbov
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via G.-B. Camozzi 3, 24020 Ranica, Bergamo, Italy
- Norberto Perico
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via G.-B. Camozzi 3, 24020 Ranica, Bergamo, Italy
- Giuseppe Remuzzi
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via G.-B. Camozzi 3, 24020 Ranica, Bergamo, Italy
  - Unit of Nephrology, Dialysis and Transplantation, Azienda Socio-Sanitaria Territoriale Papa Giovanni XXIII, Bergamo, Italy
  - L. Sacco Department of Biomedical and Clinical Sciences, University of Milan, Milan, Italy
|
36
|
Abstract
Open modification searching (OMS) is a powerful search strategy that identifies peptides carrying any type of modification by allowing a modified spectrum to match against its unmodified variant by using a very wide precursor mass window. A drawback of this strategy, however, is that it leads to a large increase in search time. Although performing an open search can be done using existing spectral library search engines by simply setting a wide precursor mass window, none of these tools have been optimized for OMS, leading to excessive runtimes and suboptimal identification results. We present the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. This approach is combined with a cascade search strategy to maximize the number of identified unmodified and modified spectra while strictly controlling the false discovery rate as well as a shifted dot product score to sensitively match modified spectra to their unmodified counterparts. ANN-SoLo achieves state-of-the-art performance in terms of speed and the number of identifications. On a previously published human cell line data set, ANN-SoLo confidently identifies more spectra than SpectraST or MSFragger and achieves a speedup of an order of magnitude compared with SpectraST. ANN-SoLo is implemented in Python and C++. It is freely available under the Apache 2.0 license at https://github.com/bittremieux/ANN-SoLo .
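The shifted dot product mentioned in this abstract can be illustrated with a simplified sketch: a query peak may match a library peak either at the same m/z or after subtracting the precursor mass difference. This is not ANN-SoLo's implementation; the greedy one-to-one peak assignment, the tolerance value, the neglect of fragment charge, and the toy spectra are all assumptions.

```python
import math


def normalize(peaks):
    """Scale intensities so the spectrum has unit Euclidean norm."""
    norm = math.sqrt(sum(intensity ** 2 for _, intensity in peaks))
    return [(mz, intensity / norm) for mz, intensity in peaks]


def shifted_dot(query, library, mass_shift, tol=0.02):
    """Dot product that also allows peaks displaced by mass_shift.

    query, library: lists of (m/z, intensity) pairs
    mass_shift: precursor mass difference (query minus library)
    """
    query, library = normalize(query), normalize(library)
    candidates = []
    for qi, (q_mz, q_int) in enumerate(query):
        for li, (l_mz, l_int) in enumerate(library):
            direct = abs(q_mz - l_mz) <= tol
            shifted = abs(q_mz - mass_shift - l_mz) <= tol
            if direct or shifted:
                candidates.append((q_int * l_int, qi, li))
    # Greedy assignment: largest products first, each peak used at most once.
    candidates.sort(reverse=True)
    used_query, used_library, total = set(), set(), 0.0
    for product, qi, li in candidates:
        if qi not in used_query and li not in used_library:
            used_query.add(qi)
            used_library.add(li)
            total += product
    return total


# Toy spectra: the query carries a +16 shift on two of its three fragments.
library_spectrum = [(100.0, 1.0), (200.0, 2.0), (300.0, 1.0)]
query_spectrum = [(100.0, 1.0), (216.0, 2.0), (316.0, 1.0)]
open_score = shifted_dot(query_spectrum, library_spectrum, mass_shift=16.0)
plain_score = shifted_dot(query_spectrum, library_spectrum, mass_shift=0.0)
```

With the shift taken into account, the modified fragments contribute to the score again, which is why a modified spectrum can still be matched to its unmodified library counterpart.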
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Pieter Meysman
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
|
37
|
Wagner M, Lampos V, Cox IJ, Pebody R. The added value of online user-generated content in traditional methods for influenza surveillance. Sci Rep 2018; 8:13963. [PMID: 30228285 PMCID: PMC6143510 DOI: 10.1038/s41598-018-32029-6] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 03/12/2018] [Accepted: 08/28/2018] [Indexed: 11/09/2022]
Abstract
There has been considerable work in evaluating the efficacy of using online data for health surveillance. Often comparisons with baseline data involve various squared error and correlation metrics. While useful, these overlook a variety of other factors important to public health bodies considering the adoption of such methods. In this paper, a proposed surveillance system that incorporates models based on recent research efforts is evaluated in terms of its added value for influenza surveillance at Public Health England. The system comprises two supervised learning approaches trained on influenza-like illness (ILI) rates provided by the Royal College of General Practitioners (RCGP) and produces ILI estimates using Twitter posts or Google search queries. RCGP ILI rates for different age groups and laboratory-confirmed cases by influenza type are used to evaluate the models with a particular focus on predicting the onset, overall intensity, peak activity and duration of the 2015/16 influenza season. We show that the Twitter-based models perform poorly and hypothesise that this is mostly due to the sparsity of the data available and a limited training period. Conversely, the Google-based model provides accurate estimates with timeliness of approximately one week and has the potential to complement current surveillance systems.
Affiliation(s)
- Moritz Wagner
  - Public Health England, London, UK
  - University College London, London, United Kingdom
  - London School of Hygiene and Tropical Medicine, London, United Kingdom
- Vasileios Lampos
  - Department of Computer Science, University College London, London, UK
- Ingemar J Cox
  - Department of Computer Science, University College London, London, UK
  - Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
|
38
|
Abstract
The minimization of open stacks problem (MOSP) aims to determine the ideal production sequence to optimize the occupation of physical space in manufacturing settings. Most current methods for solving the MOSP were not designed to work with large instances, precluding their use in specific cases of similar modeling problems. We therefore propose a PageRank-based heuristic to solve large instances modeled in graphs. In computational experiments, both data from the literature and new datasets up to 25 times larger in input size than current datasets, totaling 1330 instances, were analyzed to compare the proposed heuristic with state-of-the-art methods. The results showed the competitiveness of the proposed heuristic in terms of quality, as it found optimal solutions in several cases, and in terms of shorter run times compared with the fastest available method. Furthermore, based on specific graph densities, we found that the difference in the value of solutions between methods was small, thus justifying the use of the fastest method. The proposed heuristic is scalable and is more affected by graph density than by size.
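The PageRank scores that such a heuristic builds on can be computed with a short power iteration. The abstract does not describe how the scores are turned into a production sequence, so the sketch below stops at scoring and ordering the nodes; the graph, the damping factor, and the idea of sorting nodes by score are assumptions for illustration.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank on a graph given as {node: [neighbours]}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            neighbours = graph[v]
            if neighbours:
                share = damping * rank[v] / len(neighbours)
                for u in neighbours:
                    new_rank[u] += share
            else:
                # A node without neighbours spreads its rank uniformly.
                for u in nodes:
                    new_rank[u] += damping * rank[v] / n
        rank = new_rank
    return rank


# Toy undirected star graph: node "a" is adjacent to every other node.
star = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
scores = pagerank(star)
order = sorted(scores, key=scores.get, reverse=True)
```

Each iteration touches every edge once, so the cost grows with the number of edges, which is consistent with the abstract's remark that the heuristic is more affected by graph density than by size.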
Affiliation(s)
- Nei Yoshihiro Soma
  - Technological Institute of Aeronautics, Computer Sciences Division, São José dos Campos, São Paulo, 12228-900, Brazil
|
39
|
Richardson ML, Amini B. Teaching Radiology Physics Interactively with Scientific Notebook Software. Acad Radiol 2018; 25:801-810. [PMID: 29751860 DOI: 10.1016/j.acra.2017.11.024] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/05/2017] [Revised: 11/01/2017] [Accepted: 11/04/2017] [Indexed: 11/30/2022]
Abstract
RATIONALE AND OBJECTIVES The goal of this study is to demonstrate how the teaching of radiology physics can be enhanced with the use of interactive scientific notebook software. METHODS We used the scientific notebook software known as Project Jupyter, which is free, open-source, and available for the Macintosh, Windows, and Linux operating systems. RESULTS We have created a scientific notebook that demonstrates multiple interactive teaching modules we have written for our residents using the Jupyter notebook system. CONCLUSIONS Scientific notebook software allows educators to create teaching modules in a form that combines text, graphics, images, data, interactive calculations, and image analysis within a single document. These notebooks can be used to build interactive teaching modules, which can help explain complex topics in imaging physics to residents.
Affiliation(s)
- Michael L Richardson
  - Department of Radiology, University of Washington, 4245 Roosevelt Way NE, Seattle, WA 98105
- Behrang Amini
  - Department of Radiology, M. D. Anderson Cancer Center, Houston, Texas
|
40
|
Hernando L, Mendiburu A, Lozano JA. Anatomy of the Attraction Basins: Breaking with the Intuition. Evol Comput 2018; 27:435-466. [PMID: 29786459 DOI: 10.1162/evco_a_00227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Solving combinatorial optimization problems efficiently requires the development of algorithms that consider the specific properties of the problems. In this sense, local search algorithms are designed over a neighborhood structure that partially accounts for these properties. Considering a neighborhood, the space is usually interpreted as a natural landscape, with valleys and mountains. Under this perception, it is commonly believed that, if maximizing, the solutions located on the slopes of the same mountain belong to the same attraction basin, with the peaks of the mountains being the local optima. Unfortunately, this is a widespread erroneous visualization of a combinatorial landscape. Thus, our aim is to clarify this aspect, providing a detailed analysis of, first, the existence of plateaus where the local optima are involved, and second, the properties that define the topology of the attraction basins, picturing a reliable visualization of the landscapes. Some of the features explored in this article have never been examined before. Hence, new findings about the structure of the attraction basins are shown. The study is focused on instances of permutation-based combinatorial optimization problems considering the 2-exchange and the insert neighborhoods. As a consequence of this work, we break away from the widespread belief about the anatomy of attraction basins.
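To make the terms concrete, the sketch below enumerates attraction basins for a small permutation problem under the 2-exchange (swap) neighborhood, using best-improvement hill climbing when maximizing. The toy objective, the tie-breaking rule (first best neighbour found), and the stopping rule on plateaus are assumptions, and they matter: the abstract points out that plateaus complicate the picture.

```python
from itertools import combinations, permutations


def two_exchange_neighbours(perm):
    """Yield every permutation reachable by swapping two positions."""
    for i, j in combinations(range(len(perm)), 2):
        neighbour = list(perm)
        neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
        yield tuple(neighbour)


def hill_climb(perm, fitness):
    """Best-improvement local search; stops when no neighbour is strictly better."""
    current = tuple(perm)
    while True:
        best = max(two_exchange_neighbours(current), key=fitness)
        if fitness(best) <= fitness(current):
            return current  # local optimum (or a plateau, by this stopping rule)
        current = best


def attraction_basins(n, fitness):
    """Map each local optimum to the list of starting points that reach it."""
    basins = {}
    for start in permutations(range(n)):
        basins.setdefault(hill_climb(start, fitness), []).append(start)
    return basins


def toy_fitness(perm):
    """Hypothetical objective: weighted sum of positions and values."""
    return sum(position * value for position, value in enumerate(perm))


basins = attraction_basins(4, toy_fitness)
```

For this particular objective every out-of-order pair can be swapped for a strict gain, so all 24 permutations fall into a single basin; objectives with several local optima split the search space into several basins of different sizes.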
Collapse
Affiliation(s)
- Leticia Hernando
- Intelligent Systems Group, Department of Computer Science and Artificial Intelligence, University of the Basque Country UPV/EHU, 20018 San Sebastián, Spain
| | - Alexander Mendiburu
- Intelligent Systems Group, Department of Computer Architecture and Technology, University of the Basque Country UPV/EHU, 20018 San Sebastián, Spain
| | - Jose A Lozano
- Intelligent Systems Group, Department of Computer Science and Artificial Intelligence, University of the Basque Country UPV/EHU, 20018 San Sebastián, Spain Basque Center for Applied Mathematics (BCAM), 48009 Bilbao, Spain
| |
|
41
|
Keyhani S, Vali M, Cohen B, Woodbridge A, Arenson M, Eilkhani E, Aivadyan C, Hasin D. A search algorithm for identifying likely users and non-users of marijuana from the free text of the electronic medical record. PLoS One 2018; 13:e0193706. [PMID: 29509775 PMCID: PMC5839555 DOI: 10.1371/journal.pone.0193706] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/24/2017] [Accepted: 02/19/2018] [Indexed: 01/01/2023]
Abstract
BACKGROUND The harmful effects of marijuana on health and in particular cardiovascular health are understudied. To develop such knowledge, an efficient method of developing an informative cohort of marijuana users and non-users is needed. METHODS We identified patients with a diagnosis of coronary artery disease using ICD-9 codes who were seen in the San Francisco VA in 2015. We imported these patients' medical record notes into an informatics platform that facilitated text searches. We categorized patients into those with evidence of marijuana use in the past 12 months and patients with no such evidence, using the following text strings: "marijuana", "mjx", and "cannabis". We randomly selected 51 users and 51 non-users based on this preliminary classification, and sent a recruitment letter to 97 of these patients who had contact information available. Patients were interviewed on marijuana use and domains related to cardiovascular health. Data on marijuana use collected from the medical record was compared to data collected as part of the interview. RESULTS The interview completion rate was 71%. Among the 35 patients identified by text strings as having used marijuana in the previous year, 15 had used marijuana in the past 30 days (positive predictive value = 42.9%). The probability of use in the past month increased from 8.8% to 42.9% in people who have these keywords in their medical record compared to those who did not have these terms in their medical record. CONCLUSION Methods that combine text search strategies for participant recruitment with health interviews provide an efficient approach to developing prospective cohorts that can be used to study the health effects of marijuana.
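The positive predictive value reported in this abstract follows directly from the two counts it gives (15 recent users among 35 patients flagged by the text strings). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def positive_predictive_value(true_positives, flagged):
    """Share of flagged patients who are confirmed positives."""
    if flagged <= 0:
        raise ValueError("flagged must be a positive count")
    return true_positives / flagged


# Counts taken from the abstract: 15 of 35 flagged patients
# reported marijuana use in the past 30 days.
ppv = positive_predictive_value(15, 35)
print(f"PPV = {ppv:.1%}")  # prints "PPV = 42.9%"
```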
Affiliation(s)
- Salomeh Keyhani
- San Francisco VA Medical Center, San Francisco, CA, United States of America
- University of California San Francisco, Department of Medicine, San Francisco, CA, United States of America
| | - Marzieh Vali
- San Francisco VA Medical Center, San Francisco, CA, United States of America
| | - Beth Cohen
- San Francisco VA Medical Center, San Francisco, CA, United States of America
- University of California San Francisco, Department of Medicine, San Francisco, CA, United States of America
| | - Alexandra Woodbridge
- Tulane University School of Medicine, New Orleans, Louisiana, United States of America
| | - Melanie Arenson
- University of Maryland, Department of Psychology, College Park, Maryland, United States of America
| | - Elnaz Eilkhani
- University of California San Francisco, Department of Medicine, San Francisco, CA, United States of America
| | - Christina Aivadyan
- New York State Psychiatric Institute, New York, NY, United States of America
| | - Deborah Hasin
- New York State Psychiatric Institute, New York, NY, United States of America
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, United States of America
| |
|
42
|
Volpato EDSN, Betini M, Puga ME, Agarwal A, Cataneo AJM, de Oliveira LD, Bazan R, Braz LG, Pereira JEG, Dib RE. Strategies to optimize MEDLINE and EMBASE search strategies for anesthesiology systematic reviews. An experimental study. Sao Paulo Med J 2018; 136:103-108. [PMID: 29340504 PMCID: PMC9879554 DOI: 10.1590/1516-3180.2017.0277100917] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/05/2017] [Accepted: 09/10/2017] [Indexed: 01/29/2023]
Abstract
BACKGROUND A high-quality electronic search is essential for ensuring accuracy and comprehensiveness among the records retrieved when conducting systematic reviews. Therefore, we aimed to identify the most efficient method for searching in both MEDLINE (through PubMed) and EMBASE, covering search terms with variant spellings, direct and indirect orders, and associations with MeSH and EMTREE terms (or lack thereof). DESIGN AND SETTING Experimental study. UNESP, Brazil. METHODS We selected and analyzed 37 search strategies that had specifically been developed for the field of anesthesiology. These search strategies were adapted in order to cover all potentially relevant search terms, with regard to variant spellings and direct and indirect orders, in the most efficient manner. RESULTS When the strategies included variant spellings and direct and indirect orders, these adapted versions of the search strategies selected retrieved the same number of search results in MEDLINE (mean of 61.3%) and a higher number in EMBASE (mean of 63.9%) in the sample analyzed. The numbers of results retrieved through the searches analyzed here were not identical with and without associated use of MeSH and EMTREE terms. However, association of these terms from both controlled vocabularies retrieved a larger number of records than did the use of either one of them. CONCLUSIONS In view of these results, we recommend that the search terms used should include both preferred and non-preferred terms (i.e. variant spellings and direct/indirect order of the same term) and associated MeSH and EMTREE terms, in order to develop highly-sensitive search strategies for systematic reviews.
Affiliation(s)
- Enilze de Souza Nogueira Volpato
- PhD. Doctoral Student, Postgraduate Program on Anesthesiology, Health Sciences Library, Faculdade de Medicina de Botucatu (FMB), Universidade Estadual Paulista (UNESP), Botucatu (SP), Brazil.
| | - Marluci Betini
- PhD. Doctoral Student, Postgraduate Program on Anesthesiology, Health Sciences Library, Faculdade de Medicina de Botucatu (FMB), Universidade Estadual Paulista (UNESP), Botucatu (SP), Brazil.
| | - Maria Eduarda Puga
- PhD. Coordinator, Coordenadoria da Rede de Bibliotecas da UNIFESP (CRBU), Universidade Federal de São Paulo (UNIFESP), São Paulo (SP), Brazil.
| | - Arnav Agarwal
- Undergraduate Medical Student, School of Medicine, University of Toronto, Toronto, Ontario, Canada, and Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada.
| | - Antônio José Maria Cataneo
- MD, PhD. Full Professor, Department of Surgery and Orthopedics, Faculdade de Medicina de Botucatu, Universidade Estadual Paulista (UNESP), Botucatu (SP), Brazil.
| | - Luciane Dias de Oliveira
- MSc, PhD. Associate Professor, Department of Biosciences and Oral Diagnosis, Institute of Science and Technology, Universidade Estadual Paulista (UNESP), São José dos Campos (SP), Brazil.
| | - Rodrigo Bazan
- MD. Assistant Professor, Department of Neurology, Faculdade de Medicina de Botucatu, Universidade Estadual Paulista (UNESP), Botucatu (SP), Brazil.
| | - Leandro Gobbo Braz
- MD. Assistant Professor, Department of Anesthesiology, Faculdade de Medicina de Botucatu, Universidade Estadual Paulista (UNESP), Botucatu (SP), Brazil.
| | - José Eduardo Guimarães Pereira
- MD. Doctoral Student, Postgraduate Program on Anesthesiology, Faculdade de Medicina de Botucatu, Universidade Estadual Paulista (UNESP), Botucatu (SP), Brazil.
| | - Regina El Dib
- MSc, PhD. Assistant Professor, Department of Anesthesiology, Faculdade de Medicina de Botucatu, Universidade Estadual Paulista (UNESP), Botucatu (SP), Brazil; Assistant Professor, Department of Biosciences and Oral Diagnosis, Institute of Science and Technology, Universidade Estadual Paulista (UNESP), São José dos Campos (SP), Brazil; and Research Collaborator, Institute of Urology, McMaster University, Hamilton, Ontario, Canada.
| |
|
43
|
Chen J, Scholz U, Zhou R, Lange M. LAILAPS-QSM: A RESTful API and JAVA library for semantic query suggestions. PLoS Comput Biol 2018; 14:e1006058. [PMID: 29529024 PMCID: PMC5871001 DOI: 10.1371/journal.pcbi.1006058] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/22/2017] [Revised: 03/27/2018] [Accepted: 02/23/2018] [Indexed: 11/19/2022]
Abstract
In order to access and filter content of life-science databases, full text search is a widely applied query interface. But its high flexibility and intuitiveness are paid for with potentially imprecise and incomplete query results. To reduce this drawback, query assistance systems suggest those combinations of keywords with the highest potential to match most of the relevant data records. Widespread approaches are syntactic query corrections that avoid misspelling and support expansion of words by suffixes and prefixes. Synonym expansion approaches apply thesauri, ontologies, and query logs. All need laborious curation and maintenance. Furthermore, access to query logs is in general restricted. Approaches that infer related queries from a query profile, such as research field, geographic location, co-authorship, or affiliation, require user registration and public accessibility of that information, which conflict with privacy concerns. To overcome these drawbacks, we implemented LAILAPS-QSM, a machine learning approach that reconstructs possible linguistic contexts of a given keyword query. The context is inferred from the text records that are stored in the databases that are going to be queried or, for general purpose query suggestion, extracted from PubMed abstracts and UniProt data. The supplied tool suite enables the pre-processing of these text records and the further computation of customized distributed word vectors. The latter are used to suggest alternative keyword queries. An evaluation of the query suggestion quality was done for plant science use cases. Locally present experts enable a cost-efficient quality assessment in the categories trait, biological entity, taxonomy, affiliation, and metabolic function, which has been performed using ontology term similarities. LAILAPS-QSM mean information content similarity for 15 representative queries is 0.70, whereas 34% have a score above 0.80. In comparison, the information content similarity for human expert made query suggestions is 0.90. The software is available either as a tool set to build and train dedicated query suggestion services or as an already trained general purpose RESTful web service. The service uses open interfaces to be seamlessly embeddable into database frontends. The JAVA implementation uses highly optimized data structures and streamlined code to provide fast and scalable response for web service calls. The source code of LAILAPS-QSM is available under GNU General Public License version 2 in Bitbucket GIT repository: https://bitbucket.org/ipk_bit_team/bioescorte-suggestion.
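The general idea of suggesting alternative keywords from distributed word vectors can be sketched as a nearest-neighbor lookup by cosine similarity. The snippet below is only an illustration of that idea: the three-dimensional vectors in `TOY_VECTORS`, the terms, and the `suggest` helper are all hypothetical, and nothing here reproduces the LAILAPS-QSM API, its training procedure, or its data.

```python
import math

# Hypothetical, hand-made vectors; real word vectors would be
# learned from a text corpus and have many more dimensions.
TOY_VECTORS = {
    "drought": (0.9, 0.1, 0.0),
    "water stress": (0.8, 0.2, 0.1),
    "flowering": (0.1, 0.9, 0.2),
    "kinase": (0.0, 0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def suggest(term, vectors, k=2):
    """Return the k terms whose vectors are most similar to that of `term`."""
    query = vectors[term]
    candidates = (t for t in vectors if t != term)
    ranked = sorted(candidates, key=lambda t: cosine(query, vectors[t]), reverse=True)
    return ranked[:k]


print(suggest("drought", TOY_VECTORS, k=2))
```

With these toy vectors the closest term to "drought" is "water stress", simply because the two vectors were chosen to point in nearly the same direction.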
Affiliation(s)
- Jinbo Chen
- Research Group Bioinformatics and Information Technology, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland OT Gatersleben, Germany
| | - Uwe Scholz
- Research Group Bioinformatics and Information Technology, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland OT Gatersleben, Germany
| | - Ruonan Zhou
- Research Group Bioinformatics and Information Technology, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland OT Gatersleben, Germany
| | - Matthias Lange
- Research Group Bioinformatics and Information Technology, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland OT Gatersleben, Germany
| |
|
44
|
Aguirre PEA, Coelho MM, Rios D, Machado MAAM, Cruvinel AFP, Cruvinel T. Evaluating the Dental Caries-Related Information on Brazilian Websites: Qualitative Study. J Med Internet Res 2017; 19:e415. [PMID: 29237585 PMCID: PMC5745348 DOI: 10.2196/jmir.7681] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/10/2017] [Revised: 08/25/2017] [Accepted: 10/30/2017] [Indexed: 12/01/2022]
Abstract
BACKGROUND Dental caries is the most common chronic oral disease, affecting 2.4 billion people worldwide who on average have 2.11 decayed, missing, or filled teeth. It impacts the quality of life of patients, socially and economically. However, the comprehension of dental caries may be difficult for most people, as it involves a multifactorial etiology with the interplay between the tooth surface, the dental biofilm, dietary fermentable carbohydrates, and genetic and behavioral factors. Therefore, the production of effective materials addressed to the education and counseling of patients for the prevention of dental caries requires a high level of specialization. In this regard, the dental caries-related contents produced by laypersons and their availability on the Internet may be low-quality information. OBJECTIVE The aim of this study was to assess the readability and the quality of dental caries-related information on Brazilian websites. METHODS A total of 75 websites were selected through Google, Bing, Yahoo!, and Baidu. The websites were organized in rankings according to their order of appearance in each one of the 4 search engines. Furthermore, 2 independent examiners evaluated the quality of websites using the DISCERN questionnaire and the Journal of the American Medical Association (JAMA) benchmark criteria. The readability of the websites was assessed by the Flesch Reading Ease adapted to Brazilian Portuguese (FRE-BP). In addition, the information presented on the websites was categorized as etiology, prevention, and treatment of dental caries. The statistical analysis was performed using Spearman rank correlation coefficient, Mann-Whitney U test, hierarchical clustering analysis by Ward minimum variance method, Kruskal-Wallis test, and post hoc Dunn test. P<.05 was considered significant.
RESULTS The Web contents were considered to be of poor quality by DISCERN (mean 33.48, standard deviation, SD 9.06) and JAMA (mean 1.12, SD 0.97) scores, presenting easy reading levels (FRE-BP: mean 62.93, SD 10.15). The rankings of the websites presented by Google (ρ=-.22, P=.08), Baidu (ρ=-.19, P=.53), Yahoo! (ρ=.22, P=.39), and Bing (ρ=-.36, P=.23) were not correlated with DISCERN scores. Moreover, the quality of websites with health- and nonhealth-related authors was similar (P=.27 for DISCERN and P=.47 for JAMA); however, the pages with a greater variety of dental caries information showed significantly higher quality scores than those with limited contents (P=.009). CONCLUSIONS On the basis of this sample, dental caries-related contents available on Brazilian websites were considered simple, accessible, and of poor quality, independent of their authorship. These findings indicate the need for the development of specific policies focused on the stimulus for the production and publication of Web health information, encouraging dentists to guide their patients in searching for recommended oral health websites.
Affiliation(s)
- Patricia Estefania Ayala Aguirre
- Department of Pediatric Dentistry, Orthodontics and Public Health, Bauru School of Dentistry, University of São Paulo, Bauru, Brazil
| | - Melina Martins Coelho
- Department of Pediatric Dentistry, Orthodontics and Public Health, Bauru School of Dentistry, University of São Paulo, Bauru, Brazil
| | - Daniela Rios
- Department of Pediatric Dentistry, Orthodontics and Public Health, Bauru School of Dentistry, University of São Paulo, Bauru, Brazil
| | - Thiago Cruvinel
- Department of Pediatric Dentistry, Orthodontics and Public Health, Bauru School of Dentistry, University of São Paulo, Bauru, Brazil
| |
|
45
|
Garcelon N, Neuraz A, Benoit V, Salomon R, Burgun A. Improving a full-text search engine: the importance of negation detection and family history context to identify cases in a biomedical data warehouse. J Am Med Inform Assoc 2017; 24:607-613. [PMID: 28339516 DOI: 10.1093/jamia/ocw144] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 01/25/2016] [Accepted: 08/31/2016] [Indexed: 12/19/2022]
Abstract
Objective The repurposing of electronic health records (EHRs) can improve clinical and genetic research for rare diseases. However, significant information in rare disease EHRs is embedded in the narrative reports, which contain many negated clinical signs and family medical history. This paper presents a method to detect family history and negation in narrative reports and evaluates its impact on selecting populations from a clinical data warehouse (CDW). Materials and Methods We developed a pipeline to process 1.6 million reports from multiple sources. This pipeline is part of the load process of the Necker Hospital CDW. Results We identified patients with "Lupus and diarrhea," "Crohn's and diabetes," and "NPHP1" from the CDW. The overall precision, recall, specificity, and F-measure were 0.85, 0.98, 0.93, and 0.91, respectively. Conclusion The proposed method generates a highly accurate identification of cases from a CDW of rare disease EHRs.
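The F-measure quoted in this abstract is consistent with the balanced harmonic mean of the reported precision and recall. The sketch below assumes that the authors used the balanced form (beta = 1), which the abstract does not state explicitly; the function name is hypothetical.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Precision and recall as reported in the abstract.
print(round(f_measure(0.85, 0.98), 2))  # prints 0.91
```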
Affiliation(s)
- Nicolas Garcelon
- Institut Imagine, Paris Descartes Université Paris Descartes-Sorbonne Paris Cité, Paris, France
- INSERM, Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | - Antoine Neuraz
- Institut Imagine, Paris Descartes Université Paris Descartes-Sorbonne Paris Cité, Paris, France
- INSERM, Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | - Vincent Benoit
- Institut Imagine, Paris Descartes Université Paris Descartes-Sorbonne Paris Cité, Paris, France
| | - Rémi Salomon
- Institut Imagine, Paris Descartes Université Paris Descartes-Sorbonne Paris Cité, Paris, France
- Service de Néphrologie Pédiatrique, Hôpital Necker-Enfants Malades, Assistance Publique -Hôpitaux de Paris (AP-HP), Université Paris Descartes, Sorbonne Paris Cité, France
| | - Anita Burgun
- INSERM, Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- Hôpital Européen Georges Pompidou, Assistance Publique -Hôpitaux de Paris (AP-HP), Université Paris Descartes, Sorbonne Paris Cité, France
| |
|
46
|
Tkachenko N, Chotvijit S, Gupta N, Bradley E, Gilks C, Guo W, Crosby H, Shore E, Thiarai M, Procter R, Jarvis S. Google Trends can improve surveillance of Type 2 diabetes. Sci Rep 2017; 7:4993. [PMID: 28694479 PMCID: PMC5504026 DOI: 10.1038/s41598-017-05091-9] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 11/18/2016] [Accepted: 05/31/2017] [Indexed: 11/17/2022]
Abstract
Recent studies demonstrate that people are increasingly looking online to assess their health, with reasons varying from personal preferences and beliefs to inability to book a timely appointment with their local medical practice. Records of these activities represent a new source of data about the health of populations, one that is currently unaccounted for by disease surveillance models. This could potentially be useful as evidence of individuals' perception of bodily changes and self-diagnosis of early symptoms of an emerging disease. We make use of the Experian geodemographic Mosaic dataset in order to extract Type 2 diabetes candidate risk variables and compare their temporal relationships with the search keywords used to describe early symptoms of the disease on Google. Our results demonstrate that Google Trends can detect early signs of diabetes by monitoring combinations of keywords associated with searches for hypertension treatment and poor living conditions. Combined search semantics related to obesity, how to quit smoking, and how to improve living conditions (deprivation) can also be employed, but may lead to less accurate results.
Affiliation(s)
- Nataliya Tkachenko
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK.
| | - Sarunkorn Chotvijit
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK
| | - Neha Gupta
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK
| | - Emma Bradley
- Experian, The Sir John Peace Building, Experian Way, NG2 Business Park, Nottingham, NG80 1ZZ, UK
| | - Charlotte Gilks
- Experian, The Sir John Peace Building, Experian Way, NG2 Business Park, Nottingham, NG80 1ZZ, UK
| | - Weisi Guo
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK
- School of Engineering, University of Warwick, Coventry, CV4 7AL, UK
- The Alan Turing Institute, The British Library, London, NW1 2DB, UK
| | - Henry Crosby
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK
| | - Eliot Shore
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK
| | - Malkiat Thiarai
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK
| | - Rob Procter
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK
- Department of Computer Science, University of Warwick, Coventry, CV4 7AL, UK
- The Alan Turing Institute, The British Library, London, NW1 2DB, UK
| | - Stephen Jarvis
- Warwick Institute for the Science of Cities, University of Warwick, Coventry, CV4 7AL, UK
- Department of Computer Science, University of Warwick, Coventry, CV4 7AL, UK
- The Alan Turing Institute, The British Library, London, NW1 2DB, UK
| |
|
47
|
Hunter P, Delbaere M, O’Connell ME, Cammer A, Seaton JX, Friedrich T, Fick F. Did online publishers "get it right"? Using a naturalistic search strategy to review cognitive health promotion content on internet webpages. BMC Geriatr 2017; 17:125. [PMID: 28619010 PMCID: PMC5472889 DOI: 10.1186/s12877-017-0515-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/30/2016] [Accepted: 06/06/2017] [Indexed: 11/20/2022]
Abstract
BACKGROUND One of the most common uses of the Internet is to search for health-related information. Although scientific evidence pertaining to cognitive health promotion has expanded rapidly in recent years, it is unclear how much of this information has been made available to Internet users. Thus, the purpose of our study was to assess the reliability and quality of information about cognitive health promotion encountered by typical Internet users. METHODS To generate a list of relevant search terms employed by Internet users, we entered seed search terms in Google Trends and recorded any terms consistently used in the prior 2 years. To further approximate the behaviour of typical Internet users, we entered each term in Google and sampled the first two relevant results. This search, completed in October 2014, resulted in a sample of 86 webpages, 48 of which had content related to cognitive health promotion. An interdisciplinary team rated the information reliability and quality of these webpages using a standardized measure. RESULTS We found that information reliability and quality were moderate, on average. Just one retrieved page mentioned best practice, national recommendations, or consensus guidelines by name. Commercial content (i.e., product promotion, advertising content, or non-commercial) was associated with differences in reliability and quality, with product promoter webpages having the lowest mean reliability and quality ratings. CONCLUSIONS As efforts to communicate the association between lifestyle and cognitive health continue to expand, we offer these results as a baseline assessment of the reliability and quality of cognitive health promotion on the Internet.
Affiliation(s)
- P.V. Hunter
- St. Thomas More College, University of Saskatchewan, 1437 College Drive, Saskatoon, SK S7M 0W6 Canada
| | - M. Delbaere
- Edwards School of Business, University of Saskatchewan, 25 Campus Drive, Saskatoon, SK S7N 5A7 Canada
| | - M. E. O’Connell
- Psychology, University of Saskatchewan, 9 Campus Drive, Saskatoon, SK S7N 5A5 Canada
| | - A. Cammer
- College of Pharmacy and Nutrition, University of Saskatchewan, 110 Science Place, Saskatoon, SK S7N 5C9 Canada
| | - J. X. Seaton
- Interdisciplinary Studies, University of Saskatchewan, 176 Thorvaldson Building, 110 Science Place, Saskatoon, SK S7N 5C9 Canada
| | - T. Friedrich
- Psychology, University of Saskatchewan, 9 Campus Drive, Saskatoon, SK S7N 5A5 Canada
| | - F. Fick
- Psychology, University of Saskatchewan, 9 Campus Drive, Saskatoon, SK S7N 5A5 Canada
| |
|
48
|
Abstract
Despite evidence that suicide rates can increase after suicides are widely reported in the media, appropriate depictions of suicide in the media can help people to overcome suicidal crises and can thus elicit preventive effects. We argue on the level of individual media users that a similar ambivalence can be postulated for search results on online suicide-related search queries. Importantly, the filter bubble hypothesis (Pariser, 2011) states that search results are biased by algorithms based on a person's previous search behavior. In this study, we investigated whether suicide-related search queries, including either potentially suicide-preventive or -facilitative terms, influence subsequent search results. This might thus protect or harm suicidal Internet users. We utilized a 3 (search history: suicide-related harmful, suicide-related helpful, and suicide-unrelated) × 2 (reactive: clicking the top-most result link and no clicking) experimental design applying agent-based testing. While findings show no influences either of search histories or of reactivity on search results in a subsequent situation, the presentation of a helpline offer raises concerns about possible detrimental algorithmic decision-making: Algorithms "decided" whether or not to present a helpline, and this automated decision, then, followed the agent throughout the rest of the observation period. Implications for policy-making and search providers are discussed.
Affiliation(s)
- Mario Haim
- Department of Communication Studies and Media Research, Ludwig Maximilians University Munich
| | - Florian Arendt
- Department of Communication Studies and Media Research, Ludwig Maximilians University Munich
| | - Sebastian Scherr
- Department of Communication Studies and Media Research, Ludwig Maximilians University Munich
| |
|
49
|
Lelong R, Soualmia L, Dahamna B, Griffon N, Darmoni SJ. Querying EHRs with a Semantic and Entity-Oriented Query Language. Stud Health Technol Inform 2017; 235:121-125. [PMID: 28423767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
While the digitization of medical documents has greatly expanded during the past decade, health information retrieval has become a major challenge in addressing many issues in medical research. Information retrieval in electronic health records (EHRs) should also ease the difficult task of manually retrieving information from paper or computerized records. The aim of this article was to present the features of a semantic search engine implemented in EHRs. A flexible, scalable and entity-oriented query language tool is proposed. The program is designed to retrieve and visualize data and can support any Conceptual Data Model. The search engine deals with structured and unstructured data, for a single patient from a caregiver's perspective, and for a number of patients (e.g. epidemiology). Several types of queries were tested on a test database containing 2,000 anonymized patient EHRs (i.e. approximately 200,000 records). These queries were able to accurately handle symbolic, textual, numerical and chronological data.
Affiliation(s)
- Romain Lelong
- Department of Biomedical Informatics, Rouen University Hospital, France
| | - Lina Soualmia
- Department of Biomedical Informatics, Rouen University Hospital, France
| | - Badisse Dahamna
- Department of Biomedical Informatics, Rouen University Hospital, France
| | - Nicolas Griffon
- Department of Biomedical Informatics, Rouen University Hospital, France
| | - Stéfan J Darmoni
- Department of Biomedical Informatics, Rouen University Hospital, France
| |
|
50
|
Curti S, Gori D, Di Gregori V, Farioli A, Baldasseroni A, Fantini MP, Christiani DC, Violante FS, Mattioli S. PubMed search filters for the study of putative outdoor air pollution determinants of disease. BMJ Open 2016; 6:e013092. [PMID: 28003291 PMCID: PMC5223690 DOI: 10.1136/bmjopen-2016-013092] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/07/2022]
Abstract
OBJECTIVES Several PubMed search filters have been developed in contexts other than environmental. We aimed at identifying efficient PubMed search filters for the study of environmental determinants of diseases related to outdoor air pollution. METHODS We compiled a list of Medical Subject Headings (MeSH) and non-MeSH terms seemingly pertinent to exposure to outdoor air pollutants as determinants of diseases in the general population. We estimated proportions of potentially pertinent articles to formulate two filters (one 'more specific', one 'more sensitive'). Their overall performance was evaluated against our gold standard derived from systematic reviews on diseases potentially related to outdoor air pollution. We tested these filters in the study of three diseases potentially associated with outdoor air pollution and calculated the number needed to read (NNR), that is, the number of abstracts to read to identify one potentially pertinent article in the context of these diseases. Last searches were run in January 2016. RESULTS The 'more specific' filter was based on the combination of terms that yielded a threshold of potentially pertinent articles ≥40%. The 'more sensitive' filter was based on the combination of all search terms under study. When compared with the gold standard, the 'more specific' filter reported the highest specificity (67.4%; with a sensitivity of 82.5%), while the 'more sensitive' one reported the highest sensitivity (98.5%; with a specificity of 47.9%). The NNR to find one potentially pertinent article was 1.9 for the 'more specific' filter and 3.3 for the 'more sensitive' one. CONCLUSIONS The proposed search filters could help healthcare professionals investigate environmental determinants of medical conditions that could be potentially related to outdoor air pollution.
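The NNR values in this abstract can be read as precision figures, assuming the usual definition of the number needed to read as the reciprocal of precision (the abstract describes NNR in words but gives no formula). A minimal sketch with a hypothetical helper name:

```python
def precision_from_nnr(nnr):
    """Precision implied by a number needed to read, assuming NNR = 1 / precision."""
    if nnr <= 0:
        raise ValueError("NNR must be positive")
    return 1.0 / nnr


# NNR values as reported in the abstract.
for label, nnr in (("more specific", 1.9), ("more sensitive", 3.3)):
    print(f"{label}: NNR {nnr} -> precision {precision_from_nnr(nnr):.1%}")
```

Under that assumption, roughly one in two retrieved abstracts is potentially pertinent with the 'more specific' filter and roughly one in three with the 'more sensitive' one.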
Affiliation(s)
- Stefania Curti
- Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
| | - Davide Gori
- Department of Biomedical and Neuromotor Sciences, University of Bologna, Bologna, Italy
| | - Valentina Di Gregori
- Department of Biomedical and Neuromotor Sciences, University of Bologna, Bologna, Italy
| | - Andrea Farioli
- Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
| | - Alberto Baldasseroni
- Tuscany Regional Centre for Occupational Injuries and Diseases (CeRIMP), Florence, Italy
| | - Maria Pia Fantini
- Department of Biomedical and Neuromotor Sciences, University of Bologna, Bologna, Italy
| | - David C Christiani
- Department of Environmental Health, Harvard School of Public Health, Harvard University, Boston, Massachusetts, USA
| | - Francesco S Violante
- Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
| | - Stefano Mattioli
- Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
| |
|