1
Uthman OA, Court R, Enderby J, Al-Khudairy L, Nduka C, Mistry H, Melendez-Torres GJ, Taylor-Phillips S, Clarke A. Increasing comprehensiveness and reducing workload in a systematic review of complex interventions using automated machine learning. Health Technol Assess 2022. PMID: 36562494; PMCID: PMC10068584; DOI: 10.3310/udir6682.
Abstract
BACKGROUND As part of our ongoing systematic review of complex interventions for the primary prevention of cardiovascular diseases, we developed and evaluated automated machine-learning classifiers for title and abstract screening. The aim was to develop a high-performing algorithm comparable to human screening. METHODS We followed a three-phase process to develop and test an automated machine learning-based classifier for screening potential studies on interventions for primary prevention of cardiovascular disease. In the first phase, we labelled a total of 16,611 articles. In the second phase, we used the labelled articles to develop a machine learning-based classifier. We then examined how well the classifiers labelled the papers. We evaluated five deep-learning models: parallel convolutional neural network (CNN), stacked CNN, parallel-stacked CNN, recurrent neural network (RNN) and CNN-RNN. The models were evaluated using recall, precision and work saved over sampling at no less than 95% recall. RESULTS Of the 16,611 labelled articles, 676 (4.0%) were tagged as 'relevant' and 15,935 (96%) as 'irrelevant'. Recall ranged from 51.9% to 96.6%, precision from 64.6% to 99.1%, and work saved over sampling from 8.9% to 92.1%. The best-performing model was the parallel CNN, yielding 96.4% recall, 99.1% precision and a potential workload reduction of 89.9%. FUTURE WORK AND LIMITATIONS We used words from the title and abstract only. Further work is needed to examine how performance changes with additional features, such as full document text. The approach may also not transfer to other complex systematic reviews on different topics. CONCLUSION Our study shows that machine learning has the potential to significantly aid the labour-intensive screening of abstracts in systematic reviews of complex interventions. Future research should concentrate on enhancing the classifier system and on how it can be integrated into the systematic review workflow. FUNDING This project was funded by the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme and will be published in Health Technology Assessment. See the NIHR Journals Library website for further project information.
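Several entries in this list report "work saved over sampling" (WSS) at 95% recall. As a rough illustration of how that metric is computed from a classifier-ranked screening list (a sketch with invented variable names, not code from any of the studies):

```python
import math

def wss_at_recall(labels_ranked, target_recall=0.95):
    """Work saved over sampling at a target recall level.

    labels_ranked: 1/0 relevance labels sorted by descending
    classifier score, i.e. the order in which a reviewer would
    screen. WSS@R is the fraction of records left unscreened once
    the target recall is reached, minus the (1 - R) that random
    sampling would leave unscreened at the same recall.
    """
    total = len(labels_ranked)
    needed = math.ceil(target_recall * sum(labels_ranked))
    found = 0
    for screened, label in enumerate(labels_ranked, start=1):
        found += label
        if found >= needed:
            return (total - screened) / total - (1 - target_recall)
    return 0.0
```

For example, with 3 relevant records ranked in the top 3 of 10, 95% recall is reached after screening 3 records, giving WSS@95% = 7/10 - 0.05 = 0.65.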
Affiliation(s)
- Rachel Court
- Warwick Medical School, University of Warwick, Coventry, UK
- Jodie Enderby
- Warwick Medical School, University of Warwick, Coventry, UK
- Chidozie Nduka
- Warwick Medical School, University of Warwick, Coventry, UK
- Hema Mistry
- Warwick Medical School, University of Warwick, Coventry, UK
- G J Melendez-Torres
- Peninsula Technology Assessment Group (PenTAG), College of Medicine and Health, University of Exeter, Exeter, UK
- Aileen Clarke
- Warwick Medical School, University of Warwick, Coventry, UK
2
Wilkins AA, Whaley P, Persad AS, Druwe IL, Lee JS, Taylor MM, Shapiro AJ, Blanton Southard N, Lemeris C, Thayer KA. Assessing author willingness to enter study information into structured data templates as part of the manuscript submission process: A pilot study. Heliyon 2022; 8:e09095. PMID: 35846467; PMCID: PMC9280381; DOI: 10.1016/j.heliyon.2022.e09095.
Abstract
Background Environmental health and other researchers can benefit from automated or semi-automated summaries of data within published studies, as summarizing study methods and results is time- and resource-intensive. Automated summaries can be designed to identify and extract details of interest pertaining to the study design, population, testing agent/intervention, or outcomes. Much of the data reported across existing publications lacks unified structure, standardization and machine-readable formats, or may be presented in complex tables; these are barriers that impede the development of automated data extraction methodologies. As full automation of data extraction seems unlikely in the near term, encouraging investigators to submit structured summaries of methods and results in standardized formats, with meta-data tagging of content, may be of value during the publication process. This would produce machine-readable content to facilitate automated data extraction, establish sharable data repositories, help make research data FAIR, and could improve reporting quality. Objectives A pilot study was conducted to assess the feasibility of asking participants to summarize study methods and results using a structured, web-based data extraction model as a potential workflow that could be implemented during the manuscript submission process. Methods Eight participants entered study details and data into the Health Assessment Workplace Collaborative (HAWC). Participants were surveyed after the extraction exercise to ascertain 1) whether the exercise would affect how they conduct and report future research, 2) the ease of data extraction, including which fields were easiest and which more problematic to extract, and 3) the amount of time taken to perform data extractions and other related tasks. Investigators then presented participants with the potential benefits of providing structured data in the format they had been extracting. After this, participants were surveyed about 1) their willingness to provide structured data during the publication process and 2) whether they felt the potential application of structured data entry approaches, and their implementation during the journal submission process, should be further explored. Conclusions Routine provision of structured data that summarizes key information from research studies could reduce the effort required to reuse that data in the future, such as in systematic reviews or agency scientific assessments. Our pilot study suggests that directly asking authors to provide that data, via structured templates, may be a viable approach: participants were willing to do so, and the overall process was not prohibitively arduous. We also found some support for the hypothesis that use of study templates may have halo benefits in improving the conduct and completeness of reporting of future research. While limitations in the generalizability of our findings mean that the conditions for the success of templates cannot be assumed, further research into how such templates might be designed and implemented seems to have enough chance of success that it ought to be undertaken.
Affiliation(s)
- A. Amina Wilkins
- U.S. Environmental Protection Agency (EPA), Center for Public Health and Environmental Assessment (CPHEA), Washington, DC, USA
- Corresponding author.
- Paul Whaley
- Lancaster Environment Centre, Lancaster University, Lancaster, UK
- Evidence-Based Toxicology Collaboration, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
- Amanda S. Persad
- U.S. Environmental Protection Agency (EPA), Center for Public Health and Environmental Assessment (CPHEA), Washington, DC, USA
- Ingrid L. Druwe
- U.S. Environmental Protection Agency (EPA), Center for Public Health and Environmental Assessment (CPHEA), Washington, DC, USA
- Janice S. Lee
- U.S. Environmental Protection Agency (EPA), Center for Public Health and Environmental Assessment (CPHEA), Washington, DC, USA
- Michele M. Taylor
- U.S. Environmental Protection Agency (EPA), Center for Public Health and Environmental Assessment (CPHEA), Washington, DC, USA
- Andrew J. Shapiro
- U.S. Environmental Protection Agency (EPA), Center for Public Health and Environmental Assessment (CPHEA), Washington, DC, USA
- Kristina A. Thayer
- U.S. Environmental Protection Agency (EPA), Center for Public Health and Environmental Assessment (CPHEA), Washington, DC, USA
3
Stansfield C, Stokes G, Thomas J. Applying machine classifiers to update searches: Analysis from two case studies. Res Synth Methods 2021; 13:121-133. PMID: 34747151; PMCID: PMC9299040; DOI: 10.1002/jrsm.1537.
Abstract
Manual screening of citation records could be reduced by using machine classifiers to remove records of very low relevance. This seems particularly feasible for update searches, where a machine classifier can be trained from past screening decisions. However, feasibility is unclear for broad topics. We evaluate the performance and implementation of machine classifiers for update searches of public health research using two case studies. The first study evaluates the impact of using different sets of training data on classifier performance, comparing recall and screening reduction with a manual screening ‘gold standard’. The second study uses screening decisions from a review to train a classifier that is applied to rank the update search results. A stopping threshold was applied in the absence of a gold standard. Time spent screening titles and abstracts of different relevancy‐ranked records was measured. Results: Study one: Classifier performance varies according to the training data used; all custom‐built classifiers had a recall above 93% at the same threshold, achieving screening reductions between 41% and 74%. Study two: applying a classifier provided a solution for tackling a large volume of search results from the update search, and screening volume was reduced by 61%. A tentative estimate indicates over 25 h screening time was saved. In conclusion, custom‐built machine classifiers are feasible for reducing screening workload from update searches across a range of public health interventions, with some limitation on recall. Key considerations include selecting a training dataset, agreeing stopping thresholds and processes to ensure smooth workflows.
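The general shape of this approach, training a classifier from past include/exclude decisions and using it to rank update search results, can be sketched with a simple log-odds token scorer (an illustrative stand-in; the case studies above use purpose-built classifiers, not this code):

```python
import math
from collections import Counter

def train_token_scores(texts, labels, alpha=1.0):
    """Smoothed log-odds score per token, learned from past
    screening decisions (label 1 = included, 0 = excluded)."""
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for text, label in zip(texts, labels):
        tokens = set(text.lower().split())
        if label:
            pos.update(tokens)
            n_pos += 1
        else:
            neg.update(tokens)
            n_neg += 1
    vocab = set(pos) | set(neg)
    return {t: math.log((pos[t] + alpha) / (n_pos + 2 * alpha))
               - math.log((neg[t] + alpha) / (n_neg + 2 * alpha))
            for t in vocab}

def rank_records(records, token_scores):
    """Rank update-search records so likely-relevant ones are
    screened first; a stopping threshold decides when to stop."""
    def score(text):
        return sum(token_scores.get(t, 0.0)
                   for t in set(text.lower().split()))
    return sorted(records, key=score, reverse=True)
```

Records at the top of the ranked list are screened manually; screening stops at an agreed threshold when no gold standard is available, which is the workflow decision the case studies highlight.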
Affiliation(s)
- Claire Stansfield
- EPPI-Centre, UCL Social Research Institute, University College London, London, UK
- Gillian Stokes
- EPPI-Centre, UCL Social Research Institute, University College London, London, UK
- James Thomas
- EPPI-Centre, UCL Social Research Institute, University College London, London, UK
4
Abdelkader W, Navarro T, Parrish R, Cotoi C, Germini F, Iorio A, Haynes RB, Lokker C. Machine Learning Approaches to Retrieve High-Quality, Clinically Relevant Evidence From the Biomedical Literature: Systematic Review. JMIR Med Inform 2021; 9:e30401. PMID: 34499041; PMCID: PMC8461527; DOI: 10.2196/30401.
Abstract
BACKGROUND The rapid growth of the biomedical literature makes identifying strong evidence a time-consuming task. Applying machine learning to the process could be a viable solution that limits effort while maintaining accuracy. OBJECTIVE The goal of the research was to summarize the nature and comparative performance of machine learning approaches that have been applied to retrieve high-quality evidence for clinical consideration from the biomedical literature. METHODS We conducted a systematic review of studies that applied machine learning techniques to identify high-quality clinical articles in the biomedical literature. Multiple databases were searched to July 2020. Extracted data focused on the applied machine learning model, steps in the development of the models, and model performance. RESULTS From 3918 retrieved studies, 10 met our inclusion criteria. All followed a supervised machine learning approach and applied, from a limited range of options, a high-quality standard for the training of their model. The results show that machine learning can achieve a sensitivity of 95% while maintaining a high precision of 86%. CONCLUSIONS Machine learning approaches perform well in retrieving high-quality clinical studies. Performance may improve by applying more sophisticated approaches such as active learning and unsupervised machine learning approaches.
Affiliation(s)
- Wael Abdelkader
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Tamara Navarro
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Rick Parrish
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Chris Cotoi
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Federico Germini
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- Alfonso Iorio
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- R Brian Haynes
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- Cynthia Lokker
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
5
Chai KEK, Lines RLJ, Gucciardi DF, Ng L. Research Screener: a machine learning tool to semi-automate abstract screening for systematic reviews. Syst Rev 2021; 10:93. PMID: 33795003; PMCID: PMC8017894; DOI: 10.1186/s13643-021-01635-3.
Abstract
BACKGROUND Systematic reviews and meta-analyses provide the highest level of evidence to help inform policy and practice, yet their rigorous nature carries significant time and economic demands. The screening of titles and abstracts is the most time-consuming part of the review process, with analysts required to review thousands of articles manually, taking on average 33 days. New technologies aimed at streamlining the screening process have shown initial promise, yet current approaches have limitations and there are barriers to the widespread use of these tools. In this paper, we introduce and report initial evidence on the utility of Research Screener, a semi-automated machine learning tool to facilitate abstract screening. METHODS Three sets of analyses (simulation, interactive and sensitivity) were conducted to provide evidence of the utility of the tool through both simulated and real-world examples. RESULTS Research Screener delivered a workload saving of between 60% and 96% across nine systematic reviews and two scoping reviews. Findings from the real-world interactive analysis demonstrated a time saving of 12.53 days compared with manual screening, which equates to a financial saving of USD 2444. Conservatively, our results suggest that analysts who screen 50% of the total pool of articles identified via a systematic search are highly likely to have identified 100% of eligible papers. CONCLUSIONS In light of these findings, Research Screener can reduce the burden for researchers wishing to conduct a comprehensive systematic review without compromising the scientific rigour for which they strive.
Affiliation(s)
- Kevin E K Chai
- Curtin Institute for Computation, Curtin University, Perth, Australia
- School of Population Health, Curtin University, Perth, Australia
- Robin L J Lines
- School of Allied Health, Curtin University, Perth, Australia
- Leo Ng
- School of Allied Health, Curtin University, Perth, Australia
6
Yamada T, Yoneoka D, Hiraike Y, Hino K, Toyoshiba H, Shishido A, Noma H, Shojima N, Yamauchi T. Deep Neural Network for Reducing the Screening Workload in Systematic Reviews for Clinical Guidelines: Algorithm Validation Study. J Med Internet Res 2020; 22:e22422. PMID: 33262102; PMCID: PMC7806440; DOI: 10.2196/22422.
Abstract
Background Performing systematic reviews is a time-consuming and resource-intensive process. Objective We investigated whether a machine learning system could perform systematic reviews more efficiently. Methods All systematic reviews and meta-analyses of interventional randomized controlled trials cited in recent clinical guidelines from the American Diabetes Association, American College of Cardiology, American Heart Association (2 guidelines), and American Stroke Association were assessed. After reproducing the primary screening data set according to the published search strategy of each, we extracted correct articles (those actually reviewed) and incorrect articles (those not reviewed) from the data set. These 2 sets of articles were used to train a neural network–based artificial intelligence engine (Concept Encoder, Fronteo Inc). The primary endpoint was work saved over sampling at 95% recall (WSS@95%). Results Among 145 candidate reviews of randomized controlled trials, 8 reviews fulfilled the inclusion criteria. For these 8 reviews, the machine learning system significantly reduced the literature screening workload by at least 6-fold versus that of manual screening based on WSS@95%. When machine learning was initiated using 2 correct articles that were randomly selected by a researcher, a 10-fold reduction in workload was achieved versus that of manual screening based on the WSS@95% value, with high sensitivity for eligible studies. The area under the receiver operating characteristic curve increased dramatically every time the algorithm learned a correct article. Conclusions Concept Encoder achieved a 10-fold reduction of the screening workload for systematic review after learning from 2 randomly selected studies on the target topic. However, few meta-analyses of randomized controlled trials were included. Concept Encoder could facilitate the acquisition of evidence for clinical guidelines.
Affiliation(s)
- Tomohide Yamada
- University Institute for Population Health, King's College London, London, United Kingdom
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Daisuke Yoneoka
- Graduate School of Public Health, St Luke's International University, Tokyo, Japan
- Yuta Hiraike
- Department of Cell Biology, Harvard Medical School, Boston, MA, United States
- Hisashi Noma
- Department of Data Science, The Institute of Statistical Mathematics, Tokyo, Japan
- Nobuhiro Shojima
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Toshimasa Yamauchi
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
7
Carvallo A, Parra D, Lobel H, Soto A. Automatic document screening of medical literature using word and text embeddings in an active learning setting. Scientometrics 2020. DOI: 10.1007/s11192-020-03648-6.
8
Alharbi A, Stevenson M. Refining Boolean queries to identify relevant studies for systematic review updates. J Am Med Inform Assoc 2020; 27:1658-1666. PMID: 33067630; PMCID: PMC7750994; DOI: 10.1093/jamia/ocaa148.
Abstract
OBJECTIVE Systematic reviews are important in health care but are expensive to produce and maintain. The authors explore the use of automated transformations of Boolean queries to improve the identification of relevant studies for updates to systematic reviews. MATERIALS AND METHODS A set of query transformations, including operator substitution, query expansion, and query reduction, were used to iteratively modify the Boolean query used for the original systematic review. The most effective transformation at each stage is identified using information about the studies included and excluded from the original review. A dataset consisting of 22 systematic reviews was used for evaluation. Updated queries were evaluated using the included and excluded studies from the updated version of the review. Recall and precision were used as evaluation measures. RESULTS The updated queries were more effective than the ones used for the original review, in terms of both precision and recall. The overall number of documents retrieved was reduced by more than half, while the number of relevant documents found increased by 10.3%. CONCLUSIONS Identification of relevant studies for updates to systematic reviews can be carried out more effectively by using information about the included and excluded studies from the original review to produce improved Boolean queries. These updated queries reduce the overall number of documents retrieved while also increasing the number of relevant documents identified, thereby representing a considerable reduction in effort required by systematic reviewers.
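One step of the query-reduction idea can be sketched as follows, using the original review's included and excluded studies as the fitness signal. This is a deliberately simplified toy with conjunctive keyword queries and an invented scoring function; the paper operates on full Boolean queries and also applies operator substitution and query expansion:

```python
def matches(text, terms):
    """A conjunctive keyword query: every term must appear."""
    text = text.lower()
    return all(t in text for t in terms)

def score(terms, included, excluded):
    """Reward retrieving known-included studies and penalize
    known-excluded ones (a rough recall-precision proxy)."""
    tp = sum(matches(d, terms) for d in included)
    fp = sum(matches(d, terms) for d in excluded)
    recall = tp / len(included)
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall * precision

def refine(terms, included, excluded):
    """One step of query reduction: drop whichever term most
    improves the score, if any drop helps."""
    best, best_score = terms, score(terms, included, excluded)
    for i in range(len(terms)):
        cand = terms[:i] + terms[i + 1:]
        if cand and score(cand, included, excluded) > best_score:
            best, best_score = cand, score(cand, included, excluded)
    return best
```

Iterating this kind of step, and picking the most effective transformation at each stage against the original review's screening decisions, is the overall shape of the method described above.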
Affiliation(s)
- Amal Alharbi
- Computer Science Department, University of Sheffield, Sheffield, United Kingdom
- Mark Stevenson
- Computer Science Department, University of Sheffield, Sheffield, United Kingdom
9
Howard BE, Phillips J, Tandon A, Maharana A, Elmore R, Mav D, Sedykh A, Thayer K, Merrick BA, Walker V, Rooney A, Shah RR. SWIFT-Active Screener: Accelerated document screening through active learning and integrated recall estimation. Environ Int 2020; 138:105623. PMID: 32203803; PMCID: PMC8082972; DOI: 10.1016/j.envint.2020.105623.
Abstract
BACKGROUND In the screening phase of systematic review, researchers use detailed inclusion/exclusion criteria to decide whether each article in a set of candidate articles is relevant to the research question under consideration. A typical review may require screening thousands or tens of thousands of articles and can consume hundreds of person-hours of labor. METHODS Here we introduce SWIFT-Active Screener, a web-based, collaborative systematic review software application, designed to reduce the overall screening burden required during this resource-intensive phase of the review process. To prioritize articles for review, SWIFT-Active Screener uses active learning, a type of machine learning that incorporates user feedback during screening. Meanwhile, a negative binomial model is employed to estimate the number of relevant articles remaining in the unscreened document list. Using a simulation involving 26 diverse systematic review datasets that were previously screened by reviewers, we evaluated both the document prioritization and recall estimation methods. RESULTS On average, 95% of the relevant articles were identified after screening only 40% of the total reference list. In the 5 document sets with 5,000 or more references, 95% recall was achieved after screening only 34% of the available references, on average. Furthermore, the recall estimator we have proposed provides a useful, conservative estimate of the percentage of relevant documents identified during the screening process. CONCLUSION SWIFT-Active Screener can result in significant time savings compared with traditional screening, and the savings increase with project size. Moreover, the integration of explicit recall estimation during screening solves an important challenge faced by all machine learning systems for document screening: when to stop screening a prioritized reference list. The software is currently available in the form of a multi-user, collaborative, online web application.
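The stopping-rule idea behind an integrated recall estimate can be illustrated with a deliberately crude estimator. The tool itself fits a negative binomial model to the declining inclusion rate; this constant-rate projection is only a stand-in for that idea, with invented parameter names:

```python
def estimated_recall(found_so_far, recent_hits, recent_screened, remaining):
    """Estimate recall during prioritized screening by assuming the
    unscreened remainder yields relevant records at the rate seen in
    the most recent screened batch. Screening can stop once this
    estimate crosses the target (e.g. 95%)."""
    projected = remaining * (recent_hits / recent_screened)
    total = found_so_far + projected
    return found_so_far / total if total else 1.0
```

For example, if 95 relevant records have been found, the last 100 screened records contained 1 relevant record, and 500 records remain unscreened, the projected remainder is 5 and the estimated recall is 95/100 = 95%.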
Affiliation(s)
- Deepak Mav
- Sciome LLC, 2 Davis Drive, Durham, NC 27709, USA
- Alex Sedykh
- Sciome LLC, 2 Davis Drive, Durham, NC 27709, USA
- Kristina Thayer
- Integrated Risk Information System (IRIS) Division, Environmental Protection Agency, 109 T.W. Alexander Drive, RTP, NC 27709, USA
- B Alex Merrick
- National Toxicology Program (NTP)/National Institute of Environmental Health Sciences (NIEHS), 111 T.W. Alexander Drive, RTP, NC 27709, USA
- Vickie Walker
- National Toxicology Program (NTP)/National Institute of Environmental Health Sciences (NIEHS), 111 T.W. Alexander Drive, RTP, NC 27709, USA
- Andrew Rooney
- National Toxicology Program (NTP)/National Institute of Environmental Health Sciences (NIEHS), 111 T.W. Alexander Drive, RTP, NC 27709, USA
10
Stoll C, Izadi S, Fowler S, Green P, Suls J, Colditz GA. The value of a second reviewer for study selection in systematic reviews. Res Synth Methods 2019; 10:539-545. PMID: 31272125; PMCID: PMC6989049; DOI: 10.1002/jrsm.1369.
Abstract
BACKGROUND Although dual independent review of search results by two reviewers is generally recommended for systematic reviews, there are no consistent recommendations regarding the timing of the second reviewer's involvement. This study compared a complete dual review approach, with two reviewers in both the title/abstract screening stage and the full-text screening stage, against a limited dual review approach, with two reviewers only in the full-text stage. METHODS This study was performed within the context of a large systematic review. Two reviewers performed a complete dual review of 15 000 search results and a limited dual review of 15 000 search results. The number of relevant studies mistakenly excluded by highly experienced reviewers in the complete dual review was compared with the number mistakenly excluded during the full-text stage of the limited dual review. RESULTS In the complete dual review approach, an additional 6.6% to 9.1% of eligible studies were identified during the title/abstract stage by using two reviewers, and an additional 6.6% to 11.9% of eligible studies were identified during the full-text stage by using two reviewers. In the limited dual review approach, an additional 4.4% to 5.3% of eligible studies were identified with the use of two reviewers. CONCLUSIONS Using a second reviewer throughout the entire study screening process can increase the number of relevant studies identified for use in a systematic review. Systematic reviewers should consider using a complete dual review process to ensure all relevant studies are included in their review.
Affiliation(s)
- Carolyn Stoll
- Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine, Saint Louis, MO
- Sonya Izadi
- Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine, Saint Louis, MO
- Susan Fowler
- Brown School, Washington University School of Medicine, Saint Louis, MO
- Paige Green
- Behavioral Research Program, Division of Cancer Control & Population Sciences, National Cancer Institute, Bethesda, Maryland
- Jerry Suls
- Behavioral Research Program, Division of Cancer Control & Population Sciences, National Cancer Institute, Bethesda, Maryland
- Graham A. Colditz
- Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine, Saint Louis, MO
11
Giummarra MJ, Lau G, Gabbe BJ. Evaluation of text mining to reduce screening workload for injury-focused systematic reviews. Inj Prev 2019; 26:55-60. PMID: 31451565; DOI: 10.1136/injuryprev-2019-043247.
Abstract
INTRODUCTION Text mining to support screening in large-scale systematic reviews has been recommended; however, their suitability for reviews in injury research is not known. We examined the performance of text mining in supporting the second reviewer in a systematic review examining associations between fault attribution and health and work-related outcomes after transport injury. METHODS Citations were independently screened in Abstrackr in full (reviewer 1; 10 559 citations), and until no more citations were predicted to be relevant (reviewer 2; 1809 citations, 17.1%). All potentially relevant full-text articles were assessed by reviewer 1 (555 articles). Reviewer 2 used text mining (Wordstat, QDA Miner) to reduce assessment to full-text articles containing ≥1 fault-related exposure term (367 articles, 66.1%). RESULTS Abstrackr offered excellent workload savings: 82.7% of citations did not require screening by reviewer 2, and total screening time was reduced by 36.6% compared with traditional dual screening of all citations. Abstrackr predictions had high specificity (83.7%), and low false negatives (0.3%), but overestimated citation relevance, probably due to the complexity of the review with multiple outcomes and high imbalance of relevant to irrelevant records, giving low sensitivity (29.7%) and precision (14.5%). Text mining of full-text articles reduced the number needing to be screened by 33.9%, and reduced total full-text screening time by 38.7% compared with traditional dual screening. CONCLUSIONS Overall, text mining offered important benefits to systematic review workflow, but should not replace full screening by one reviewer, especially for complex reviews examining multiple health or injury outcomes. TRIAL REGISTRATION NUMBER CRD42018084123.
Affiliation(s)
- Melita J Giummarra
- Epidemiology and Preventive Medicine, Monash University, Melbourne, Victoria, Australia; Caulfield Pain Management and Research Centre, Caulfield Hospital, Caulfield, Victoria, Australia
- Georgina Lau
- Epidemiology and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Belinda J Gabbe
- Epidemiology and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
12
Schmitz T, Bukowski M, Koschmieder S, Schmitz-Rode T, Farkas R. Potential Technologies Review: A hybrid information retrieval framework to accelerate demand-pull innovation in biomedical engineering. Res Synth Methods 2019; 10:420-439. [PMID: 30995361 DOI: 10.1002/jrsm.1350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2018] [Revised: 02/01/2019] [Accepted: 04/11/2019] [Indexed: 11/11/2022]
Affiliation(s)
- Tom Schmitz
- Science Management, Institute of Applied Medical Engineering, RWTH Aachen University, Aachen, Germany
- Mark Bukowski
- Science Management, Institute of Applied Medical Engineering, RWTH Aachen University, Aachen, Germany
- Steffen Koschmieder
- Department of Hematology, Oncology, Hemostaseology, and Stem Cell Transplantation, RWTH Aachen University, Aachen, Germany
- Thomas Schmitz-Rode
- Institute of Applied Medical Engineering, RWTH Aachen University, Aachen, Germany
- Robert Farkas
- Science Management, Institute of Applied Medical Engineering, RWTH Aachen University, Aachen, Germany
13
Bannach-Brown A, Przybyła P, Thomas J, Rice ASC, Ananiadou S, Liao J, Macleod MR. Machine learning algorithms for systematic review: reducing workload in a preclinical review of animal studies and reducing human screening error. Syst Rev 2019; 8:23. [PMID: 30646959 PMCID: PMC6334440 DOI: 10.1186/s13643-019-0942-7] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/01/2018] [Accepted: 01/03/2019] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Here, we outline a method of applying existing machine learning (ML) approaches to aid citation screening in an ongoing broad and shallow systematic review of preclinical animal studies. The aim is to achieve a high-performing algorithm, comparable to human screening, that can reduce the human resources required for this step of a systematic review. METHODS We applied ML approaches to a broad systematic review of animal models of depression at the citation screening stage. We tested two independently developed ML approaches which used different classification models and feature sets. We recorded the performance of the ML approaches on an unseen validation set of papers using sensitivity, specificity and accuracy. We aimed to achieve 95% sensitivity and to maximise specificity. The classification model providing the most accurate predictions was applied to the remaining unseen records in the dataset and will be used in the next stage of the preclinical biomedical sciences systematic review. We used a cross-validation technique to assign ML inclusion likelihood scores to the human-screened records, to identify potential errors made during the human screening process (error analysis). RESULTS ML approaches reached 98.7% sensitivity based on learning from a training set of 5749 records, with an inclusion prevalence of 13.2%. The highest level of specificity reached was 86%. Performance was assessed on an independent validation dataset. Human errors in the training and validation sets were successfully identified using the assigned inclusion likelihood from the ML model to highlight discrepancies. Training the ML algorithm on the corrected dataset improved the specificity of the algorithm without compromising sensitivity. Error analysis correction led to a 3% improvement in sensitivity and specificity, increasing the precision and accuracy of the ML algorithm.
CONCLUSIONS This work has confirmed the performance and application of ML algorithms for screening in systematic reviews of preclinical animal studies. It has highlighted the novel use of ML algorithms to identify human error. This needs to be confirmed in other reviews with different inclusion prevalence levels, but represents a promising approach to integrating human decisions and automation in systematic review methodology.
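The error-analysis step described above (cross-validated inclusion-likelihood scores used to surface suspect human decisions) can be sketched in pure Python. This is an illustrative stand-in, not the authors' implementation: a crude add-one-smoothed log-odds text score replaces their classifiers, and the flagging threshold is arbitrary.

```python
import math
from collections import Counter

def word_score(text, inc_counts, exc_counts, inc_total, exc_total):
    """Crude add-one-smoothed log-odds that `text` is an include."""
    score = 0.0
    for w in text.lower().split():
        p_inc = (inc_counts[w] + 1) / (inc_total + 1)
        p_exc = (exc_counts[w] + 1) / (exc_total + 1)
        score += math.log(p_inc / p_exc)
    return score

def flag_possible_errors(records, k=3, threshold=1.0):
    """records: list of (text, human_label), label 1 = include, 0 = exclude.
    Each record is scored by a model fitted on the other folds; indices whose
    human label contradicts a confident cross-predicted score are returned."""
    flagged = []
    for fold in range(k):
        inc_counts, exc_counts = Counter(), Counter()
        for i, (text, label) in enumerate(records):
            if i % k != fold:  # fit on training folds only
                (inc_counts if label else exc_counts).update(text.lower().split())
        inc_total, exc_total = sum(inc_counts.values()), sum(exc_counts.values())
        for i, (text, label) in enumerate(records):
            if i % k != fold:
                continue       # score the held-out fold only
            s = word_score(text, inc_counts, exc_counts, inc_total, exc_total)
            if (label == 1 and s < -threshold) or (label == 0 and s > threshold):
                flagged.append(i)
    return sorted(flagged)
```

Records flagged this way are candidates for human re-checking, which mirrors the discrepancy-highlighting described in the abstract.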
Affiliation(s)
- Alexandra Bannach-Brown
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Translational Neuropsychiatry Unit, Aarhus University, Aarhus, Denmark
- Present Address: Centre for Research in Evidence-Based Practice, Bond University, Gold Coast, Australia
- Piotr Przybyła
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, England
- James Thomas
- EPPI-Centre, Department of Social Science, University College London, London, England
- Andrew S. C. Rice
- Pain Research, Department of Surgery and Cancer, Imperial College, London, England
- Sophia Ananiadou
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, England
- Jing Liao
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
14
Bannach-Brown A, Przybyła P, Thomas J, Rice ASC, Ananiadou S, Liao J, Macleod MR. Machine learning algorithms for systematic review: reducing workload in a preclinical review of animal studies and reducing human screening error. Syst Rev 2019. [PMID: 30646959 DOI: 10.1186/s13643-019-0942-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/12/2023] Open
15
Pham B, Bagheri E, Rios P, Pourmasoumi A, Robson RC, Hwee J, Isaranuwatchai W, Darvesh N, Page MJ, Tricco AC. Improving the conduct of systematic reviews: a process mining perspective. J Clin Epidemiol 2018; 103:101-111. [DOI: 10.1016/j.jclinepi.2018.06.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2018] [Revised: 06/19/2018] [Accepted: 06/26/2018] [Indexed: 01/10/2023]
16
Przybyła P, Brockmeier AJ, Kontonatsios G, Le Pogam M, McNaught J, von Elm E, Nolan K, Ananiadou S. Prioritising references for systematic reviews with RobotAnalyst: A user study. Res Synth Methods 2018; 9:470-488. [PMID: 29956486 PMCID: PMC6175382 DOI: 10.1002/jrsm.1311] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2017] [Revised: 04/12/2018] [Accepted: 06/16/2018] [Indexed: 11/07/2022]
Abstract
Screening references is a time-consuming step necessary for systematic reviews and guideline development. Previous studies have shown that human effort can be reduced by using machine learning software to prioritise large reference collections such that most of the relevant references are identified before screening is completed. We describe and evaluate RobotAnalyst, a Web-based software system that combines text-mining and machine learning algorithms for organising references by their content and actively prioritising them based on a relevancy classification model trained and updated throughout the process. We report an evaluation over 22 reference collections (most are related to public health topics) screened using RobotAnalyst with a total of 43 610 abstract-level decisions. The number of references that needed to be screened to identify 95% of the abstract-level inclusions for the evidence review was reduced on 19 of the 22 collections. Significant gains over random sampling were achieved for all reviews conducted with active prioritisation, as compared with only two of five when prioritisation was not used. RobotAnalyst's descriptive clustering and topic modelling functionalities were also evaluated by public health analysts. Descriptive clustering provided more coherent organisation than topic modelling, and the content of the clusters was apparent to the users across a varying number of clusters. This is the first large-scale study using technology-assisted screening to perform new reviews, and the positive results provide empirical evidence that RobotAnalyst can accelerate the identification of relevant studies. The results also highlight the issue of user complacency and the need for a stopping criterion to realise the work savings.
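The headline evaluation above, the number of references that must be screened under a given priority order before 95% of the abstract-level inclusions are found, is easy to state precisely. A small sketch (illustrative only, not RobotAnalyst code):

```python
# Illustrative: screening burden under a prioritised ordering. Comparing this
# count for a model's ordering vs. a random ordering quantifies the gain from
# active prioritisation.

def records_to_recall(ordered_labels, target_recall=0.95):
    """ordered_labels: 1/0 relevance labels in screening (priority) order.
    Returns how many records are screened when recall first reaches target."""
    total_relevant = sum(ordered_labels)
    found = 0
    for n, label in enumerate(ordered_labels, start=1):
        found += label
        if found >= target_recall * total_relevant:
            return n
    return len(ordered_labels)
```

A perfect prioritisation puts every relevant record first, so the count equals (roughly) the number of inclusions; a poor one approaches the full collection size.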
Affiliation(s)
- Piotr Przybyła
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Austin J. Brockmeier
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Georgios Kontonatsios
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Marie-Annick Le Pogam
- Cochrane Switzerland, Institute of Social and Preventive Medicine, Lausanne University Hospital, Lausanne, Switzerland
- John McNaught
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Erik von Elm
- Cochrane Switzerland, Institute of Social and Preventive Medicine, Lausanne University Hospital, Lausanne, Switzerland
- Kay Nolan
- National Institute for Health and Care Excellence, Manchester, UK
- Sophia Ananiadou
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
17
Maas AIR, Menon DK, Adelson PD, Andelic N, Bell MJ, Belli A, Bragge P, Brazinova A, Büki A, Chesnut RM, Citerio G, Coburn M, Cooper DJ, Crowder AT, Czeiter E, Czosnyka M, Diaz-Arrastia R, Dreier JP, Duhaime AC, Ercole A, van Essen TA, Feigin VL, Gao G, Giacino J, Gonzalez-Lara LE, Gruen RL, Gupta D, Hartings JA, Hill S, Jiang JY, Ketharanathan N, Kompanje EJO, Lanyon L, Laureys S, Lecky F, Levin H, Lingsma HF, Maegele M, Majdan M, Manley G, Marsteller J, Mascia L, McFadyen C, Mondello S, Newcombe V, Palotie A, Parizel PM, Peul W, Piercy J, Polinder S, Puybasset L, Rasmussen TE, Rossaint R, Smielewski P, Söderberg J, Stanworth SJ, Stein MB, von Steinbüchel N, Stewart W, Steyerberg EW, Stocchetti N, Synnot A, Te Ao B, Tenovuo O, Theadom A, Tibboel D, Videtta W, Wang KKW, Williams WH, Wilson L, Yaffe K, Adams H, Agnoletti V, Allanson J, Amrein K, Andaluz N, Anke A, Antoni A, van As AB, Audibert G, Azaševac A, Azouvi P, Azzolini ML, Baciu C, Badenes R, Barlow KM, Bartels R, Bauerfeind U, Beauchamp M, Beer D, Beer R, Belda FJ, Bellander BM, Bellier R, Benali H, Benard T, Beqiri V, Beretta L, Bernard F, Bertolini G, Bilotta F, Blaabjerg M, den Boogert H, Boutis K, Bouzat P, Brooks B, Brorsson C, Bullinger M, Burns E, Calappi E, Cameron P, Carise E, Castaño-León AM, Causin F, Chevallard G, Chieregato A, Christie B, Cnossen M, Coles J, Collett J, Della Corte F, Craig W, Csato G, Csomos A, Curry N, Dahyot-Fizelier C, Dawes H, DeMatteo C, Depreitere B, Dewey D, van Dijck J, Đilvesi Đ, Dippel D, Dizdarevic K, Donoghue E, Duek O, Dulière GL, Dzeko A, Eapen G, Emery CA, English S, Esser P, Ezer E, Fabricius M, Feng J, Fergusson D, Figaji A, Fleming J, Foks K, Francony G, Freedman S, Freo U, Frisvold SK, Gagnon I, Galanaud D, Gantner D, Giraud B, Glocker B, Golubovic J, Gómez López PA, Gordon WA, Gradisek P, Gravel J, Griesdale D, Grossi F, Haagsma JA, Håberg AK, Haitsma I, Van Hecke W, Helbok R, Helseth E, van Heugten C, Hoedemaekers C, Höfer S, Horton L, Hui J, Huijben JA, 
Hutchinson PJ, Jacobs B, van der Jagt M, Jankowski S, Janssens K, Jelaca B, Jones KM, Kamnitsas K, Kaps R, Karan M, Katila A, Kaukonen KM, De Keyser V, Kivisaari R, Kolias AG, Kolumbán B, Kolundžija K, Kondziella D, Koskinen LO, Kovács N, Kramer A, Kutsogiannis D, Kyprianou T, Lagares A, Lamontagne F, Latini R, Lauzier F, Lazar I, Ledig C, Lefering R, Legrand V, Levi L, Lightfoot R, Lozano A, MacDonald S, Major S, Manara A, Manhes P, Maréchal H, Martino C, Masala A, Masson S, Mattern J, McFadyen B, McMahon C, Meade M, Melegh B, Menovsky T, Moore L, Morgado Correia M, Morganti-Kossmann MC, Muehlan H, Mukherjee P, Murray L, van der Naalt J, Negru A, Nelson D, Nieboer D, Noirhomme Q, Nyirádi J, Oddo M, Okonkwo DO, Oldenbeuving AW, Ortolano F, Osmond M, Payen JF, Perlbarg V, Persona P, Pichon N, Piippo-Karjalainen A, Pili-Floury S, Pirinen M, Ple H, Poca MA, Posti J, Van Praag D, Ptito A, Radoi A, Ragauskas A, Raj R, Real RGL, Reed N, Rhodes J, Robertson C, Rocka S, Røe C, Røise O, Roks G, Rosand J, Rosenfeld JV, Rosenlund C, Rosenthal G, Rossi S, Rueckert D, de Ruiter GCW, Sacchi M, Sahakian BJ, Sahuquillo J, Sakowitz O, Salvato G, Sánchez-Porras R, Sándor J, Sangha G, Schäfer N, Schmidt S, Schneider KJ, Schnyer D, Schöhl H, Schoonman GG, Schou RF, Sir Ö, Skandsen T, Smeets D, Sorinola A, Stamatakis E, Stevanovic A, Stevens RD, Sundström N, Taccone FS, Takala R, Tanskanen P, Taylor MS, Telgmann R, Temkin N, Teodorani G, Thomas M, Tolias CM, Trapani T, Turgeon A, Vajkoczy P, Valadka AB, Valeinis E, Vallance S, Vámos Z, Vargiolu A, Vega E, Verheyden J, Vik A, Vilcinis R, Vleggeert-Lankamp C, Vogt L, Volovici V, Voormolen DC, Vulekovic P, Vande Vyvere T, Van Waesberghe J, Wessels L, Wildschut E, Williams G, Winkler MKL, Wolf S, Wood G, Xirouchaki N, Younsi A, Zaaroor M, Zelinkova V, Zemek R, Zumbo F. Traumatic brain injury: integrated approaches to improve prevention, clinical care, and research. Lancet Neurol 2017; 16:987-1048. 
[DOI: 10.1016/s1474-4422(17)30371-x] [Citation(s) in RCA: 822] [Impact Index Per Article: 117.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2016] [Revised: 07/06/2017] [Accepted: 09/27/2017] [Indexed: 12/11/2022]
18
Rathbone J, Albarqouni L, Bakhit M, Beller E, Byambasuren O, Hoffmann T, Scott AM, Glasziou P. Expediting citation screening using PICo-based title-only screening for identifying studies in scoping searches and rapid reviews. Syst Rev 2017; 6:233. [PMID: 29178925 PMCID: PMC5702220 DOI: 10.1186/s13643-017-0629-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/30/2017] [Accepted: 11/16/2017] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND Citation screening for scoping searches and rapid review is time-consuming and inefficient, often requiring days or sometimes months to complete. We examined the reliability of PICo-based title-only screening using keyword searches based on the PICo elements-Participants, Interventions, and Comparators, but not the Outcomes. METHODS A convenience sample of 10 datasets, derived from the literature searches of completed systematic reviews, was used to test PICo-based title-only screening. Search terms for screening were generated from the inclusion criteria of each review, specifically the PICo elements-Participants, Interventions and Comparators. Synonyms for the PICo terms were sought, including alternatives for clinical conditions, trade names of generic drugs and abbreviations for clinical conditions, interventions and comparators. The MeSH database, Wikipedia, Google searches and online thesauri were used to assist generating terms. Title-only screening was performed by five reviewers independently in Endnote X7 reference management software using OR Boolean operator. Outcome measures were recall of included studies and the reduction in screening effort. Recall is the proportion of included studies retrieved using PICo title-only screening out of the total number of included studies in the original reviews. The percentage reduction in screening effort is the proportion of records not needing screening because the method eliminates them from the screen set. RESULTS Across the 10 reviews, the reduction in screening effort ranged from 11 to 78% with a median reduction of 53%. In nine systematic reviews, the recall of included studies was 100%. In one review (oxygen therapy), four of five reviewers missed the same included study (median recall 67%). 
A post hoc analysis was performed on the dataset with the lowest reduction in screening effort (11%), and it was rescreened using only the intervention and comparator keywords and omitting keywords for participants. The reduction in screening effort increased to 57%, and the recall of included studies was maintained (100%). CONCLUSIONS In this sample of datasets, PICo-based title-only screening was able to expedite citation screening for scoping searches and rapid reviews by reducing the number of citations needed to screen but requires a thorough workup of the potential synonyms and alternative terms. Further research which evaluates the feasibility of this technique with heterogeneous datasets in different fields would be useful to inform the generalisability of this technique.
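The PICo title-only screen reduces to a case-insensitive OR over keyword matches, after which recall and the reduction in screening effort follow directly. The authors worked in EndNote X7; this Python analogue is a sketch under that assumption, with made-up titles and keywords.

```python
# Illustrative: PICo-based title-only screening as an OR over keyword matches,
# plus the two outcome measures used above (recall, screening-effort reduction).

def pico_title_screen(titles, keywords):
    """Return indices of titles containing >= 1 keyword (case-insensitive OR)."""
    kws = [k.lower() for k in keywords]
    return [i for i, t in enumerate(titles)
            if any(k in t.lower() for k in kws)]

def evaluate(titles, keywords, included_idx):
    """recall: share of truly included studies retrieved by the keyword screen.
    effort_reduction: share of records eliminated without manual screening."""
    kept = set(pico_title_screen(titles, keywords))
    recall = len(kept & set(included_idx)) / len(included_idx)
    effort_reduction = 1 - len(kept) / len(titles)
    return recall, effort_reduction
```

As the abstract notes, the method stands or falls on how thoroughly the keyword synonym list is worked up; a missing synonym shows up directly as lost recall.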
Affiliation(s)
- John Rathbone
- Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia
- Loai Albarqouni
- Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia
- Mina Bakhit
- Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia
- Elaine Beller
- Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia
- Oyungerel Byambasuren
- Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia
- Tammy Hoffmann
- Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia
- Anna Mae Scott
- Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia
- Paul Glasziou
- Centre for Research in Evidence Based Practice, Bond University, Gold Coast, Australia
19
Olorisade BK, Brereton P, Andras P. Reproducibility of studies on text mining for citation screening in systematic reviews: Evaluation and checklist. J Biomed Inform 2017; 73:1-13. [PMID: 28711679 DOI: 10.1016/j.jbi.2017.07.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2016] [Revised: 07/08/2017] [Accepted: 07/10/2017] [Indexed: 11/28/2022]
Abstract
CONTEXT Independent validation of published scientific results through study replication is a pre-condition for accepting the validity of such results. In computational research, full replication is often unrealistic for independent results validation; therefore, study reproduction has been justified as the minimum acceptable standard to evaluate the validity of scientific claims. The application of text mining techniques to citation screening in the context of systematic literature reviews is a relatively young and growing computational field with high relevance for software engineering, medical research and other fields. However, there is little work so far on reproduction studies in the field. OBJECTIVE In this paper, we investigate the reproducibility of studies in this area based on information contained in published articles and we propose reporting guidelines that could improve reproducibility. METHODS The study was approached in two ways. Initially we attempted to reproduce results from six studies, which were based on the same raw dataset. Then, based on this experience, we identified steps considered essential to successful reproduction of text mining experiments and characterized them to measure how reproducible a study is, given the information provided on these steps. A total of 33 articles were systematically assessed for reproducibility using this approach. RESULTS Our work revealed that it is currently difficult, if not impossible, to independently reproduce the results published in any of the studies investigated. The lack of information about the datasets used limits reproducibility of about 80% of the studies assessed. Also, information about the machine learning algorithms is inadequate in about 27% of the papers. On the plus side, the third-party software tools used are mostly free and available.
CONCLUSIONS The reproducibility potential of most of the studies can be significantly improved if more attention is paid to information provided on the datasets used, how they were partitioned and utilized, and how any randomization was controlled. We introduce a checklist of information that needs to be provided in order to ensure that a published study can be reproduced.
Affiliation(s)
- Pearl Brereton
- School of Computing and Mathematics, Keele University, Staffs ST5 5BG, UK
- Peter Andras
- School of Computing and Mathematics, Keele University, Staffs ST5 5BG, UK
20
Shim S, Kim J, Jung W, Shin IS, Bae JM. Meta-analysis for genome-wide association studies using case-control design: application and practice. Epidemiol Health 2016; 38:e2016058. [PMID: 28092928 PMCID: PMC5309730 DOI: 10.4178/epih.e2016058] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2016] [Accepted: 12/18/2016] [Indexed: 01/16/2023] Open
Abstract
This review aimed to set out the process of a systematic review of genome-wide association studies in order to practise and apply a genome-wide meta-analysis (GWMA). The process has a series of five steps: searching and selection, extraction of related information, evaluation of validity, meta-analysis by type of genetic model, and evaluation of heterogeneity. In contrast to intervention meta-analyses, a GWMA has to evaluate the Hardy-Weinberg equilibrium (HWE) in the third step and conduct meta-analyses under five potential genetic models, including dominant, recessive, homozygote contrast, heterozygote contrast, and allelic contrast, in the fourth step. The 'genhwcci' and 'metan' commands of STATA software evaluate the HWE and calculate a summary effect size, respectively. A meta-regression using the 'metareg' command of STATA should be conducted to evaluate factors related to heterogeneity.
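The HWE check in step three can be illustrated outside STATA as a one-degree-of-freedom chi-square test on genotype counts. This is a Python analogue of the quantity the 'genhwcci' command reports for control groups, offered as a sketch rather than a reimplementation of the command:

```python
# Illustrative: chi-square statistic for Hardy-Weinberg equilibrium from
# observed genotype counts (AA, AB, BB), compared against expectations
# derived from the estimated allele frequency.

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 df) for HWE from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)       # estimated frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

A statistic near zero (e.g. for counts 25/50/25) is consistent with HWE; large values in controls flag studies whose genotype data may be unreliable for pooling.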
Affiliation(s)
- Sungryul Shim
- Institute for Clinical Molecular Biology Research, Soonchunhyang University Hospital, Seoul, Korea
- Jiyoung Kim
- Department of Radiation Oncology, Ewha Womans University School of Medicine, Seoul, Korea
- Wonguen Jung
- Department of Radiation Oncology, Ewha Womans University School of Medicine, Seoul, Korea
- In-Soo Shin
- Department of Education, Jeonju University, Jeonju, Korea
- Jong-Myon Bae
- Department of Preventive Medicine, Jeju National University School of Medicine, Jeju, Korea
21
Liu T, Zhang C, Liu C. The incidence of breast cancer among female flight attendants: an updated meta-analysis. J Travel Med 2016; 23:taw055. [PMID: 27601531 DOI: 10.1093/jtm/taw055] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/02/2016] [Accepted: 08/01/2016] [Indexed: 11/14/2022]
Abstract
BACKGROUND Several studies have indicated an increased risk of breast cancer (BC) among female flight attendants (FFAs); however, the results from epidemiological studies were not consistent. We thus conducted an updated meta-analysis to re-assess the risk of BC among FFAs, according to the MOOSE guideline. METHODS A systematic search of PubMed and Embase for relevant observational studies up to March 2016 was performed, supplemented by manual reviews of bibliographies in relevant studies. A random-effects model was used to calculate the combined standard incidence ratio (SIR) and 95% confidence interval (95% CI) for BC risk. RESULTS Of the 719 citations retrieved, 10 were included, with more than 31 679 participants and 821 new cases. The combined SIR for BC in FFAs was 1.40 (95% CI 1.30-1.50), with no significant heterogeneity (P = 0.744; I² = 0.0%) or publication bias (Begg's test: z = 0.72, P = 0.474; Egger's test: t = 0.25, P = 0.805) among the included studies. The results were not significantly modified by publication year, geographic area, study quality or whether the fertility variables were adjusted. CONCLUSIONS Our meta-analysis suggests that FFAs have a higher risk of BC compared with the general population. More rigorous studies with larger sample sizes based on other populations, including the Chinese, are needed.
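The pooling step described in the methods (a random-effects combination of study SIRs) can be sketched with DerSimonian-Laird weights on the log scale. This is illustrative, not the authors' code; with I² = 0.0%, as reported here, the random-effects estimate coincides with the fixed-effect one.

```python
import math

# Illustrative: DerSimonian-Laird random-effects pooling of study SIRs,
# combined on the log scale with inverse-variance weights.

def pool_sir(sirs, ses):
    """sirs: study SIR estimates; ses: standard errors of log(SIR).
    Returns (pooled SIR, lower 95% CI, upper 95% CI)."""
    y = [math.log(s) for s in sirs]
    w = [1 / se ** 2 for se in ses]
    # fixed-effect pooled mean and Cochran's Q
    yf = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - yf) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    wr = [1 / (se ** 2 + tau2) for se in ses]         # random-effects weights
    yr = sum(wi * yi for wi, yi in zip(wr, y)) / sum(wr)
    se_r = math.sqrt(1 / sum(wr))
    return (math.exp(yr),
            math.exp(yr - 1.96 * se_r),
            math.exp(yr + 1.96 * se_r))
```

When the Q statistic does not exceed its degrees of freedom, tau² is truncated at zero and the weights reduce to the fixed-effect ones, which is the situation this abstract reports.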
Affiliation(s)
- Tiebing Liu
- Civil Aviation Medicine Center, Civil Aviation Administration of China, Beijing, People's Republic of China
- Chanyuan Zhang
- Department of Clinical Laboratory, Civil Aviation General Hospital, Beijing, People's Republic of China
- Chong Liu
- Department of Information Engineering, Cangzhou Technical College, Cangzhou, Hebei, People's Republic of China
22
Abbe A, Grouin C, Zweigenbaum P, Falissard B. Text mining applications in psychiatry: a systematic literature review. Int J Methods Psychiatr Res 2016; 25:86-100. [PMID: 26184780 PMCID: PMC6877250 DOI: 10.1002/mpr.1481] [Citation(s) in RCA: 59] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/05/2014] [Revised: 01/21/2015] [Accepted: 04/09/2015] [Indexed: 11/08/2022] Open
Abstract
The expansion of biomedical literature is creating the need for efficient tools to keep pace with increasing volumes of information. Text mining (TM) approaches are becoming essential to facilitate the automated extraction of useful biomedical information from unstructured text. We reviewed the applications of TM in psychiatry, and explored its advantages and limitations. A systematic review of the literature was carried out using the CINAHL, Medline, EMBASE, PsycINFO and Cochrane databases. In this review, 1103 papers were screened, and 38 were included as applications of TM in psychiatric research. Using TM and content analysis, we identified four major areas of application: (1) Psychopathology (i.e. observational studies focusing on mental illnesses) (2) the Patient perspective (i.e. patients' thoughts and opinions), (3) Medical records (i.e. safety issues, quality of care and description of treatments), and (4) Medical literature (i.e. identification of new scientific information in the literature). The information sources were qualitative studies, Internet postings, medical records and biomedical literature. Our work demonstrates that TM can contribute to complex research tasks in psychiatry. We discuss the benefits, limits, and further applications of this tool in the future. Copyright © 2015 John Wiley & Sons, Ltd.
Affiliation(s)
- Adeline Abbe
- Inserm, U669, Paris, France; University Paris-Sud and University Paris Descartes, UMR-S0669, Paris, France
- Bruno Falissard
- Inserm, U669, Paris, France; University Paris-Sud and University Paris Descartes, UMR-S0669, Paris, France
23
24
Association of sedentary behavior with the risk of breast cancer in women: update meta-analysis of observational studies. Ann Epidemiol 2015; 25:687-97. [DOI: 10.1016/j.annepidem.2015.05.007] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2014] [Revised: 04/30/2015] [Accepted: 05/07/2015] [Indexed: 11/21/2022]
25
Rathbone J, Hoffmann T, Glasziou P. Faster title and abstract screening? Evaluating Abstrackr, a semi-automated online screening program for systematic reviewers. Syst Rev 2015; 4:80. [PMID: 26073974 PMCID: PMC4472176 DOI: 10.1186/s13643-015-0067-6] [Citation(s) in RCA: 88] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/06/2015] [Accepted: 05/29/2015] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Citation screening is time consuming and inefficient. We sought to evaluate the performance of Abstrackr, a semi-automated online tool for predictive title and abstract screening. METHODS Four systematic reviews (aHUS, dietary fibre, ECHO, rituximab) were used to evaluate Abstrackr. Citations from electronic searches of biomedical databases were imported into Abstrackr, and titles and abstracts were screened and included or excluded according to the entry criteria. This process was continued until Abstrackr predicted and classified the remaining unscreened citations as relevant or irrelevant. These classification predictions were checked for accuracy against the original review decisions. Sensitivity analyses were performed to assess the effects of including case reports in the aHUS dataset whilst screening, and of using larger imbalanced datasets with the ECHO dataset. The performance of Abstrackr was calculated according to the number of relevant studies missed, the workload saving, the false negative rate, and the precision with which the algorithm predicted relevant studies for inclusion (i.e. those warranting further full-text inspection). RESULTS Of the unscreened citations, Abstrackr's prediction algorithm correctly identified all relevant citations for the rituximab and dietary fibre reviews. However, one relevant citation in each of the aHUS and ECHO reviews was incorrectly predicted as not relevant. The workload saving achieved with Abstrackr varied with the complexity and size of the reviews (9% rituximab, 40% dietary fibre, 67% aHUS and 57% ECHO). The proportion of citations predicted as relevant, and therefore warranting further full-text inspection (i.e. the precision of the prediction), ranged from 16% (aHUS) to 45% (rituximab) and was affected by the complexity of the reviews. The false negative rate ranged from 2.4% to 21.7%.
Sensitivity analysis on the aHUS dataset increased the precision from 16% to 25% and the workload saving by 10%, but increased the number of relevant studies missed. Sensitivity analysis with the larger ECHO dataset increased the workload saving (80%) but reduced the precision (6.8%) and increased the number of missed citations. CONCLUSIONS Semi-automated title and abstract screening with Abstrackr has the potential to save time and reduce research waste.
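The screening metrics reported in this evaluation (workload saving, precision, false negative rate) all derive from a confusion matrix over screening decisions. A minimal sketch, using one common definition of each measure and hypothetical counts rather than the review's data:

```python
def screening_metrics(tp, fp, fn, tn):
    """Title/abstract screening metrics from confusion-matrix counts.

    tp: relevant records correctly predicted relevant
    fp: irrelevant records predicted relevant (still need full-text checks)
    fn: relevant records predicted irrelevant (missed studies)
    tn: irrelevant records correctly predicted irrelevant
    """
    total = tp + fp + fn + tn
    recall = tp / (tp + fn)              # share of relevant studies found
    precision = tp / (tp + fp)           # share of predicted-relevant that are relevant
    fnr = fn / (fn + tp)                 # share of relevant studies missed
    # One common definition of workload saving: records predicted irrelevant,
    # which the reviewer never screens manually
    workload_saving = (tn + fn) / total
    return {"recall": recall, "precision": precision,
            "false_negative_rate": fnr, "workload_saving": workload_saving}

# Hypothetical example: 90 relevant found, 10 missed, 200 false alarms, 700 excluded
m = screening_metrics(tp=90, fp=200, fn=10, tn=700)
```

With these made-up counts the sketch yields 90% recall, a 10% false negative rate and a 71% workload saving, illustrating the trade-off the abstract describes between workload saved and relevant studies missed.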
Affiliation(s)
- John Rathbone
- Centre for Research in Evidence-Based Practice, Bond University, Gold Coast, Australia.
- Tammy Hoffmann
- Centre for Research in Evidence-Based Practice, Bond University, Gold Coast, Australia.
- Paul Glasziou
- Centre for Research in Evidence-Based Practice, Bond University, Gold Coast, Australia.

26
Stewart GB, Higgins JPT, Schünemann H, Meader N. The use of Bayesian networks to assess the quality of evidence from research synthesis: 1. PLoS One 2015; 10:e0114497. [PMID: 25837450 PMCID: PMC4383525 DOI: 10.1371/journal.pone.0114497] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2013] [Accepted: 11/10/2014] [Indexed: 11/25/2022] Open
Abstract
Background The grades of recommendation, assessment, development and evaluation (GRADE) approach is widely implemented in systematic reviews, health technology assessment and guideline development organisations throughout the world. A key advantage of this approach is that it aids transparency regarding judgements on the quality of evidence. However, the intricacies of making judgements about research methodology and evidence make the GRADE system complex and challenging to apply without training. Methods We have developed a semi-automated quality assessment tool (SAQAT) based on GRADE. This is informed by reviewers' responses to checklist questions regarding characteristics that may lead to unreliability. These responses are then entered into a Bayesian network to ascertain the probabilities of risk of bias, inconsistency, indirectness, imprecision and publication bias conditional on review characteristics. The model then combines these probabilities to provide a probability for each of the GRADE overall quality categories. We tested the model using a range of plausible scenarios that guideline developers or review authors could encounter. Results Overall, the model reproduced GRADE judgements for a range of scenarios. Potential advantages over standard assessment are the use of explicit and consistent weightings for different review characteristics, forcing consideration of important but sometimes neglected characteristics, and principled downgrading where small but important probabilities of downgrading accrue across domains. Conclusions Bayesian networks have considerable potential for use as tools to assess the validity of research evidence. The key strength of such networks lies in the provision of a statistically coherent method for combining probabilities across a complex framework based on both belief and evidence.
In addition to providing tools for less experienced users to implement reliability assessment, the potential for sensitivity analyses and automation may be beneficial both for application and for the methodological development of reliability tools.
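The combination step described above — turning per-domain downgrade probabilities into a probability for each overall quality category — can be illustrated with a toy enumeration. This sketch assumes independent domains and made-up probabilities purely for illustration; the actual SAQAT Bayesian network encodes dependencies between review characteristics and uses its own conditional probabilities:

```python
from itertools import product

# Hypothetical per-domain probabilities of a serious concern (not SAQAT's numbers)
p_concern = {"risk_of_bias": 0.30, "inconsistency": 0.10,
             "indirectness": 0.05, "imprecision": 0.20, "publication_bias": 0.10}

# Enumerate every combination of concern/no-concern, accumulate its probability,
# and map the number of downgrades to a GRADE-style category (capped at "very low")
categories = ["high", "moderate", "low", "very low"]
probs = dict.fromkeys(categories, 0.0)
for outcome in product([0, 1], repeat=len(p_concern)):
    p = 1.0
    for (domain, pc), hit in zip(p_concern.items(), outcome):
        p *= pc if hit else (1 - pc)
    downgrades = min(sum(outcome), 3)
    probs[categories[downgrades]] += p
```

Because every combination is enumerated, the category probabilities sum to one; the "principled downgrading" the abstract mentions corresponds to probability mass accruing in lower categories even when no single domain is a certain concern.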
Affiliation(s)
- Gavin B. Stewart
- Centre for Reviews and Dissemination, University of York, York, United Kingdom
- Julian P. T. Higgins
- Centre for Reviews and Dissemination, University of York, York, United Kingdom
- School of Social and Community Medicine, University of Bristol, Bristol, United Kingdom
- Holger Schünemann
- Department of Clinical Epidemiology & Biostatistics, McMaster University Health Sciences Centre, Hamilton, ON, Canada
- Nick Meader
- Centre for Reviews and Dissemination, University of York, York, United Kingdom

27
Li T, Vedula SS, Hadar N, Parkin C, Lau J, Dickersin K. Innovations in data collection, management, and archiving for systematic reviews. Ann Intern Med 2015; 162:287-94. [PMID: 25686168 DOI: 10.7326/m14-1603] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Data abstraction is a key step in conducting systematic reviews because data collected from study reports form the basis of appropriate conclusions. Recent methodological standards and expectations highlight several principles for data collection. To support implementation of these standards, this article provides a step-by-step tutorial for selecting data collection tools; constructing data collection forms; and abstracting, managing, and archiving data for systematic reviews. Examples are drawn from recent experience using the Systematic Review Data Repository for data collection and management. If done well, data collection for systematic reviews need only be performed by one team, with the data placed into a publicly accessible database for future use. Technological innovations, such as the Systematic Review Data Repository, will contribute to finding trustworthy answers for many health and health care questions.
Affiliation(s)
- Tianjing Li
- From Center for Clinical Trials, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- S. Swaroop Vedula
- From Center for Clinical Trials, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- Nira Hadar
- From Center for Clinical Trials, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- Christopher Parkin
- From Center for Clinical Trials, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- Joseph Lau
- From Center for Clinical Trials, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- Kay Dickersin
- From Center for Clinical Trials, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland

28
Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev 2015. [PMID: 25588314 DOI: 10.1186/2046-4053-4-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
29
O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev 2015; 4:5. [PMID: 25588314 PMCID: PMC4320539 DOI: 10.1186/2046-4053-4-5] [Citation(s) in RCA: 262] [Impact Index Per Article: 29.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/07/2014] [Accepted: 12/10/2014] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND The large and growing number of published studies, and their increasing rate of publication, make the task of identifying relevant studies in an unbiased way for inclusion in systematic reviews both complex and time consuming. Text mining has been offered as a potential solution: by automating some of the screening process, reviewer time can be saved. The evidence base around the use of text mining for screening has not yet been pulled together systematically; this systematic review fills that research gap. Focusing mainly on non-technical issues, the review aims to increase awareness of the potential of these technologies and promote further collaborative research between the computer science and systematic review communities. METHODS Five research questions led our review: what is the state of the evidence base; how has workload reduction been evaluated; what are the purposes of semi-automation and how effective are they; how have key contextual problems of applying text mining to the systematic review field been addressed; and what challenges to implementation have emerged? We answered these questions using standard systematic review methods: systematic and exhaustive searching, quality-assured data extraction and a narrative synthesis of findings. RESULTS The evidence base is active and diverse; there is almost no replication between studies or collaboration between research teams and, whilst it is difficult to establish overall conclusions about best approaches, it is clear that efficiencies and reductions in workload are potentially achievable. On the whole, most studies suggested that a saving in workload of between 30% and 70% might be possible, though the saving is sometimes accompanied by the loss of 5% of relevant studies (i.e. 95% recall). CONCLUSIONS Using text mining to prioritise the order in which items are screened should be considered safe and ready for use in 'live' reviews.
The use of text mining as a 'second screener' may also be adopted cautiously. The use of text mining to eliminate studies automatically should be considered promising but not yet fully proven. In highly technical/clinical areas, it may be used with a high degree of confidence; in other disciplines, more developmental and evaluative work is needed.
Affiliation(s)
- Alison O’Mara-Eves
- Evidence for Policy and Practice Information and Coordinating (EPPI)-Centre, Social Science Research Unit, UCL Institute of Education, University of London, London, UK
- James Thomas
- Evidence for Policy and Practice Information and Coordinating (EPPI)-Centre, Social Science Research Unit, UCL Institute of Education, University of London, London, UK
- John McNaught
- The National Centre for Text Mining and School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- Makoto Miwa
- Toyota Technological Institute, 2-12-1 Hisakata, Tempaku-ku, Nagoya 468-8511, Japan
- Sophia Ananiadou
- The National Centre for Text Mining and School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK

30
Miwa M, Thomas J, O'Mara-Eves A, Ananiadou S. Reducing systematic review workload through certainty-based screening. J Biomed Inform 2014; 51:242-53. [PMID: 24954015 PMCID: PMC4199186 DOI: 10.1016/j.jbi.2014.06.005] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2013] [Revised: 06/04/2014] [Accepted: 06/07/2014] [Indexed: 11/19/2022]
Abstract
In systematic reviews, the growing number of published studies imposes a significant screening workload on reviewers. Active learning is a promising approach to reducing the workload by automating some of the screening decisions, but it has been evaluated for only a limited number of disciplines. The suitability of applying active learning to complex topics in disciplines such as social science has not been studied, and the selection of useful criteria and of enhancements to address the data imbalance problem in systematic reviews remains an open problem. We applied active learning with two criteria (certainty and uncertainty) and several enhancements in both a clinical medicine and a social science (specifically, public health) area, and compared the results. The results show that the certainty criterion is useful for finding relevant documents, and that weighting positive instances is promising for overcoming the data imbalance problem in both data sets. Latent Dirichlet allocation (LDA) is also shown to be promising when little manually assigned information is available. Active learning is effective for complex topics, although its efficiency is limited by the difficulties of text classification. The most promising criterion and weighting method are the same regardless of the review topic, and unsupervised techniques such as LDA may boost the performance of active learning without manual annotation.
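The certainty criterion described above — repeatedly asking the reviewer to screen the unlabelled document the model is most confident is relevant — can be sketched with a deliberately simple relevance scorer. The corpus, labels and nearest-centroid scorer here are hypothetical stand-ins for the paper's text classifier, used only to make the selection loop concrete:

```python
from collections import Counter
import math

def vec(text):
    """Bag-of-words vector as a Counter."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Toy corpus: 1 = relevant to the review, 0 = irrelevant (hypothetical abstracts)
docs = ["trial of statins for cardiovascular prevention",
        "cohort study of diet and heart disease risk",
        "survey of hospital management software",
        "review of retail accounting practices",
        "randomised trial of exercise and blood pressure",
        "case report of a rare skin condition"]
labels = [1, 1, 0, 0, 1, 0]

labelled = {0, 3}                       # seed set already screened by a human
pool = [i for i in range(len(docs)) if i not in labelled]

for _ in range(2):                      # two active-learning rounds
    # Score each unlabelled doc by similarity to the known-relevant centroid
    centroid = Counter()
    for i in labelled:
        if labels[i] == 1:
            centroid.update(vec(docs[i]))
    scores = {i: cosine(vec(docs[i]), centroid) for i in pool}
    # Certainty criterion: screen the doc the model is surest is relevant
    pick = max(pool, key=scores.get)
    labelled.add(pick)                  # reviewer labels it (simulated via `labels`)
    pool.remove(pick)
```

In this toy run both selected documents turn out to be relevant, which is the point of the certainty criterion: surface likely includes early rather than reduce model uncertainty.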
Affiliation(s)
- Makoto Miwa
- The National Centre for Text Mining and School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK; Toyota Technological Institute, 2-12-1 Hisakata, Tempaku-ku, Nagoya 468-8511, Japan.
- James Thomas
- Evidence for Policy and Practice Information and Coordinating (EPPI-)Centre, Social Science Research Unit, Institute of Education, University of London, London, UK.
- Alison O'Mara-Eves
- Evidence for Policy and Practice Information and Coordinating (EPPI-)Centre, Social Science Research Unit, Institute of Education, University of London, London, UK.
- Sophia Ananiadou
- The National Centre for Text Mining and School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK.

31
Li T, Saldanha IJ, Vedula SS, Yu T, Rosman L, Twose C, Goodman SN, Dickersin K. Learning by doing-teaching systematic review methods in 8 weeks. Res Synth Methods 2014; 5:254-63. [PMID: 26052850 DOI: 10.1002/jrsm.1111] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2013] [Revised: 12/30/2013] [Accepted: 01/06/2014] [Indexed: 11/06/2022]
Abstract
OBJECTIVE The objective of this paper is to describe the course "Systematic Reviews and Meta-analysis" at the Johns Hopkins Bloomberg School of Public Health. METHODS A distinct feature of our course is a group project in which students, assigned to multi-disciplinary groups, conduct a systematic review. In-class sessions comprise didactic lectures, hands-on exercises, demonstrations, discussion, and group work. Students also work outside of class to complete the systematic review. Students evaluated the course at the end of the term. We also surveyed students from 2004 to 2012 to learn more about the long-term impact of the course. RESULTS The course has been offered to more than 800 students since 1995. In our view, aspects that worked well include the hands-on approach, students working in multidisciplinary groups, intensive interaction with the teaching team, moving to an online approach, and continuous updates of the course content. A persistent issue is the constraint of time. Of 211 survey participants, 193 (91%) reported that the course is currently useful to them or has had an impact on their work. CONCLUSIONS Our experiences have led us to remain committed to a hands-on approach. Our course serves as a bridge between classroom learning and real-world practice, and provides an example of teaching systematic review methods.
Affiliation(s)
- Tianjing Li
- Center for Clinical Trials, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Ian J Saldanha
- Center for Clinical Trials, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- S Swaroop Vedula
- Center for Clinical Trials, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Tsung Yu
- Center for Clinical Trials, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Lori Rosman
- Center for Clinical Trials, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Claire Twose
- Center for Clinical Trials, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Steven N Goodman
- Center for Clinical Trials, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Kay Dickersin
- Center for Clinical Trials, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA

32
Elliott JH, Turner T, Clavisi O, Thomas J, Higgins JPT, Mavergames C, Gruen RL. Living systematic reviews: an emerging opportunity to narrow the evidence-practice gap. PLoS Med 2014; 11:e1001603. [PMID: 24558353 PMCID: PMC3928029 DOI: 10.1371/journal.pmed.1001603] [Citation(s) in RCA: 304] [Impact Index Per Article: 30.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open
Abstract
The current difficulties in keeping systematic reviews up to date lead to considerable inaccuracy, hampering the translation of knowledge into action. Incremental advances in conventional review updating are unlikely to lead to substantial improvements in review currency. A new approach is needed. We propose the living systematic review as a contribution to evidence synthesis that combines currency with rigour to enhance the accuracy and utility of health evidence. Living systematic reviews are high-quality, up-to-date online summaries of health research, updated as new research becomes available, and enabled by improved production efficiency and adherence to the norms of scholarly communication. Together with innovations in primary research reporting and in the creation and use of evidence in health systems, the living systematic review contributes to an emerging evidence ecosystem.
Affiliation(s)
- Julian H. Elliott
- Department of Infectious Diseases, Alfred Hospital and Monash University, Melbourne, Australia
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Tari Turner
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- World Vision Australia, Melbourne, Australia
- Ornella Clavisi
- National Trauma Research Institute, Alfred Hospital, Melbourne, Australia
- James Thomas
- EPPI-Centre, Institute of Education, University of London, London, England
- Julian P. T. Higgins
- School of Social and Community Medicine, University of Bristol, Bristol, England
- Centre for Reviews and Dissemination, University of York, York, England
- Chris Mavergames
- Informatics and Knowledge Management Department, The Cochrane Collaboration, Freiburg, Germany
- Russell L. Gruen
- National Trauma Research Institute, Alfred Hospital, Melbourne, Australia
- Department of Surgery, Monash University, Melbourne, Australia

33
Elliott J, Sim I, Thomas J, Owens N, Dooley G, Riis J, Wallace B, Thomas J, Noel-Storr A, Rada G, Struthers C, Howe T, MacLehose H, Brandt L, Kunnamo I, Mavergames C. #CochraneTech: technology and the future of systematic reviews. Cochrane Database Syst Rev 2014; 2014:ED000091. [PMID: 25288182 PMCID: PMC10845870 DOI: 10.1002/14651858.ed000091] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
34
Shemilt I, Simon A, Hollands GJ, Marteau TM, Ogilvie D, O'Mara-Eves A, Kelly MP, Thomas J. Pinpointing needles in giant haystacks: use of text mining to reduce impractical screening workload in extremely large scoping reviews. Res Synth Methods 2013; 5:31-49. [PMID: 26054024 DOI: 10.1002/jrsm.1093] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2013] [Revised: 06/10/2013] [Accepted: 06/29/2013] [Indexed: 02/03/2023]
Abstract
In scoping reviews, boundaries of relevant evidence may be initially fuzzy, with refined conceptual understanding of interventions and their proposed mechanisms of action an intended output of the scoping process rather than its starting point. Electronic searches are therefore sensitive, often retrieving very large record sets that are impractical to screen in their entirety. This paper describes methods for applying and evaluating the use of text mining (TM) technologies to reduce impractical screening workload in reviews, using examples of two extremely large-scale scoping reviews of public health evidence (choice architecture (CA) and economic environment (EE)). Electronic searches retrieved >800,000 (CA) and >1 million (EE) records. TM technologies were used to prioritise records for manual screening. TM performance was measured prospectively. TM reduced manual screening workload by 90% (CA) and 88% (EE) compared with conventional screening (absolute reductions of ≈430,000 (CA) and ≈378,000 (EE) records). This study expands an emerging corpus of empirical evidence for the use of TM to expedite study selection in reviews. By reducing screening workload to manageable levels, TM made it possible to assemble and configure large, complex evidence bases that crossed research discipline boundaries. These methods are transferable to other scoping and systematic reviews incorporating conceptual development or explanatory dimensions.
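Screening prioritisation of the kind described above is typically evaluated by ranking records on classifier score and counting how many never need manual screening once a recall target is met. A minimal sketch with hypothetical scores and labels (not the reviews' data):

```python
import numpy as np

def workload_saved_at_recall(scores, relevant, target_recall=0.95):
    """Fraction of records left unscreened when records are screened in
    descending classifier-score order until the recall target is reached."""
    order = np.argsort(scores)[::-1]                    # highest score first
    rel = np.asarray(relevant, dtype=bool)[order]
    needed = int(np.ceil(target_recall * rel.sum()))    # relevant records to find
    found = np.cumsum(rel)                              # relevant found after each record
    screened = int(np.searchsorted(found, needed)) + 1  # records screened to hit target
    return 1.0 - screened / len(scores)

# Hypothetical classifier scores for 10 records, 2 of them relevant
scores = [0.9, 0.1, 0.8, 0.2, 0.3, 0.05, 0.7, 0.15, 0.4, 0.25]
relevant = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
ws = workload_saved_at_recall(scores, relevant, target_recall=1.0)
```

Here both relevant records sit near the top of the ranking, so screening stops after 3 of 10 records, a 70% workload saving even at 100% recall; a weaker ranking pushes the stopping point, and the saving, down.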
Affiliation(s)
- Ian Shemilt
- Behaviour and Health Research Unit, University of Cambridge, Cambridge, UK
- Antonia Simon
- Thomas Coram Research Unit, Department of Children and Health, Institute of Education, London, UK
- Gareth J Hollands
- Behaviour and Health Research Unit, University of Cambridge, Cambridge, UK
- Theresa M Marteau
- Behaviour and Health Research Unit, University of Cambridge, Cambridge, UK
- David Ogilvie
- Behaviour and Health Research Unit, University of Cambridge, Cambridge, UK
- Alison O'Mara-Eves
- Evidence for Policy and Practice Information and Co-ordinating Centre, Department of Children and Health, Institute of Education, London, UK
- Michael P Kelly
- Centre for Public Health, National Institute for Health and Care Excellence, London, UK
- James Thomas
- Evidence for Policy and Practice Information and Co-ordinating Centre, Department of Children and Health, Institute of Education, London, UK

35
Wallace BC, Dahabreh IJ, Schmid CH, Lau J, Trikalinos TA. Modernizing the systematic review process to inform comparative effectiveness: tools and methods. J Comp Eff Res 2013; 2:273-82. [PMID: 24236626 DOI: 10.2217/cer.13.17] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022] Open
Abstract
Systematic reviews are being increasingly used to inform all levels of healthcare, from bedside decisions to policy-making. Because they are designed to minimize bias and subjectivity, they are a preferred option for assessing the comparative effectiveness and safety of healthcare interventions. However, producing systematic reviews and keeping them up to date is becoming increasingly onerous, for three reasons. First, the body of biomedical literature is expanding exponentially with no indication of slowing down. Second, as systematic reviews gain wide acceptance, they are also being used to address more complex questions (e.g., evaluating the comparative effectiveness of many interventions together rather than focusing only on pairs of interventions). Third, the standards for performing systematic reviews have become substantially more rigorous over time. To address these challenges, we must carefully prioritize the questions that should be addressed by systematic reviews and optimize the processes of research synthesis. In addition to reducing the workload involved in planning and conducting systematic reviews, we also need to increase the transparency, reliability and validity of the review process; these aims can be grouped under the umbrella of 'modernization' of the systematic review process.
Affiliation(s)
- Byron C Wallace
- Center for Evidence-Based Medicine, Program in Public Health, Brown University, Providence, RI 02906, USA
36
Lill CM, Bertram L. Developing the "next generation" of genetic association databases for complex diseases. Hum Mutat 2012; 33:1366-72. [PMID: 22752977 DOI: 10.1002/humu.22149] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2012] [Accepted: 06/06/2012] [Indexed: 11/10/2022]
Abstract
Tens of thousands of genetic association studies investigating the influence of common polymorphisms on disease susceptibility have been published to date. These include ∼1,000 genome-wide association studies (GWAS). This vast amount of data in the field of complex genetics is becoming increasingly difficult to follow and interpret. It can be expected that the situation will become even more complex with the advent of association projects using "next-generation" technologies. One of the aims of the Human Variome Project is to concatenate such data in meaningful ways, for example, within the context of publicly available field synopses. Here, we present various examples of online genetic association databases developed by our group for neuropsychiatric disorders. One integral part of this model is the systematic inclusion of data from large-scale genotyping projects, for example, GWAS, while respecting the privacy of data contributors. We believe that our database approach may serve as a viable model that can be readily applied to other fields and ultimately improve our understanding of the genetic forces driving common human conditions.
Affiliation(s)
- Christina M Lill
- Neuropsychiatric Genetics Group, Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
37
Abstract
Three articles in this issue of Genetics in Medicine describe examples of "knowledge integration," involving methods for generating and synthesizing rapidly emerging information on health-related genomic technologies and engaging stakeholders around the evidence. Knowledge integration, the central process in translating genomic research, involves three closely related, iterative components: knowledge management, knowledge synthesis, and knowledge translation. Knowledge management is the ongoing process of obtaining, organizing, and displaying evolving evidence. For example, horizon scanning and "infoveillance" use emerging technologies to scan databases, registries, publications, and cyberspace for information on genomic applications. Knowledge synthesis is the process of conducting systematic reviews using a priori rules of evidence. For example, methods including meta-analysis, decision analysis, and modeling can be used to combine information from basic, clinical, and population research. Knowledge translation refers to stakeholder engagement and brokering to influence policy, guidelines and recommendations, as well as the research agenda to close knowledge gaps. The ultrarapid production of information requires adequate public and private resources for knowledge integration to support the evidence-based development of genomic medicine.
Affiliation(s)
- Muin J Khoury
- Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, GA, USA.