1
Byrne F, Hofstee L, Teijema J, De Bruin J, van de Schoot R. Impact of active learning model and prior knowledge on discovery time of elusive relevant papers: a simulation study. Syst Rev 2024; 13:175. PMID: 38978084; PMCID: PMC11232241; DOI: 10.1186/s13643-024-02587-0.
Abstract
Software that employs screening prioritization through active learning (AL) has accelerated the screening process significantly by ranking an unordered set of records by their predicted relevance. However, failing to find a relevant paper might alter the findings of a systematic review, highlighting the importance of identifying elusive papers. The time to discovery (TD) measures how many records must be screened before a relevant paper is found, making it a helpful tool for detecting such papers. The main aim of this project was to investigate how the choice of model and prior knowledge influence the TD values of hard-to-find relevant papers and their rank orders. A simulation study was conducted, mimicking the screening process on a dataset containing titles, abstracts, and labels used for an already published systematic review. The results demonstrated that the choice of AL model, and in particular the choice of feature extractor, significantly influenced the TD values and the rank order of the elusive relevant papers, whereas the choice of prior knowledge did not. Future research should examine the characteristics of elusive relevant papers to discover why they might take a long time to be found.
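The TD statistic described in this abstract can be computed directly from the order in which an AL model presents records. A minimal sketch, assuming labels are known from a finished review (record ids and function name are illustrative, not the authors' code):

```python
def time_to_discovery(screening_order, relevant_ids):
    """Number of records screened before each relevant record is found,
    keyed by record id (the TD statistic)."""
    td = {}
    for position, record_id in enumerate(screening_order, start=1):
        if record_id in relevant_ids:
            td[record_id] = position
    return td

# Toy example: 8 records ranked by an AL model; 3 are relevant.
order = ["r3", "r1", "r7", "r2", "r5", "r4", "r8", "r6"]
td = time_to_discovery(order, {"r1", "r5", "r6"})
# r1 is found after 2 screens, r5 after 5; r6 (TD = 8) is the elusive one
```

Comparing these per-record TD values across models and across choices of prior knowledge is exactly the kind of analysis the simulation study performs.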
Affiliation(s)
- Fionn Byrne
- Department of Information and Computing Science, Faculty of Science, Utrecht University, Utrecht, The Netherlands
- Laura Hofstee
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
- Jelle Teijema
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
- Jonathan De Bruin
- Research and Data Management Services, Utrecht University, Utrecht, The Netherlands
- Rens van de Schoot
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
2
Tóth B, Berek L, Gulácsi L, Péntek M, Zrubka Z. Automation of systematic reviews of biomedical literature: a scoping review of studies indexed in PubMed. Syst Rev 2024; 13:174. PMID: 38978132; PMCID: PMC11229257; DOI: 10.1186/s13643-024-02592-3.
Abstract
BACKGROUND The demand for high-quality systematic literature reviews (SRs) for evidence-based medical decision-making is growing. SRs are costly and require the scarce resource of highly skilled reviewers. Automation technology has been proposed to save workload and expedite the SR workflow. We aimed to provide a comprehensive overview of SR automation studies indexed in PubMed, focusing on the applicability of these technologies in real-world practice. METHODS In November 2022, we extracted, combined, and ran an integrated PubMed search for SRs on SR automation. Full-text English peer-reviewed articles were included if they reported studies on SR automation methods (SSAM) or automated SRs (ASR). Bibliographic analyses and knowledge-discovery studies were excluded. Record screening was performed by single reviewers, and the selection of full-text papers was performed in duplicate. We summarized the publication details, automated review stages, automation goals, applied tools, data sources, methods, results, and Google Scholar citations of SR automation studies. RESULTS From 5321 records screened by title and abstract, we included 123 full-text articles, of which 108 were SSAM and 15 were ASR. Automation was applied for search (19/123, 15.4%), record screening (89/123, 72.4%), full-text selection (6/123, 4.9%), data extraction (13/123, 10.6%), risk of bias assessment (9/123, 7.3%), evidence synthesis (2/123, 1.6%), assessment of evidence quality (2/123, 1.6%), and reporting (2/123, 1.6%). Multiple SR stages were automated by 11 (8.9%) studies. The performance of automated record screening varied widely across SR topics. In published ASRs, we found examples of automated search, record screening, full-text selection, and data extraction. In some ASRs, automation fully complemented manual reviews to increase sensitivity rather than to save workload. Reporting of automation details was often incomplete in ASRs.
CONCLUSIONS Automation techniques are being developed for all SR stages, but with limited real-world adoption. Most SR automation tools target single SR stages, with modest time savings for the entire SR process and varying sensitivity and specificity across studies. Therefore, the real-world benefits of SR automation remain uncertain. Standardizing the terminology, reporting, and metrics of study reports could enhance the adoption of SR automation techniques in real-world practice.
Affiliation(s)
- Barbara Tóth
- Doctoral School of Innovation Management, Óbuda University, Bécsi út 96/B, Budapest, 1034, Hungary
- László Berek
- Doctoral School for Safety and Security, Óbuda University, Bécsi út 96/B, Budapest, 1034, Hungary
- University Library, Óbuda University, Bécsi út 96/B, Budapest, 1034, Hungary
- László Gulácsi
- HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Bécsi út 96/B, Budapest, 1034, Hungary
- Márta Péntek
- HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Bécsi út 96/B, Budapest, 1034, Hungary
- Zsombor Zrubka
- HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Bécsi út 96/B, Budapest, 1034, Hungary
3
Soares A, Schilling LM, Richardson J, Kommadi B, Subbian V, Dehnbostel J, Shahin K, Robinson KA, Afzal M, Lehmann HP, Kunnamo I, Alper BS. Making Science Computable Using Evidence-Based Medicine on Fast Healthcare Interoperability Resources: Standards Development Project. J Med Internet Res 2024; 26:e54265. PMID: 38916936; PMCID: PMC11234056; DOI: 10.2196/54265.
Abstract
BACKGROUND Evidence-based medicine (EBM) has the potential to improve health outcomes, but EBM has not been widely integrated into the systems used for research or clinical decision-making. There has not been a scalable and reusable computer-readable standard for distributing research results and synthesized evidence among creators, implementers, and the ultimate users of that evidence. Evidence that is more rapidly updated, synthesized, disseminated, and implemented would improve both the delivery of EBM and evidence-based health care policy. OBJECTIVE This study aimed to introduce the EBM on Fast Healthcare Interoperability Resources (FHIR) project (EBMonFHIR), which is extending the methods and infrastructure of Health Level Seven (HL7) FHIR to provide an interoperability standard for the electronic exchange of health-related scientific knowledge. METHODS As an ongoing process, the project creates and refines FHIR resources to represent evidence from clinical studies and syntheses of those studies and develops tools to assist with the creation and visualization of FHIR resources. RESULTS The EBMonFHIR project created FHIR resources (ie, ArtifactAssessment, Citation, Evidence, EvidenceReport, and EvidenceVariable) for representing evidence. The COVID-19 Knowledge Accelerator (COKA) project, now Health Evidence Knowledge Accelerator (HEvKA), took this work further and created FHIR resources that express EvidenceReport, Citation, and ArtifactAssessment concepts. The group is (1) continually refining FHIR resources to support the representation of EBM; (2) developing controlled terminology related to EBM (ie, study design, statistic type, statistical model, and risk of bias); and (3) developing tools to facilitate the visualization and data entry of EBM information into FHIR resources, including human-readable interfaces and JSON viewers. 
CONCLUSIONS EBMonFHIR resources in conjunction with other FHIR resources can support relaying EBM components in a manner that is interoperable and consumable by downstream tools and health information technology systems to support the users of evidence.
Affiliation(s)
- Andrey Soares
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
- Lisa M Schilling
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
- Joshua Richardson
- Center for Informatics, Research Triangle Institute International, Berkeley, CA, United States
- Bhagvan Kommadi
- Quantica Computacao, Hyderabad, India
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- Vignesh Subbian
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- College of Public Health, Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, United States
- Joanne Dehnbostel
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- Computable Publishing LLC, Franklin, NC, United States
- Khalid Shahin
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- Computable Publishing LLC, Franklin, NC, United States
- Karen A Robinson
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States
- Muhammad Afzal
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- Department of Computing and Data Science, Birmingham City University, England, United Kingdom
- Harold P Lehmann
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States
- Ilkka Kunnamo
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- Duodecim Publishing Company Ltd, Helsinki, Finland
- Brian S Alper
- Scientific Knowledge Accelerator Foundation, Franklin, NC, United States
- Computable Publishing LLC, Franklin, NC, United States
4
Rogers M, Sutton A, Campbell F, Whear R, Bethel A, Coon JT. Streamlining search methods to update evidence and gap maps: a case study using intergenerational interventions. Campbell Systematic Reviews 2024; 20:e1380. PMID: 38188228; PMCID: PMC10771710; DOI: 10.1002/cl2.1380.
Abstract
Background Evidence and Gap Maps (EGMs) should be regularly updated. Running update searches to find new studies for EGMs can be a time-consuming process. Search Summary Tables (SSTs) can help streamline searches by identifying which resources were most lucrative for identifying relevant articles, and which were redundant. The aim of this study was to use an SST to streamline search methods for an EGM of studies about intergenerational activities. Methods To produce the EGM, 15 databases were searched. 8638 records were screened and 500 studies were included in the final EGM. Using an SST, we determined which databases and search methods were the most efficient in terms of sensitivity and specificity for finding the included studies. We also investigated whether any database performed particularly well for returning particular study types. For the best-performing databases we analysed the search terms used to streamline the strategies. Results No single database returned all of the studies included in the EGM. Out of 500 studies, PsycINFO returned 40% (n = 202), CINAHL 39% (n = 194), Ageline 25% (n = 174), MEDLINE 23% (n = 117), ERIC 20% (n = 100) and Embase 19% (n = 98). The HMIC database and Conference Proceedings Citation Index-Science via Web of Science returned no studies that were included in the EGM. ProQuest Dissertations & Theses (PQDT) returned the highest number of unique studies (n = 42), followed by ERIC (n = 33) and Ageline (n = 29). Ageline returned the most randomised controlled trials (42%), followed by CINAHL (34%), MEDLINE (29%) and CENTRAL (29%). CINAHL, Ageline, MEDLINE and PsycINFO performed best for locating systematic reviews (62%, 46% and 42% respectively). CINAHL, PsycINFO and Ageline performed best for qualitative studies (41%, 40% and 34%). The Journal of Intergenerational Relationships returned more included studies than any other journal (16%).
No combinations of search terms were found to be better in terms of balancing specificity and sensitivity than the original search strategies. However, strategies could be reduced considerably in terms of length without losing key, unique studies. Conclusion Using SSTs we have developed a method for streamlining update searches for an EGM about intergenerational activities. For future updates we recommend that MEDLINE, PsycINFO, ERIC, Ageline, CINAHL and PQDT are searched. These searches should be supplemented by hand-searching the Journal of Intergenerational Relationships and carrying out backwards citation chasing on new systematic reviews. Using SSTs to analyse database efficiency could be a useful method to help streamline search updates for other EGMs.
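The SST analysis of database yield can be reproduced mechanically once the set of included studies and each database's hits are known. A minimal sketch (database names and study ids are illustrative, not the study's data):

```python
def search_summary(db_hits, included):
    """For each database: the share of included studies it returned
    (sensitivity) and the included studies only it found (unique yield)."""
    summary = {}
    for db, hits in db_hits.items():
        found = hits & included
        others = set().union(*(h for d, h in db_hits.items() if d != db))
        summary[db] = {"sensitivity": len(found) / len(included),
                       "unique": found - others}
    return summary

# Toy example: three databases, four included studies.
included = {"s1", "s2", "s3", "s4"}
db_hits = {
    "MEDLINE": {"s1", "s2", "s3"},
    "PsycINFO": {"s2", "s3"},
    "PQDT": {"s4"},
}
report = search_summary(db_hits, included)
# PQDT has the lowest sensitivity but contributes the only copy of s4
```

A database with zero unique yield (here PsycINFO) is a candidate for dropping from update searches, which is the streamlining decision the SST supports.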
Affiliation(s)
- Morwenna Rogers
- Evidence Synthesis Team, NIHR ARC South West Peninsula (PenARC), University of Exeter Medical School, Exeter, UK
- Anthea Sutton
- SCHARR, University of Sheffield, Regent Court, Sheffield, UK
- Fiona Campbell
- Population Health Sciences Institute, Newcastle University, Newcastle, UK
- Rebecca Whear
- Evidence Synthesis Team, NIHR ARC South West Peninsula (PenARC), University of Exeter Medical School, Exeter, UK
- Alison Bethel
- Evidence Synthesis Team, NIHR ARC South West Peninsula (PenARC), University of Exeter Medical School, Exeter, UK
- Jo Thompson Coon
- Evidence Synthesis Team, NIHR ARC South West Peninsula (PenARC), University of Exeter Medical School, Exeter, UK
5
Bidonde J, Meneses-Echavez JF, Hafstad E, Brunborg GS, Bang L. Methods, strategies, and incentives to increase response to mental health surveys among adolescents: a systematic review. BMC Med Res Methodol 2023; 23:270. PMID: 37974067; PMCID: PMC10652438; DOI: 10.1186/s12874-023-02096-z.
Abstract
BACKGROUND This systematic review aimed to identify effective methods to increase adolescents' response to surveys about mental health and substance use, in order to improve the quality of survey information. METHODS We followed a protocol and searched for studies that compared different survey delivery modes to adolescents. Eligible studies reported response rates, mental health score variation per survey mode, and participant variations in mental health scores. We searched CENTRAL, PsycINFO, MEDLINE and Scopus in May 2022, and conducted citation searches in June 2022. Two reviewers independently undertook study selection, data extraction, and risk of bias assessments. Following the assessment of heterogeneity, some studies were pooled using meta-analysis. RESULTS Fifteen studies were identified, reporting six comparisons related to survey methods and strategies. Results indicate that response rates do not differ between survey modes (e.g., web versus paper-and-pencil) delivered in classroom settings. However, web surveys may yield higher response rates outside classroom settings. The largest effects on response rates were achieved using unconditional monetary incentives and obtaining passive parental consent. Survey mode influenced mental health scores in certain comparisons. CONCLUSIONS Despite the mixed quality of the studies, the low volume of evidence for some comparisons, and the restriction to studies in high-income countries, several effective methods and strategies to improve adolescents' response rates to mental health surveys were identified.
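Where comparisons were pooled by meta-analysis, the basic machinery is inverse-variance weighting. A hedged sketch, assuming a fixed-effect model with illustrative effect estimates (the review's own data and model choice are not reproduced here):

```python
import math

def fixed_effect_pool(effects, variances):
    """Inverse-variance fixed-effect pooled estimate with a 95% CI."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Toy example: risk differences in response rate from three trials,
# with made-up variances.
estimate, ci = fixed_effect_pool([0.10, 0.05, 0.08], [0.001, 0.002, 0.004])
# estimate ≈ 0.083, i.e. roughly an 8-point higher response rate
```

Heterogeneity assessment, which the review performed before pooling, determines whether this fixed-effect assumption is defensible or a random-effects model is needed instead.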
Affiliation(s)
- Julia Bidonde
- Division of Health Services, Norwegian Institute of Public Health, Oslo, Norway
- Jose F Meneses-Echavez
- Division of Health Services, Norwegian Institute of Public Health, Oslo, Norway
- Facultad de Cultura Física, Deporte, y Recreación, Universidad Santo Tomás, Bogotá, Colombia
- Elisabet Hafstad
- Division of Health Services, Norwegian Institute of Public Health, Oslo, Norway
- Geir Scott Brunborg
- Department of Child Health and Development, Norwegian Institute of Public Health, Oslo, Norway
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Lasse Bang
- Department of Child Health and Development, Norwegian Institute of Public Health, Oslo, Norway
6
Meneses-Echavez JF, Chavez Guapo N, Loaiza-Betancur AF, Machado A, Bidonde J. Pulmonary rehabilitation for acute exacerbations of COPD: a systematic review. Respir Med 2023; 219:107425. PMID: 37858727; DOI: 10.1016/j.rmed.2023.107425.
Abstract
INTRODUCTION AND OBJECTIVES This systematic review summarized the evidence on the effects (benefits and harms) of pulmonary rehabilitation for individuals with acute exacerbations of chronic obstructive pulmonary disease (AECOPD). MATERIAL AND METHODS We included randomized controlled trials comparing pulmonary rehabilitation to either active interventions or usual care, regardless of setting. In March 2022, we searched MEDLINE, Scopus, CENTRAL, CINAHL and Web of Science, as well as trial registries. Record screening, data extraction and risk of bias assessment were undertaken by two reviewers. We assessed the certainty of the evidence using the GRADE approach. RESULTS This systematic review included 18 studies (n = 1465), conducted in mixed settings (8 studies), inpatient settings (8 studies), and outpatient settings (2 studies). The studies were at high risk of performance, detection, and reporting biases. Compared to usual care, pulmonary rehabilitation probably reduces AECOPD-related hospital readmissions (relative risk 0.56, 95% CI 0.36 to 0.86; moderate certainty evidence) and probably improves cardiovascular submaximal capacity (standardized mean difference 0.73, 95% CI 0.48 to 0.99; moderate certainty evidence). Low certainty evidence suggests that pulmonary rehabilitation may have beneficial effects on re-exacerbations, dyspnoea, and the impact of disease. The evidence regarding the effects of pulmonary rehabilitation on health-related quality of life and mortality is very uncertain (very low certainty evidence). CONCLUSION Our results indicate that pulmonary rehabilitation may be an effective treatment option for individuals with AECOPD, irrespective of setting. Our certainty in this evidence base was limited by small studies, heterogeneous rehabilitation programs, numerous methodological weaknesses, and poor reporting of findings that were inconsistent with each other. Trialists should adhere to the latest reporting standards to strengthen this body of evidence.
REGISTRATION The study protocol was registered in Open Science Framework (https://osf.io/amgbz/).
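Effects such as the readmission result above are reported as relative risks with 95% CIs, which can be derived from arm-level event counts. A sketch with toy counts (not the review's data):

```python
import math

def relative_risk(events_tx, n_tx, events_ctl, n_ctl):
    """Relative risk of an event (e.g. AECOPD-related readmission) in
    the intervention arm vs control, with a 95% CI on the log scale."""
    rr = (events_tx / n_tx) / (events_ctl / n_ctl)
    se_log = math.sqrt(1/events_tx - 1/n_tx + 1/events_ctl - 1/n_ctl)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, (lower, upper)

# Toy counts (not the review's data): 14/100 readmitted vs 25/100.
rr, ci = relative_risk(14, 100, 25, 100)
# rr = 0.56; whether the CI excludes 1 indicates statistical significance
```

Note that with these toy counts the CI crosses 1, illustrating how the same point estimate can carry very different certainty depending on sample size.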
Affiliation(s)
- Jose F Meneses-Echavez
- Norwegian Institute of Public Health, Oslo, Norway; Facultad de Cultura Física, Deporte y Recreación, Universidad Santo Tomás, Bogotá, Colombia.
- Nathaly Chavez Guapo
- Facultad de Cultura Física, Deporte y Recreación, Universidad Santo Tomás, Bogotá, Colombia
- Andrés Felipe Loaiza-Betancur
- Instituto Universitario de Educación Física, Universidad de Antioquia, Medellín, Colombia; Grupo de Investigación en Entrenamiento Deportivo y Actividad Física para la Salud (GIEDAF), Universidad Santo Tomás, Tunja, Colombia
- Ana Machado
- Respiratory Research and Rehabilitation Laboratory (Lab3R), School of Health Sciences (ESSUA), University of Aveiro, Aveiro, Portugal
- Julia Bidonde
- Norwegian Institute of Public Health, Oslo, Norway; School of Rehabilitation Sciences, University of Saskatchewan, Canada
7
Roth S, Wermer-Colan A. Machine Learning Methods for Systematic Reviews: A Rapid Scoping Review. Dela J Public Health 2023; 9:40-47. PMID: 38173960; PMCID: PMC10759980; DOI: 10.32481/djph.2023.11.008.
Abstract
Objective At the forefront of machine learning research since its inception has been natural language processing, also known as text mining, referring to a wide range of statistical processes for analyzing textual data and retrieving information. In medical fields, text mining has made valuable contributions in unexpected ways, not least by synthesizing data from disparate biomedical studies. This rapid scoping review examines how machine learning methods for text mining can be implemented at the intersection of these disparate fields to improve the workflow and process of conducting systematic reviews in medical research and related academic disciplines. Methods The primary research question of this investigation was: what impact does the use of machine learning have on the methods used by systematic review teams to carry out the systematic review process, such as the precision of search strategies, unbiased article selection, or data abstraction and/or analysis for systematic reviews and other comprehensive review types of similar methodology? A literature search was conducted by a medical librarian utilizing multiple databases, a grey literature search and handsearching of the literature. The search was completed on December 4, 2020. Handsearching was done on an ongoing basis with an end date of April 14, 2023. Results The search yielded 23,190 studies after duplicates were removed. Of these, 117 studies (1.70%) met the eligibility criteria for inclusion in this rapid scoping review. Conclusions Several techniques and types of machine learning methods are in development, or have already been fully developed, to assist with the systematic review stages. Combined with human intelligence, these machine learning methods and tools hold promise for making the systematic review process more efficient, saving valuable time for systematic review authors, and increasing the speed with which evidence can be created and placed in the hands of decision makers and the public.
Affiliation(s)
- Stephanie Roth
- Medical Librarian, Lewis B. Flinn Medical Library, ChristianaCare
- Alex Wermer-Colan
- Academic Director, Loretta C. Duckworth Scholars Studio, Temple University Libraries
8
Ferdinands G, Schram R, de Bruin J, Bagheri A, Oberski DL, Tummers L, Teijema JJ, van de Schoot R. Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records. Syst Rev 2023; 12:100. PMID: 37340494; PMCID: PMC10280866; DOI: 10.1186/s13643-023-02257-7.
Abstract
BACKGROUND Conducting a systematic review demands a significant amount of effort in screening titles and abstracts. To accelerate this process, various tools that utilize active learning have been proposed. These tools allow the reviewer to interact with machine learning software to identify relevant publications as early as possible. The goal of this study is to gain a comprehensive understanding of active learning models for reducing the workload in systematic reviews through a simulation study. METHODS The simulation study mimics the process of a human reviewer screening records while interacting with an active learning model. Different active learning models were compared based on four classification techniques (naive Bayes, logistic regression, support vector machines, and random forest) and two feature extraction strategies (TF-IDF and doc2vec). The performance of the models was compared for six systematic review datasets from different research areas. The evaluation of the models was based on the Work Saved over Sampling (WSS) and recall. Additionally, this study introduces two new statistics, Time to Discovery (TD) and Average Time to Discovery (ATD). RESULTS The models reduce the number of publications needed to screen by 63.9% to 91.7% while still finding 95% of all relevant records (WSS@95). Recall of the models was defined as the proportion of relevant records found after screening 10% of all records, and ranges from 53.6% to 99.8%. The ATD values range from 1.4% to 11.7% and indicate the average proportion of labeling decisions the researcher needs to make to detect a relevant record. The ATD values display a similar ranking across the simulations as the recall and WSS values. CONCLUSIONS Active learning models for screening prioritization demonstrate significant potential for reducing the workload in systematic reviews. The naive Bayes + TF-IDF model yielded the best results overall. The Average Time to Discovery (ATD) measures the performance of active learning models throughout the entire screening process without the need for an arbitrary cut-off point. This makes the ATD a promising metric for comparing the performance of different models across different datasets.
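The WSS and ATD metrics described above can be computed from a simulated screening order plus the known labels. A minimal sketch (illustrative, not the ASReview implementation):

```python
def wss_at(ranking_labels, recall_level=0.95):
    """Work Saved over Sampling: screening effort avoided, relative to
    random ordering, when stopping once `recall_level` of the relevant
    records have been seen in the ranked list."""
    total = len(ranking_labels)
    n_relevant = sum(ranking_labels)
    seen = 0
    for i, label in enumerate(ranking_labels, start=1):
        seen += label
        if seen >= recall_level * n_relevant:
            return (total - i) / total - (1 - recall_level)
    return 0.0

def average_time_to_discovery(ranking_labels):
    """Mean proportion of records screened before each relevant record
    is found (lower is better)."""
    total = len(ranking_labels)
    positions = [i for i, lab in enumerate(ranking_labels, start=1) if lab]
    return sum(p / total for p in positions) / len(positions)

# Toy ranking of 10 records (1 = relevant) as ordered by an AL model.
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
wss = wss_at(labels)                     # ≈ 0.55
atd = average_time_to_discovery(labels)  # ≈ 0.233
```

Unlike WSS@95, the ATD averages over every relevant record's position, which is why it needs no cut-off point.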
Affiliation(s)
- Gerbrich Ferdinands
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
- Raoul Schram
- Department of Research and Data Management Services, Information Technology Services, Utrecht University, Utrecht, The Netherlands
- Jonathan de Bruin
- Department of Research and Data Management Services, Information Technology Services, Utrecht University, Utrecht, The Netherlands
- Ayoub Bagheri
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
- Daniel L Oberski
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
- Lars Tummers
- School of Governance, Faculty of Law, Economics and Governance, Utrecht University, Utrecht, The Netherlands
- Jelle Jasper Teijema
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
- Rens van de Schoot
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
9
Oliveira Dos Santos Á, Sergio da Silva E, Machado Couto L, Valadares Labanca Reis G, Silva Belo V. The use of artificial intelligence for automating or semi-automating biomedical literature analyses: a scoping review. J Biomed Inform 2023; 142:104389. PMID: 37187321; DOI: 10.1016/j.jbi.2023.104389.
Abstract
OBJECTIVE Evidence-based medicine (EBM) is a decision-making process based on the conscious and judicious use of the best available scientific evidence. However, the exponential increase in the amount of information currently available likely exceeds the capacity of human-only analysis. In this context, artificial intelligence (AI) and its branches such as machine learning (ML) can be used to facilitate human efforts in analyzing the literature to foster EBM. The present scoping review aimed to examine the use of AI in the automation of biomedical literature survey and analysis, with a view to establishing the state of the art and identifying knowledge gaps. MATERIALS AND METHODS Comprehensive searches of the main databases were performed for articles published up to June 2022, and studies were selected according to inclusion and exclusion criteria. Data were extracted from the included articles and the findings categorized. RESULTS The total number of records retrieved from the databases was 12,145, of which 273 were included in the review. Classification of the studies according to the use of AI in evaluating the biomedical literature revealed three main application groups, namely assembly of scientific evidence (n=127; 47%), mining the biomedical literature (n=112; 41%) and quality analysis (n=34; 12%). Most studies addressed the preparation of systematic reviews, while articles focusing on the development of guidelines and evidence synthesis were the least frequent. The biggest knowledge gap was identified within the quality analysis group, particularly regarding methods and tools that assess the strength of recommendation and consistency of evidence.
CONCLUSION Our review shows that, despite significant progress in the automation of biomedical literature surveys and analyses in recent years, intense research is needed to fill knowledge gaps on more difficult aspects of ML, deep learning and natural language processing, and to consolidate the use of automation by end-users (biomedical researchers and healthcare professionals).
Affiliation(s)
- Eduardo Sergio da Silva
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
- Letícia Machado Couto
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
- Vinícius Silva Belo
- Federal University of São João del-Rei, Campus Centro-Oeste Dona Lindu, Divinópolis, Minas Gerais, Brazil
10
Unsupervised title and abstract screening for systematic review: a retrospective case-study using topic modelling methodology. Syst Rev 2023; 12:1. PMID: 36597132; PMCID: PMC9811792; DOI: 10.1186/s13643-022-02163-4.
Abstract
BACKGROUND The importance of systematic reviews in collating and summarising available research output on a particular topic cannot be over-emphasized. However, initial screening of retrieved literature is significantly time- and labour-intensive. Attempts at automating parts of the systematic review process have been made with varying degrees of success, partly due to being domain-specific, requiring vendor-specific software, or needing manually labelled training data. Our primary objective was to develop statistical methodology for performing automated title and abstract screening for systematic reviews. Secondary objectives included (1) to retrospectively apply the automated screening methodology to previously manually screened systematic reviews and (2) to characterize the performance of the automated screening methodology's scoring algorithm in a simulation study. METHODS We implemented a Latent Dirichlet Allocation-based topic model to derive representative topics from the retrieved documents' titles and abstracts. The second step involves defining a score threshold for classifying the documents as relevant for full-text review or not. The score is derived from a set of search keywords (often the database retrieval search terms). Two systematic review studies were retrospectively used to illustrate the methodology. RESULTS In one case study (helminth dataset), [Formula: see text] sensitivity compared to manual title and abstract screening was achieved, against a false positive rate of [Formula: see text]. For the second case study (Wilson disease dataset), a sensitivity of [Formula: see text] and a specificity of [Formula: see text] were achieved. CONCLUSIONS Unsupervised title and abstract screening has the potential to reduce the workload involved in conducting systematic reviews. While the sensitivity of the methodology on the tested data is low, approximately [Formula: see text] specificity was achieved. Users ought to keep in mind that potentially low sensitivity might occur. One approach to mitigating this might be to incorporate additional targeted search keywords, such as indexing database terms, into the search term corpora. Moreover, automated screening can be used as an additional screener alongside the manual screeners.
Collapse
|
11
|
Uthman OA, Court R, Enderby J, Al-Khudairy L, Nduka C, Mistry H, Melendez-Torres GJ, Taylor-Phillips S, Clarke A. Increasing comprehensiveness and reducing workload in a systematic review of complex interventions using automated machine learning. Health Technol Assess 2022:10.3310/UDIR6682. [PMID: 36562494 PMCID: PMC10068584 DOI: 10.3310/udir6682] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/05/2022] Open
Abstract
BACKGROUND As part of our ongoing systematic review of complex interventions for the primary prevention of cardiovascular diseases, we have developed and evaluated automated machine-learning classifiers for title and abstract screening. The aim was to develop a high-performing algorithm comparable to human screening. METHODS We followed a three-phase process to develop and test an automated machine learning-based classifier for screening potential studies on interventions for primary prevention of cardiovascular disease. We labelled a total of 16,611 articles during the first phase of the project. In the second phase, we used the labelled articles to develop a machine learning-based classifier. After that, we examined the performance of the classifiers in correctly labelling the papers. We evaluated five deep-learning models [i.e. parallel convolutional neural network (CNN), stacked CNN, parallel-stacked CNN, recurrent neural network (RNN) and CNN-RNN]. The models were evaluated using recall, precision and work saved over sampling at no less than 95% recall. RESULTS Of the 16,611 labelled articles, 676 (4.0%) were tagged as 'relevant' and 15,935 (96%) as 'irrelevant'. Recall ranged from 51.9% to 96.6%, precision from 64.6% to 99.1%, and work saved over sampling from 8.9% to 92.1%. The best-performing model was the parallel CNN, yielding 96.4% recall, 99.1% precision, and a potential workload reduction of 89.9%. FUTURE WORK AND LIMITATIONS We used words from the title and the abstract only. More work is needed to examine whether performance changes when additional features, such as the full document text, are added. The approach may also not transfer to other complex systematic reviews on different topics.
CONCLUSION Our study shows that machine learning has the potential to significantly aid the labour-intensive screening of abstracts in systematic reviews of complex interventions. Future research should concentrate on enhancing the classifier system and determining how it can be integrated into the systematic review workflow. FUNDING This project was funded by the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme and will be published in Health Technology Assessment. See the NIHR Journals Library website for further project information.
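The work-saved-over-sampling (WSS) metric used to evaluate the models above can be computed directly from a ranked list of screening labels; the sketch below uses an invented ranking, not the study's data or code.

```python
def wss_at_recall(ranked_labels, target_recall=0.95):
    """Work saved over sampling at a given recall level.

    ranked_labels: relevance labels (1/0) in the order the classifier
    ranks the records, best first.
    """
    n = len(ranked_labels)
    total_relevant = sum(ranked_labels)
    needed = target_recall * total_relevant
    found = 0
    for screened, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= needed:
            # Fraction of records left unscreened, minus the recall shortfall.
            return (n - screened) / n - (1 - target_recall)
    return 0.0

# Hypothetical ranking: 4 relevant records among 20, ranked near the top.
labels = [1, 1, 0, 1, 1] + [0] * 15
saving = wss_at_recall(labels)  # 15 of 20 records left unscreened
```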
Collapse
Affiliation(s)
| | - Rachel Court
- Warwick Medical School, University of Warwick, Coventry, UK
| | - Jodie Enderby
- Warwick Medical School, University of Warwick, Coventry, UK
| | | | - Chidozie Nduka
- Warwick Medical School, University of Warwick, Coventry, UK
| | - Hema Mistry
- Warwick Medical School, University of Warwick, Coventry, UK
| | - G J Melendez-Torres
- Peninsula Technology Assessment Group (PenTAG), College of Medicine and Health, University of Exeter, Exeter, UK
| | | | - Aileen Clarke
- Warwick Medical School, University of Warwick, Coventry, UK
| |
Collapse
|
12
|
Facchinetti T, Benetti G, Giuffrida D, Nocera A. slr-kit: A semi-supervised machine learning framework for systematic literature reviews. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109266] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
13
|
Sutton A, Campbell F. The ScHARR LMIC filter: Adapting a low- and middle-income countries geographic search filter to identify studies on preterm birth prevention and management. Res Synth Methods 2022; 13:447-456. [PMID: 35142432 PMCID: PMC9543249 DOI: 10.1002/jrsm.1552] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2021] [Revised: 01/26/2022] [Accepted: 01/31/2022] [Indexed: 11/11/2022]
Abstract
Search filters are used to find evidence on specific subjects. The performance of filters varies, and they may need adapting to meet the needs of particular research topics. There are few geographic search filters available, and only one pertaining to low- and middle-income countries (LMICs). When searching for literature on preterm birth prevention and management in LMICs for a research project at the School of Health and Related Research (ScHARR), we used the Cochrane Effective Practice and Organisation of Care (EPOC) LMIC geographic search filter for the databases Ovid MEDLINE, Ovid Embase, and the Cochrane Library. During screening following a broad scoping search in Ovid MEDLINE, we found that the EPOC LMIC filter had failed to identify a relevant study. We adapted the LMIC geographic search filter to maximise retrieval and identify the missing study: institution was included as a search field, and the search terms 'high burden' and 'countdown countries' were added. The filter was translated for the databases Ovid Embase, the Cochrane Library, Ovid PsycINFO, and CINAHL via EBSCO. The adapted ScHARR LMIC filter is a non-validated first-generation filter that increases the sensitivity of the EPOC LMIC search filter. Validating the filter would confirm its retrieval performance and benefit information professionals, researchers, and health professionals. We recommend using the ScHARR LMIC filter to improve the sensitivity of the Cochrane EPOC LMIC filter and reduce the risk of missing relevant studies.
Collapse
Affiliation(s)
- Anthea Sutton
- School of Health and Related Research, The University of Sheffield, Sheffield, UK
| | - Fiona Campbell
- School of Health and Related Research, The University of Sheffield, Sheffield, UK
| |
Collapse
|
14
|
A Novel Tool that Allows Interactive Screening of PubMed Citations Showed Promise for the Semi-Automation of Identification of Biomedical Literature. J Clin Epidemiol 2022; 150:63-71. [PMID: 35738306 DOI: 10.1016/j.jclinepi.2022.06.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2022] [Revised: 06/10/2022] [Accepted: 06/13/2022] [Indexed: 11/21/2022]
Abstract
OBJECTIVE Systematic reviews form the basis of evidence-based medicine but are expensive and time-consuming to produce. To address this burden, we have developed a literature identification system (Pythia) that combines the query formulation and citation screening steps. STUDY DESIGN Pythia combines a set of natural-language questions with machine-learning algorithms to rank all PubMed citations by relevance, returning the 100 top-ranked citations for human screening. The tagged citations are iteratively exploited by Pythia to refine the search and re-rank the citations. RESULTS Across seven systematic reviews, the ability of Pythia to identify the relevant citations (sensitivity) ranged from 0.09 to 0.58. The number of abstracts reviewed per relevant abstract (NNR) was lower than with manual screening for four reviews, higher for two, and mixed for one. The reviews with greater overall sensitivity retrieved more relevant citations in early batches, but retrieval was generally unaffected by other aspects, such as study design, study size, and specific key question. CONCLUSIONS Due to its low sensitivity, Pythia is not ready for widespread use. Future research should explore ways to encode domain knowledge in query formulation to better enrich the questions used in the search.
Collapse
|
15
|
Yan H, Rahgozar A, Sethuram C, Karunananthan S, Archibald D, Bradley L, Hakimjavadi R, Helmer-Smith M, Jolin-Dahel K, McCutcheon T, Puncher J, Rezaiefar P, Shoppoff L, Liddy C. Natural Language Processing to Identify Digital Learning Tools in Postgraduate Family Medicine: Protocol for a Scoping Review. JMIR Res Protoc 2022; 11:e34575. [PMID: 35499861 PMCID: PMC9112078 DOI: 10.2196/34575] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/29/2021] [Revised: 01/24/2022] [Accepted: 03/21/2022] [Indexed: 02/06/2023] Open
Abstract
Background The COVID-19 pandemic has highlighted the growing need for digital learning tools in postgraduate family medicine training. Family medicine departments must understand and recognize the use and effectiveness of digital tools in order to integrate them into curricula and develop effective learning tools that fill gaps and meet the learning needs of trainees. Objective This scoping review will aim to explore and organize the breadth of knowledge regarding digital learning tools in family medicine training. Methods This scoping review follows the 6 stages of the methodological framework outlined first by Arksey and O’Malley, then refined by Levac et al, including a search of published academic literature in 6 databases (MEDLINE, ERIC, Education Source, Embase, Scopus, and Web of Science) and gray literature. Following title and abstract and full text screening, characteristics and main findings of the included studies and resources will be tabulated and summarized. Thematic analysis and natural language processing (NLP) will be conducted in parallel using a 9-step approach to identify common themes and synthesize the literature. Additionally, NLP will be employed for bibliometric and scientometric analysis of the identified literature. Results The search strategy has been developed and launched. As of October 2021, we have completed stages 1, 2, and 3 of the scoping review. We identified 132 studies for inclusion through the academic literature search and 127 relevant studies in the gray literature search. Further refinement of the eligibility criteria and data extraction has been ongoing since September 2021. Conclusions In this scoping review, we will identify and consolidate information and evidence related to the use and effectiveness of existing digital learning tools in postgraduate family medicine training. 
Our findings will improve the understanding of the current landscape of digital learning tools, which will be of great value to educators and trainees interested in using existing tools, innovators looking to design digital learning tools that meet current needs, and researchers involved in the study of digital tools. Trial Registration OSF Registries osf.io/wju4k; https://osf.io/wju4k International Registered Report Identifier (IRRID) DERR1-10.2196/34575
Collapse
Affiliation(s)
- Hui Yan
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
- Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Arya Rahgozar
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Claire Sethuram
- Bruyère Research Institute, Ottawa, ON, Canada
- Faculty of Medicine, University of Toronto, Toronto, ON, Canada
| | - Sathya Karunananthan
- Bruyère Research Institute, Ottawa, ON, Canada
- Interdisciplinary School of Health Sciences, University of Ottawa, Ottawa, ON, Canada
| | - Douglas Archibald
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
- Bruyère Research Institute, Ottawa, ON, Canada
| | - Lindsay Bradley
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Ramtin Hakimjavadi
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
- Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Mary Helmer-Smith
- School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada
| | | | | | - Jeffrey Puncher
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Parisa Rezaiefar
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
- Bruyère Research Institute, Ottawa, ON, Canada
| | - Lina Shoppoff
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Clare Liddy
- Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada
- Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Bruyère Research Institute, Ottawa, ON, Canada
| |
Collapse
|
16
|
Natural language processing applied to mental illness detection: a narrative review. NPJ Digit Med 2022; 5:46. [PMID: 35396451 PMCID: PMC8993841 DOI: 10.1038/s41746-022-00589-7] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2021] [Accepted: 02/23/2022] [Indexed: 11/25/2022] Open
Abstract
Mental illness is highly prevalent nowadays, constituting a major cause of distress in people's lives with an impact on society's health and well-being. Mental illness is a complex, multi-factorial disease associated with individual risk factors and a variety of socioeconomic and clinical associations. To capture these complex associations expressed in a wide variety of textual data, including social media posts, interviews, and clinical notes, natural language processing (NLP) methods show promise for empowering proactive mental healthcare and assisting early diagnosis. We provide a narrative review of mental illness detection using NLP in the past decade, to understand methods, trends, challenges and future directions. A total of 399 studies from 10,467 records were included. The review reveals an upward trend in mental illness detection NLP research. Deep learning methods receive more attention and perform better than traditional machine learning methods. We also provide recommendations for future studies, including the development of novel detection methods, deep learning paradigms and interpretable models.
Collapse
|
17
|
Attar-Khorasani S, Chalmeta R. Internet of Things Data Visualization for Business Intelligence. BIG DATA 2022. [PMID: 35133879 DOI: 10.1089/big.2021.0200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
This study contributes to the research on Internet of Things data visualization for business intelligence processes, an area of growing interest to scholars, by conducting a systematic review of the literature. A total of 237 articles published over the past 11 years were obtained and compared. This made it possible to identify the top contributing and most influential authors, countries, publishers, institutions, papers, and research findings, together with the challenges facing current research. Based on these results, this work provides a thorough insight into the field by proposing four research categories (Technology infrastructure, Case examples, Final-user experience, and Big Data tools), together with the development of these research streams over time and their future research directions.
Collapse
Affiliation(s)
- Sima Attar-Khorasani
- Grupo Integración y Re-Ingenieria de sistemas, Departamento de lenguajes y sistemas informáticos, Universitat Jaume I, Castellón, Spain
| | - Ricardo Chalmeta
- Grupo Integración y Re-Ingenieria de sistemas, Departamento de lenguajes y sistemas informáticos, Universitat Jaume I, Castellón, Spain
| |
Collapse
|
18
|
Abdelkader W, Navarro T, Parrish R, Cotoi C, Germini F, Linkins LA, Iorio A, Haynes RB, Ananiadou S, Chu L, Lokker C. A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation. JMIR Res Protoc 2021; 10:e29398. [PMID: 34847061 PMCID: PMC8669577 DOI: 10.2196/29398] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2021] [Revised: 08/24/2021] [Accepted: 09/17/2021] [Indexed: 11/16/2022] Open
Abstract
Background A barrier to practicing evidence-based medicine is the rapidly increasing body of biomedical literature. Use of method terms to limit the search can help reduce the burden of screening articles for clinical relevance; however, such terms are limited by their partial dependence on indexing terms and usually produce low precision, especially when high sensitivity is required. Machine learning has been applied to the identification of high-quality literature with the potential to achieve high precision without sacrificing sensitivity. The use of artificial intelligence has shown promise to improve the efficiency of identifying sound evidence. Objective The primary objective of this research is to derive and validate deep learning models using iterations of Bidirectional Encoder Representations from Transformers (BERT) to retrieve high-quality, high-relevance evidence for clinical consideration from the biomedical literature. Methods Using the HuggingFace Transformers library, we will experiment with variations of BERT models, including BERT, BioBERT, BlueBERT, and PubMedBERT, to determine which have the best performance in article identification based on quality criteria. Our experiments will utilize a large data set of over 150,000 PubMed citations from 2012 to 2020 that have been manually labeled based on their methodological rigor for clinical use. We will evaluate and report on the performance of the classifiers in categorizing articles based on their likelihood of meeting quality criteria. We will report fine-tuning hyperparameters for each model, as well as their performance metrics, including recall (sensitivity), specificity, precision, accuracy, F-score, the number of articles that need to be read before finding one that is positive (meets criteria), and classification probability scores. Results Initial model development is underway, with further development planned for early 2022. Performance testing is expected to start in February 2022.
Results will be published in 2022. Conclusions The experiments will aim to improve the precision of retrieving high-quality articles by applying a machine learning classifier to PubMed searching. International Registered Report Identifier (IRRID) DERR1-10.2196/29398
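Several of the planned performance metrics follow directly from confusion-matrix counts; for instance, the number of articles that need to be read before finding one that meets criteria is the reciprocal of precision. A minimal sketch with invented counts:

```python
def screening_metrics(tp, fp, tn, fn):
    """Recall, specificity, precision, and number-needed-to-read (NNR)
    from confusion-matrix counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    nnr = 1 / precision  # articles read per relevant article found
    return {"recall": recall, "specificity": specificity,
            "precision": precision, "nnr": nnr}

# Hypothetical counts for illustration only.
m = screening_metrics(tp=90, fp=10, tn=880, fn=20)
```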
Collapse
Affiliation(s)
- Wael Abdelkader
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
| | - Tamara Navarro
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
| | - Rick Parrish
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
| | - Chris Cotoi
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
| | - Federico Germini
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada; Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - Lori-Ann Linkins
- Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - Alfonso Iorio
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada; Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - R Brian Haynes
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada; Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - Sophia Ananiadou
- Department of Computer Science, University of Manchester, Manchester, United Kingdom; The Alan Turing Institute, London, United Kingdom
| | - Lingyang Chu
- Department of Computing and Software, Faculty of Engineering, McMaster University, Hamilton, ON, Canada
| | - Cynthia Lokker
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
| |
Collapse
|
19
|
van Altena AJ, Spijker R, Leeflang MMG, Olabarriaga SD. Training sample selection: Impact on screening automation in diagnostic test accuracy reviews. Res Synth Methods 2021; 12:831-841. [PMID: 34390193 PMCID: PMC9292892 DOI: 10.1002/jrsm.1518] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2020] [Revised: 06/12/2021] [Accepted: 07/02/2021] [Indexed: 02/01/2023]
Abstract
When performing a systematic review, researchers screen the articles retrieved after a broad search strategy one by one, which is time‐consuming. Computerised support of this screening process has been applied with varying success. This is partly due to the dependency on large amounts of data to develop models that predict inclusion. In this paper, we present an approach to choose which data to use in model training and compare it with established approaches. We used a dataset of 50 Cochrane diagnostic test accuracy reviews, and each was used as a target review. From the remaining 49 reviews, we selected those that most closely resembled the target review's clinical topic using the cosine similarity metric. Included and excluded studies from these selected reviews were then used to develop our prediction models. The performance of models trained on the selected reviews was compared against models trained on studies from all available reviews. The prediction models performed best with a larger number of reviews in the training set and on target reviews that had a research subject similar to other reviews in the dataset. Our approach using cosine similarity may reduce computational costs for model training and the duration of the screening process.
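The review-selection step described above can be illustrated with a bare-bones cosine similarity over bag-of-words vectors; the review topics below are hypothetical, and the paper's actual feature pipeline is not reproduced here.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two strings treated as bags of words."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b)

# Hypothetical clinical topics of candidate training reviews.
target = "ultrasound diagnostic accuracy for deep vein thrombosis"
candidates = {
    "A": "diagnostic accuracy of ultrasound in pulmonary embolism",
    "B": "cognitive behavioural therapy for insomnia",
}
# Rank candidate reviews by topical similarity to the target review;
# studies from the top-ranked reviews would form the training set.
ranked = sorted(candidates, key=lambda k: cosine(target, candidates[k]),
                reverse=True)
```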
Collapse
Affiliation(s)
- Allard J van Altena
- Department of Epidemiology and Data Science, Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - René Spijker
- Medical Library, Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands; Cochrane Netherlands, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Mariska M G Leeflang
- Department of Epidemiology and Data Science, Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Sílvia Delgado Olabarriaga
- Department of Epidemiology and Data Science, Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| |
Collapse
|
20
|
van Haastrecht M, Sarhan I, Yigit Ozkan B, Brinkhuis M, Spruit M. SYMBALS: A Systematic Review Methodology Blending Active Learning and Snowballing. Front Res Metr Anal 2021; 6:685591. [PMID: 34124534 PMCID: PMC8193570 DOI: 10.3389/frma.2021.685591] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Accepted: 05/12/2021] [Indexed: 11/28/2022] Open
Abstract
Research output has grown significantly in recent years, often making it difficult to see the forest for the trees. Systematic reviews are the natural scientific tool to provide clarity in these situations. However, they are protracted processes that require expertise to execute. These are problematic characteristics in a constantly changing environment. To solve these challenges, we introduce an innovative systematic review methodology: SYMBALS. SYMBALS blends the traditional method of backward snowballing with the machine learning method of active learning. We applied our methodology in a case study, demonstrating its ability to swiftly yield broad research coverage. We proved the validity of our method using a replication study, where SYMBALS was shown to accelerate title and abstract screening by a factor of 6. Additionally, four benchmarking experiments demonstrated the ability of our methodology to outperform the state-of-the-art systematic review methodology FAST2.
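The backward-snowballing half of SYMBALS amounts to pulling the references of included papers into the candidate set after screening; a toy sketch over a hypothetical citation graph (not the authors' code):

```python
def snowball(included, references, already_seen):
    """Backward snowballing: collect references of included papers that
    have not yet been seen, preserving discovery order."""
    new = []
    for paper in included:
        for ref in references.get(paper, []):
            if ref not in already_seen and ref not in new:
                new.append(ref)
    return new

# Hypothetical citation graph: paper id -> ids it cites.
references = {"p1": ["p4", "p5"], "p2": ["p5", "p6"]}
included = ["p1", "p2"]          # papers kept after active-learning screening
seen = {"p1", "p2", "p3", "p4"}  # everything already in the candidate set
extra = snowball(included, references, seen)  # fed back into screening
```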
Collapse
Affiliation(s)
- Max van Haastrecht
- Department of Information and Computing Sciences, Utrecht University, Utrecht, Netherlands
| | - Injy Sarhan
- Department of Information and Computing Sciences, Utrecht University, Utrecht, Netherlands; Department of Computer Engineering, Arab Academy for Science, Technology and Maritime Transport (AASTMT), Alexandria, Egypt
| | - Bilge Yigit Ozkan
- Department of Information and Computing Sciences, Utrecht University, Utrecht, Netherlands
| | - Matthieu Brinkhuis
- Department of Information and Computing Sciences, Utrecht University, Utrecht, Netherlands
| | - Marco Spruit
- Department of Information and Computing Sciences, Utrecht University, Utrecht, Netherlands; Department of Public Health and Primary Care, Leiden University Medical Center (LUMC), Leiden, Netherlands; Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Leiden, Netherlands
| |
Collapse
|
21
|
Zhou S, Kan P, Huang Q, Silbernagel J. A guided latent Dirichlet allocation approach to investigate real-time latent topics of Twitter data during Hurricane Laura. J Inf Sci 2021. [DOI: 10.1177/01655515211007724] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Natural disasters cause significant damage, casualties and economic losses. Twitter has been used to support prompt disaster response and management because people tend to communicate and spread information on public social media platforms during disaster events. To retrieve real-time situational awareness (SA) information from tweets, the most effective way to mine text is natural language processing (NLP). Among advanced NLP models, supervised approaches can classify tweets into different categories to gain insight and leverage useful SA information from social media data. However, high-performing supervised models require domain knowledge to specify categories and involve costly labelling tasks. This research proposes a guided latent Dirichlet allocation (LDA) workflow to investigate temporal latent topics in tweets during a recent disaster event, the 2020 Hurricane Laura. By integrating prior knowledge, a coherence model, LDA topic visualisation and validation against official reports, our guided approach reveals that most tweets contain several latent topics during the 10-day period of Hurricane Laura. This result indicates that state-of-the-art supervised models have not fully utilised tweet information because they assign each tweet only a single label. In contrast, our model can not only identify emerging topics during different disaster events but also provide multilabel references for the classification schema. In addition, our results can help responders, stakeholders and the general public quickly identify and extract SA information so that they can adopt timely response strategies and wisely allocate resources during hurricane events.
Collapse
Affiliation(s)
- Sulong Zhou
- Nelson Institute for Environmental Studies, University of Wisconsin–Madison, USA; Department of Computer Sciences, University of Wisconsin–Madison, USA
| | - Pengyu Kan
- Department of Computer Sciences, University of Wisconsin–Madison, USA
| | - Qunying Huang
- Department of Geography, University of Wisconsin–Madison, USA
| | - Janet Silbernagel
- Nelson Institute for Environmental Studies, University of Wisconsin–Madison, USA; Department of Planning and Landscape Architecture, University of Wisconsin–Madison, USA
| |
Collapse
|
22
|
Chai KEK, Lines RLJ, Gucciardi DF, Ng L. Research Screener: a machine learning tool to semi-automate abstract screening for systematic reviews. Syst Rev 2021; 10:93. [PMID: 33795003 PMCID: PMC8017894 DOI: 10.1186/s13643-021-01635-3] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/04/2020] [Accepted: 03/11/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Systematic reviews and meta-analyses provide the highest level of evidence to help inform policy and practice, yet their rigorous nature imposes significant time and economic demands. The screening of titles and abstracts is the most time-consuming part of the review process, with analysts required to review thousands of articles manually, taking on average 33 days. New technologies aimed at streamlining the screening process have yielded initial promising findings, yet there are limitations with current approaches and barriers to the widespread use of these tools. In this paper, we introduce and report initial evidence on the utility of Research Screener, a semi-automated machine learning tool to facilitate abstract screening. METHODS Three sets of analyses (simulation, interactive and sensitivity) were conducted to provide evidence of the utility of the tool through both simulated and real-world examples. RESULTS Research Screener delivered a workload saving of between 60% and 96% across nine systematic reviews and two scoping reviews. Findings from the real-world interactive analysis demonstrated a time saving of 12.53 days compared to manual screening, which equates to a financial saving of USD 2444. Conservatively, our results suggest that analysts who screen 50% of the total pool of articles identified via a systematic search are highly likely to have identified 100% of eligible papers. CONCLUSIONS In light of these findings, Research Screener can reduce the burden for researchers wishing to conduct a comprehensive systematic review without reducing the scientific rigour they strive to achieve.
Collapse
Affiliation(s)
- Kevin E K Chai
- Curtin Institute for Computation, Curtin University, Perth, Australia
- School of Population Health, Curtin University, Perth, Australia
| | - Robin L J Lines
- School of Allied Health, Curtin University, Perth, Australia
| | | | - Leo Ng
- School of Allied Health, Curtin University, Perth, Australia.
| |
Collapse
|
23
|
van de Schoot R, de Bruin J, Schram R, Zahedi P, de Boer J, Weijdema F, Kramer B, Huijts M, Hoogerwerf M, Ferdinands G, Harkema A, Willemsen J, Ma Y, Fang Q, Hindriks S, Tummers L, Oberski DL. An open source machine learning framework for efficient and transparent systematic reviews. NAT MACH INTELL 2021. [DOI: 10.1038/s42256-020-00287-7] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
To help researchers conduct a systematic review or meta-analysis as efficiently and transparently as possible, we designed a tool to accelerate the step of screening titles and abstracts. For many tasks, including but not limited to systematic reviews and meta-analyses, the scientific literature needs to be checked systematically. Scholars and practitioners currently screen thousands of studies by hand to determine which studies to include in their review or meta-analysis. This is error prone and inefficient because of extremely imbalanced data: only a fraction of the screened studies is relevant. The future of systematic reviewing will be an interaction with machine learning algorithms to deal with the enormous increase in available text. We therefore developed an open source machine learning-aided pipeline applying active learning: ASReview. We demonstrate by means of simulation studies that active learning can yield far more efficient reviewing than manual reviewing while providing high quality. Furthermore, we describe the options of the free and open source research software and present the results from user experience tests. We invite the community to contribute to open source projects such as our own that provide measurable and reproducible improvements over current practice.
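The screening-prioritization loop described above (train on the records labeled so far, rank the unscreened records by predicted relevance, let the reviewer label the top-ranked record, repeat) can be sketched minimally. This is an illustrative pure-Python toy, not ASReview's implementation: the token-overlap "model" stands in for the real feature extractor and classifier.

```python
def active_screening(records, labels, seed_ids, batch_size=1):
    """Toy relevance-prioritized active-learning screening loop.

    records: token sets for each title+abstract; labels: the reviewer's
    (oracle) relevance decisions, looked up only once a record is screened.
    Returns the order in which records end up being screened.
    """
    screened, order = set(seed_ids), list(seed_ids)
    while len(screened) < len(records):
        # "Retrain": pool the tokens of all relevant records screened so far
        relevant = [records[i] for i in screened if labels[i] == 1]
        rel_tokens = set().union(*relevant) if relevant else set()
        # Rank the unscreened records by token overlap with relevant records
        pool = [i for i in range(len(records)) if i not in screened]
        pool.sort(key=lambda i: -len(records[i] & rel_tokens))
        for i in pool[:batch_size]:  # reviewer labels the top-ranked record(s)
            screened.add(i)
            order.append(i)
    return order
```

In a real pipeline the ranking step would be a trained classifier over extracted features; the loop structure is the part this sketch illustrates.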
24
Creating enriched training sets of eligible studies for large systematic reviews: the utility of PubMed's Best Match algorithm. Int J Technol Assess Health Care 2020; 37:e7. PMID: 33336640. DOI: 10.1017/s0266462320002159.
Abstract
INTRODUCTION Solutions like crowd screening and machine learning can assist systematic reviewers with heavy screening burdens but require training sets containing a mix of eligible and ineligible studies. This study explores using PubMed's Best Match algorithm to create small training sets containing at least five relevant studies. METHODS Six systematic reviews were examined retrospectively. MEDLINE searches were converted and run in PubMed. The ranking of included studies was studied under both Best Match and Most Recent sort conditions. RESULTS Retrieval sizes for the systematic reviews ranged from 151 to 5,406 records and the numbers of relevant records ranged from 8 to 763. The median ranking of relevant records was higher in Best Match for all six reviews, when compared with Most Recent sort. Best Match placed a total of thirty relevant records in the first fifty, at least one for each systematic review. Most Recent sorting placed only ten relevant records in the first fifty. Best Match sorting outperformed Most Recent in all cases and placed five or more relevant records in the first fifty in three of six cases. DISCUSSION Using a predetermined set size such as fifty may not provide enough true positives for an effective systematic review training set. However, screening PubMed records ranked by Best Match and continuing until the desired number of true positives is identified is efficient and effective. CONCLUSIONS The Best Match sort in PubMed improves the ranking and increases the proportion of relevant records in the first fifty records relative to sorting by recency.
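The recommended procedure (screen Best Match-ranked records until enough true positives accumulate, rather than fixing the set size at fifty) amounts to a short loop. The sketch below is illustrative, with a hypothetical `is_relevant` screening callback standing in for the human reviewer's judgment; it is not code from the study.

```python
def build_training_set(ranked_ids, is_relevant, min_relevant=5):
    """Screen records in Best Match order, stopping once the training set
    contains at least min_relevant true positives."""
    training, hits = [], 0
    for record_id in ranked_ids:
        training.append(record_id)  # every screened record joins the set
        if is_relevant(record_id):
            hits += 1
        if hits >= min_relevant:
            break
    return training
```

The resulting set naturally mixes eligible and ineligible studies, which is what downstream crowd-screening or machine learning tools need.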
25
26
Carvallo A, Parra D, Lobel H, Soto A. Automatic document screening of medical literature using word and text embeddings in an active learning setting. Scientometrics 2020. DOI: 10.1007/s11192-020-03648-6.
27
Callaghan MW, Müller-Hansen F. Statistical stopping criteria for automated screening in systematic reviews. Syst Rev 2020; 9:273. PMID: 33248464. PMCID: PMC7700715. DOI: 10.1186/s13643-020-01521-4.
Abstract
Active learning for systematic review screening promises to reduce the human effort required to identify relevant documents for a systematic review. Machines and humans work together, with humans providing training data and the machine optimising the documents that the humans screen. This enables the identification of all relevant documents after viewing only a fraction of the total documents. However, current approaches lack robust stopping criteria, so reviewers do not know when they have seen all, or a given proportion of, the relevant documents; this makes such systems hard to implement in live reviews. This paper introduces a workflow with flexible statistical stopping criteria, which offer real work reductions by rejecting, at a given level of confidence, the hypothesis that a given recall target has been missed. On test datasets, the stopping criteria achieve a reliable level of recall while still providing work reductions averaging 17%. Other previously proposed methods are shown to provide inconsistent recall and work reductions across datasets.
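The core of such a stopping rule can be sketched as a one-sided hypergeometric test. This is an illustrative simplification of the paper's method, assuming the remaining documents are screened in random order; `K_target` (supplied by the caller) is the smallest number of still-hidden relevant documents that would be consistent with missing the recall target.

```python
from math import comb

def hypergeom_cdf(k, N, K, n):
    """P(X <= k) when drawing n documents without replacement from a pool
    of N documents that contains K relevant ones."""
    return sum(
        comb(K, x) * comb(N - K, n - x)
        for x in range(k + 1)
        if x <= n and n - x <= N - K
    ) / comb(N, n)

def safe_to_stop(N_pool, n_screened, k_found, K_target, alpha=0.05):
    """Reject H0 'at least K_target relevant documents remain in the pool'
    when finding only k_found of them in n_screened random draws is
    implausibly unlucky under H0."""
    return hypergeom_cdf(k_found, N_pool, K_target, n_screened) < alpha
```

For example, with 400 documents left after prioritised screening, at least 3 of which would have to be relevant for the recall target to be missed, and no relevant documents found so far in the random phase, the test does not allow stopping after 200 random draws (p ≈ 0.12) but does after 300 (p ≈ 0.015).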
Affiliation(s)
- Max W Callaghan: Mercator Research Institute on Global Commons and Climate Change, EUREF Campus 19, Torgauer Straße 12-15, 10829 Berlin, Germany; Priestley International Centre for Climate, University of Leeds, Leeds LS2 9JT, UK; Potsdam Institute for Climate Impact Research (PIK), Member of the Leibniz Association, P.O. Box 60 12 03, 14412 Potsdam, Germany
- Finn Müller-Hansen: Mercator Research Institute on Global Commons and Climate Change, EUREF Campus 19, Torgauer Straße 12-15, 10829 Berlin, Germany; Potsdam Institute for Climate Impact Research (PIK), Member of the Leibniz Association, P.O. Box 60 12 03, 14412 Potsdam, Germany
28
Alharbi A, Stevenson M. Refining Boolean queries to identify relevant studies for systematic review updates. J Am Med Inform Assoc 2020; 27:1658-1666. PMID: 33067630. PMCID: PMC7750994. DOI: 10.1093/jamia/ocaa148.
Abstract
OBJECTIVE Systematic reviews are important in health care but are expensive to produce and maintain. The authors explore the use of automated transformations of Boolean queries to improve the identification of relevant studies for updates to systematic reviews. MATERIALS AND METHODS A set of query transformations, including operator substitution, query expansion, and query reduction, were used to iteratively modify the Boolean query used for the original systematic review. The most effective transformation at each stage is identified using information about the studies included and excluded from the original review. A dataset consisting of 22 systematic reviews was used for evaluation. Updated queries were evaluated using the included and excluded studies from the updated version of the review. Recall and precision were used as evaluation measures. RESULTS The updated queries were more effective than the ones used for the original review, in terms of both precision and recall. The overall number of documents retrieved was reduced by more than half, while the number of relevant documents found increased by 10.3%. CONCLUSIONS Identification of relevant studies for updates to systematic reviews can be carried out more effectively by using information about the included and excluded studies from the original review to produce improved Boolean queries. These updated queries reduce the overall number of documents retrieved while also increasing the number of relevant documents identified, thereby representing a considerable reduction in effort required by systematic reviewers.
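The iterative selection of the most effective transformation at each stage is essentially greedy hill-climbing over query variants. The sketch below is illustrative: `transforms` and the scoring function are hypothetical stand-ins (the paper scores candidate queries against the included and excluded studies of the original review).

```python
def refine_query(query, transforms, score):
    """Greedy refinement: repeatedly apply whichever transformation
    (operator substitution, expansion, reduction, ...) most improves the
    score, and stop once no transformation helps."""
    best, best_score = query, score(query)
    improved = True
    while improved:
        improved = False
        for transform in transforms:
            candidate = transform(best)
            if score(candidate) > best_score:
                best, best_score, improved = candidate, score(candidate), True
    return best
```

With a recall- and precision-based score, this loop reproduces the paper's overall shape: each accepted transformation must strictly improve retrieval against the original review's labels.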
Affiliation(s)
- Amal Alharbi: Computer Science Department, University of Sheffield, Sheffield, United Kingdom
- Mark Stevenson: Computer Science Department, University of Sheffield, Sheffield, United Kingdom
29
Deng Z, Yin K, Bao Y, Armengol VD, Wang C, Tiwari A, Barzilay R, Parmigiani G, Braun D, Hughes KS. Validation of a Semiautomated Natural Language Processing-Based Procedure for Meta-Analysis of Cancer Susceptibility Gene Penetrance. JCO Clin Cancer Inform 2020; 3:1-9. PMID: 31419182. DOI: 10.1200/cci.19.00043.
Abstract
PURPOSE Quantifying the risk of cancer associated with pathogenic mutations in germline cancer susceptibility genes (that is, penetrance) enables the personalization of preventive management strategies. Conducting a meta-analysis is the best way to obtain robust risk estimates. We have previously developed a natural language processing (NLP)-based abstract classifier which classifies abstracts as relevant to penetrance, prevalence of mutations, both, or neither. In this work, we evaluate the performance of this NLP-based procedure. MATERIALS AND METHODS We compared the semiautomated NLP-based procedure, which involves automated abstract classification and text mining followed by human review of identified studies, with the traditional procedure that requires human review of all studies. Ten high-quality gene-cancer penetrance meta-analyses spanning 16 gene-cancer associations were used as the gold standard by which to evaluate the performance of our procedure. For each meta-analysis, we evaluated the number of abstracts that required human review (workload) and the ability to identify the studies that were included by the authors in their quantitative analysis (coverage). RESULTS Compared with the traditional procedure, the semiautomated NLP-based procedure led to a lower workload across all 10 meta-analyses, with an overall 84% reduction (2,774 abstracts v 16,941 abstracts) in the amount of human review required. Overall coverage was 93% (132 of 142 studies identified) before reviewing references of identified studies. Reasons for the 10 missed studies included blank and poorly written abstracts. After reviewing references, nine of the previously missed studies were identified and coverage improved to 99% (141 of 142 studies). CONCLUSION We demonstrated that an NLP-based procedure can significantly reduce the review workload without compromising the ability to identify relevant studies. NLP algorithms have promising potential for reducing human effort in the literature review process.
Affiliation(s)
- Kanhua Yin: Massachusetts General Hospital, Boston, MA
- Yujia Bao: Massachusetts Institute of Technology, Boston, MA
- Cathy Wang: Harvard TH Chan School of Public Health, Boston, MA; Dana-Farber Cancer Institute, Boston, MA
- Giovanni Parmigiani: Harvard TH Chan School of Public Health, Boston, MA; Dana-Farber Cancer Institute, Boston, MA
- Danielle Braun: Harvard TH Chan School of Public Health, Boston, MA; Dana-Farber Cancer Institute, Boston, MA
- Kevin S Hughes: Massachusetts General Hospital, Boston, MA; Harvard Medical School, Boston, MA
30
Bao Y, Deng Z, Wang Y, Kim H, Armengol VD, Acevedo F, Ouardaoui N, Wang C, Parmigiani G, Barzilay R, Braun D, Hughes KS. Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes. JCO Clin Cancer Inform 2020; 3:1-9. PMID: 31545655. DOI: 10.1200/cci.19.00042.
Abstract
PURPOSE The medical literature relevant to germline genetics is growing exponentially. Clinicians need tools that help to monitor and prioritize the literature to understand the clinical implications of pathogenic genetic variants. We developed and evaluated two machine learning models to classify abstracts as relevant to penetrance (the risk of cancer for germline mutation carriers) or prevalence of germline genetic mutations. MATERIALS AND METHODS We conducted literature searches in PubMed and retrieved paper titles and abstracts to create an annotated data set for training and evaluating the two machine learning classification models. Our first model is a support vector machine (SVM) which learns a linear decision rule on the basis of the bag-of-ngrams representation of each title and abstract. Our second model is a convolutional neural network (CNN) which learns a complex nonlinear decision rule on the basis of the raw title and abstract. We evaluated the performance of the two models on the classification of papers as relevant to penetrance or prevalence. RESULTS For penetrance classification, we annotated 3,740 paper titles and abstracts and evaluated the two models using 10-fold cross-validation. The SVM model achieved 88.93% accuracy (the percentage of papers correctly classified), whereas the CNN model achieved 88.53% accuracy. For prevalence classification, we annotated 3,753 paper titles and abstracts. The SVM model achieved 88.92% accuracy and the CNN model achieved 88.52% accuracy. CONCLUSION Our models achieve high accuracy in classifying abstracts as relevant to penetrance or prevalence. By facilitating literature review, this tool could help clinicians and researchers keep abreast of the burgeoning knowledge of gene-cancer associations and keep the knowledge bases for clinical decision support tools up to date.
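The bag-of-ngrams representation the SVM learns from can be illustrated in a few lines. This is a simplified sketch (whitespace tokenisation and raw counts); the paper does not specify its exact preprocessing, so treat the details as assumptions.

```python
from collections import Counter

def bag_of_ngrams(text, n_max=2):
    """Count every word n-gram of length 1..n_max in a title or abstract."""
    tokens = text.lower().split()
    features = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i:i + n])] += 1
    return features
```

A linear SVM then learns one weight per n-gram feature, which is what makes the resulting decision rule easy to inspect.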
Affiliation(s)
- Yujia Bao: Massachusetts Institute of Technology, Boston, MA
- Yan Wang: Massachusetts General Hospital, Boston, MA
- Heeyoon Kim: Massachusetts Institute of Technology, Boston, MA
- Cathy Wang: Harvard T.H. Chan School of Public Health, Boston, MA; Dana-Farber Cancer Institute, Boston, MA
- Giovanni Parmigiani: Harvard T.H. Chan School of Public Health, Boston, MA; Dana-Farber Cancer Institute, Boston, MA
- Danielle Braun: Harvard T.H. Chan School of Public Health, Boston, MA; Dana-Farber Cancer Institute, Boston, MA
- Kevin S Hughes: Massachusetts General Hospital, Boston, MA; Harvard Medical School, Boston, MA
31
Cho I, Lee M, Kim Y. What are the main patient safety concerns of healthcare stakeholders: a mixed-method study of Web-based text. Int J Med Inform 2020; 140:104162. PMID: 32416430. PMCID: PMC7198194. DOI: 10.1016/j.ijmedinf.2020.104162.
Abstract
Because safety is central to quality care, national patient safety policy should be created using a bottom-up approach involving various healthcare stakeholders. To explore the latent concerns of consumers, providers, government bodies, and researchers, analysis of patient safety text data collected from websites proved useful for summarizing the various aspects of concern. A common concern among stakeholders around the 2015 legislation of the Patient Safety Act in Korea was hospital infection control, ranging from nosocomial infections to those brought in by people visiting patients. Researchers focused on hospital sociocultural factors at both the organizational and clinician levels. Government policies and systemic approaches to patient safety were highlighted by different stakeholders. Five topics, including infection control, showed statistically significant increasing trends over time, while another five showed decreasing trends.
Objectives Various healthcare stakeholders define quality of care in different ways. Public policy should address all of these concerns. This study was conducted to identify the main themes on patient safety of stakeholders expressed before and after the Patient Safety Act was enacted in Korea in 2015. Design Longitudinal observational study of the interests of healthcare stakeholders generated between January 2014 and September 2018. Materials and methods Text data were collected from 2,487 documents on 18 websites that were identified as representative healthcare stakeholder groups of consumers, providers, government, and researchers. A Korean natural language processing (NLP) package, manual review, and a synonym dictionary were used for data preprocessing, and we adopted the unsupervised NLP method of probabilistic topic modeling and latent Dirichlet allocation. A linear trend analysis over time, a qualitative step involving two external experts, and original text reviews were performed to validate the identified topics. Results Forty-one topics were identified, and the most common concerns of stakeholders were institutional infection control, as triggered by the Middle East respiratory syndrome outbreak in early 2015, and infusion-related infection from late 2017 until the middle of 2018. The other top-three concerns of the stakeholder groups were highly similar, while research topics were limited to the perceptions of providers and the activities and culture of hospitals. Five topics showed statistically significant increasing trends over time, while another five showed decreasing trends (both P < 0.05). In the qualitative step, we confirmed 35 themes and revised the other 6. Conclusions A common concern among stakeholders was hospital infection control, ranging from nosocomial infections to those brought in by family members visiting patients. Government policies and systemic approaches to patient safety were highlighted by different stakeholders. Researchers focused on hospital sociocultural factors at both the organizational and clinician levels. All of these identified concerns should be addressed by public health policy.
Affiliation(s)
- Insook Cho: Department of Nursing, Inha University, Incheon, South Korea
- Minyoung Lee: Department of Nursing, Inha University, Incheon, South Korea; Graduate School, Inha University, Incheon, South Korea
- Yeonjin Kim: Graduate School, Inha University, Incheon, South Korea
32
Howard BE, Phillips J, Tandon A, Maharana A, Elmore R, Mav D, Sedykh A, Thayer K, Merrick BA, Walker V, Rooney A, Shah RR. SWIFT-Active Screener: Accelerated document screening through active learning and integrated recall estimation. Environ Int 2020; 138:105623. PMID: 32203803. PMCID: PMC8082972. DOI: 10.1016/j.envint.2020.105623.
Abstract
BACKGROUND In the screening phase of systematic review, researchers use detailed inclusion/exclusion criteria to decide whether each article in a set of candidate articles is relevant to the research question under consideration. A typical review may require screening thousands or tens of thousands of articles and can consume hundreds of person-hours of labor. METHODS Here we introduce SWIFT-Active Screener, a web-based, collaborative systematic review software application, designed to reduce the overall screening burden required during this resource-intensive phase of the review process. To prioritize articles for review, SWIFT-Active Screener uses active learning, a type of machine learning that incorporates user feedback during screening. Meanwhile, a negative binomial model is employed to estimate the number of relevant articles remaining in the unscreened document list. Using a simulation involving 26 diverse systematic review datasets that were previously screened by reviewers, we evaluated both the document prioritization and recall estimation methods. RESULTS On average, 95% of the relevant articles were identified after screening only 40% of the total reference list. In the 5 document sets with 5,000 or more references, 95% recall was achieved after screening only 34% of the available references, on average. Furthermore, the recall estimator we have proposed provides a useful, conservative estimate of the percentage of relevant documents identified during the screening process. CONCLUSION SWIFT-Active Screener can result in significant time savings compared to traditional screening, and the savings increase with project size. Moreover, the integration of explicit recall estimation during screening solves an important challenge faced by all machine learning systems for document screening: when to stop screening a prioritized reference list. The software is currently available in the form of a multi-user, collaborative, online web application.
Affiliation(s)
- Deepak Mav: Sciome LLC, 2 Davis Drive, Durham, NC 27709, USA
- Alex Sedykh: Sciome LLC, 2 Davis Drive, Durham, NC 27709, USA
- Kristina Thayer: Integrated Risk Information System (IRIS) Division, Environmental Protection Agency, 109 T.W. Alexander Drive, RTP, NC 27709, USA
- B Alex Merrick: National Toxicology Program (NTP)/National Institute of Environmental Health Sciences (NIEHS), 111 T.W. Alexander Drive, RTP, NC 27709, USA
- Vickie Walker: National Toxicology Program (NTP)/National Institute of Environmental Health Sciences (NIEHS), 111 T.W. Alexander Drive, RTP, NC 27709, USA
- Andrew Rooney: National Toxicology Program (NTP)/National Institute of Environmental Health Sciences (NIEHS), 111 T.W. Alexander Drive, RTP, NC 27709, USA
33
Lee EW, Wallace BC, Galaviz KI, Ho JC. MMiDaS-AE: Multi-modal Missing Data aware Stacked Autoencoder for Biomedical Abstract Screening. Proceedings of the ACM Conference on Health, Inference, and Learning 2020; 2020:139-150. PMID: 34308444. PMCID: PMC8297409. DOI: 10.1145/3368555.3384463.
Abstract
Systematic review (SR) is an essential process to identify, evaluate, and summarize the findings of all relevant individual studies concerning health-related questions. However, conducting a SR is labor-intensive, as identifying relevant studies is a daunting process that entails multiple researchers screening thousands of articles for relevance. In this paper, we propose MMiDaS-AE, a Multi-modal Missing Data aware Stacked Autoencoder, for semi-automating screening for SRs. We use a multi-modal view that exploits three representations, of: 1) documents, 2) topics, and 3) citation networks. Documents that contain similar words will be nearby in the document embedding space. Models can also exploit the relationship between documents and the associated SR MeSH terms to capture article relevancy. Finally, related works will likely share the same citations, and thus closely related articles should, intuitively, lie close to each other in the embedding space. However, using all three learned representations as features directly results in an unwieldy number of parameters. Thus, motivated by recent work on multi-modal auto-encoders, we adopt a multi-modal stacked autoencoder that can learn a shared representation encoding all three representations in a compressed space. However, in practice one or more of these modalities may be missing for an article (e.g., if we cannot recover citation information). Therefore, we propose to learn to impute the shared representation even when specific inputs are missing. We find this new model significantly improves performance on a dataset consisting of 15 SRs compared to existing approaches.
34
How Many Papers Should Scientists Be Reviewing? An Analysis Using Verified Peer Review Reports. Publications 2020. DOI: 10.3390/publications8010004.
Abstract
The current peer review system is under stress from ever-increasing numbers of publications, the proliferation of open-access journals and an apparent difficulty in obtaining high-quality reviews in due time. At its core, this issue may be caused by scientists insufficiently prioritising reviewing. Perhaps this low prioritisation is due to a lack of understanding on how many reviews need to be conducted by researchers to balance the peer review process. I obtained verified peer review data from 142 journals across 12 research fields, for a total of over 300,000 reviews and over 100,000 publications, to determine an estimate of the numbers of reviews required per publication per field. I then used this value in relation to the mean numbers of authors per publication per field to highlight a ‘review ratio’: the expected minimum number of publications an author in their field should review to balance their input (publications) into the peer review process. On average, 3.49 ± 1.45 (SD) reviews were required for each scientific publication, and the estimated review ratio across all fields was 0.74 ± 0.46 (SD) reviews per paper published per author. Since these are conservative estimates, I recommend scientists aim to conduct at least one review per publication they produce. This should ensure that the peer review system continues to function as intended.
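The review ratio is simple arithmetic: reviews required per publication divided by mean authors per publication. A quick check with the reported all-fields mean of 3.49 reviews per paper (the 4.7 authors-per-paper figure below is an illustrative assumption consistent with the reported ratio; the paper works with per-field values):

```python
def review_ratio(reviews_per_paper, authors_per_paper):
    """Minimum reviews each author should contribute per paper they publish
    to balance their input into the peer review system."""
    return reviews_per_paper / authors_per_paper

# ~3.49 reviews needed per paper; with ~4.7 authors per paper the balance
# point is roughly 0.74 reviews per author per publication.
ratio = review_ratio(3.49, 4.7)
```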
35
Lanera C, Berchialla P, Sharma A, Minto C, Gregori D, Baldi I. Screening PubMed abstracts: is class imbalance always a challenge to machine learning? Syst Rev 2019; 8:317. PMID: 31810495. PMCID: PMC6896747. DOI: 10.1186/s13643-019-1245-8.
Abstract
BACKGROUND The growing volume of medical literature and textual data in online repositories has led to an exponential increase in the workload of researchers involved in citation screening for systematic reviews. This work aims to combine machine learning techniques and data preprocessing for class imbalance to identify the outperforming strategy to screen articles in PubMed for inclusion in systematic reviews. METHODS We trained four binary text classifiers (support vector machines, k-nearest neighbor, random forest, and elastic-net regularized generalized linear models) in combination with four techniques for class imbalance: random undersampling and oversampling with 50:50 and 35:65 positive to negative class ratios, and none as a benchmark. We used textual data of 14 systematic reviews as case studies. The difference between cross-validated area under the receiver operating characteristic curve (AUC-ROC) for machine learning techniques with and without preprocessing (delta AUC) was estimated within each systematic review, separately for each classifier. Meta-analytic fixed-effect models were used to pool delta AUCs separately by classifier and strategy. RESULTS Cross-validated AUC-ROC for machine learning techniques (excluding k-nearest neighbor) without preprocessing was prevalently above 90%. Except for k-nearest neighbor, machine learning techniques achieved the best improvement in conjunction with random oversampling 50:50 and random undersampling 35:65. CONCLUSIONS Resampling techniques slightly improved the performance of the investigated machine learning techniques. From a computational perspective, random undersampling 35:65 may be preferred.
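Random undersampling to a 35:65 positive-to-negative ratio, the computationally cheapest of the strategies compared above, can be sketched as follows. This is an illustrative pure-Python version; the study used established resampling implementations rather than this function.

```python
import random

def random_undersample(X, y, pos_ratio=0.35, seed=0):
    """Drop randomly chosen majority-class (y == 0) records until positives
    make up pos_ratio of the training set (0.35 gives a 35:65 balance)."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    # number of negatives needed so that pos / (pos + neg) == pos_ratio
    n_neg = round(len(pos) * (1 - pos_ratio) / pos_ratio)
    keep = pos + rng.sample(neg, min(n_neg, len(neg)))
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]
```

All positives are kept and only majority-class records are discarded, which is why this strategy also shrinks training time.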
Affiliation(s)
- Corrado Lanera: Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac Thoracic Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35131 Padova, Italy
- Paola Berchialla: Department of Clinical and Biological Sciences, University of Torino, Torino, Italy
- Abhinav Sharma: Department of Biological Sciences and Bioengineering, Indian Institute of Technology Kanpur, Kanpur, India
- Clara Minto: Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac Thoracic Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35131 Padova, Italy
- Dario Gregori: Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac Thoracic Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35131 Padova, Italy
- Ileana Baldi: Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac Thoracic Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35131 Padova, Italy
36
Brockmeier AJ, Ju M, Przybyła P, Ananiadou S. Improving reference prioritisation with PICO recognition. BMC Med Inform Decis Mak 2019; 19:256. PMID: 31805934. PMCID: PMC6896258. DOI: 10.1186/s12911-019-0992-8.
Abstract
BACKGROUND Machine learning can assist with multiple tasks during systematic reviews to facilitate the rapid retrieval of relevant references during screening and to identify and extract information relevant to the study characteristics, which include the PICO elements of patient/population, intervention, comparator, and outcomes. The latter requires techniques for identifying and categorising fragments of text, known as named entity recognition. METHODS A publicly available corpus of PICO annotations on biomedical abstracts is used to train a named entity recognition model, which is implemented as a recurrent neural network. This model is then applied to a separate collection of abstracts for references from systematic reviews within biomedical and health domains. The occurrences of words tagged in the context of specific PICO contexts are used as additional features for a relevancy classification model. Simulations of the machine learning-assisted screening are used to evaluate the work saved by the relevancy model with and without the PICO features. Chi-squared tests and the statistical significance of positive predictive values are used to identify words that are more indicative of relevancy within PICO contexts. RESULTS Inclusion of PICO features improves the performance metric on 15 of the 20 collections, with substantial gains on certain systematic reviews. Examples of words that are more precise within their PICO context help explain this increase. CONCLUSIONS Words within PICO-tagged segments in abstracts are predictive features for determining inclusion. Combining the PICO annotation model with the relevancy classification pipeline is a promising approach. The annotations may also be useful on their own to aid users in pinpointing necessary information for data extraction, or to facilitate semantic search.
Affiliation(s)
- Austin J. Brockmeier: National Centre of Text Mining, School of Computer Science, University of Manchester, Princess Street, Manchester M1 7DN, UK; University of Delaware, 139 The Green, Newark, Delaware 19716, USA
- Meizhi Ju: National Centre of Text Mining, School of Computer Science, University of Manchester, Princess Street, Manchester M1 7DN, UK
- Piotr Przybyła: National Centre of Text Mining, School of Computer Science, University of Manchester, Princess Street, Manchester M1 7DN, UK; Linguistic Engineering Group, Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warszawa, Poland
- Sophia Ananiadou: National Centre of Text Mining, School of Computer Science, University of Manchester, Princess Street, Manchester M1 7DN, UK; The Alan Turing Institute, 96 Euston Road, London NW1 2DB, UK
|
37
|
Hollands GJ, Carter P, Anwer S, King SE, Jebb SA, Ogilvie D, Shemilt I, Higgins JPT, Marteau TM. Altering the availability or proximity of food, alcohol, and tobacco products to change their selection and consumption. Cochrane Database Syst Rev 2019; 9:CD012573. [PMID: 31482606 PMCID: PMC6953356 DOI: 10.1002/14651858.cd012573.pub3] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
BACKGROUND Overconsumption of food, alcohol, and tobacco products increases the risk of non-communicable diseases. Interventions to change characteristics of physical micro-environments where people may select or consume these products - including shops, restaurants, workplaces, and schools - are of considerable public health policy and research interest. This review addresses two types of intervention within such environments: altering the availability (the range and/or amount of options) of these products, or their proximity (the distance at which they are positioned) to potential consumers. OBJECTIVES 1. To assess the impact on selection and consumption of altering the availability or proximity of (a) food (including non-alcoholic beverages), (b) alcohol, and (c) tobacco products. 2. To assess the extent to which the impact of these interventions is modified by characteristics of: i. studies, ii. interventions, and iii. participants. SEARCH METHODS We searched CENTRAL, MEDLINE, Embase, PsycINFO, and seven other published or grey literature databases, as well as trial registries and key websites, up to 23 July 2018, followed by citation searches. SELECTION CRITERIA We included randomised controlled trials with between-participants (parallel group) or within-participants (cross-over) designs. Eligible studies compared effects of exposure to at least two different levels of availability of a product or its proximity, and included a measure of selection or consumption of the manipulated product. DATA COLLECTION AND ANALYSIS We used a novel semi-automated screening workflow and applied standard Cochrane methods to select eligible studies, collect data, and assess risk of bias.
In separate analyses for availability interventions and proximity interventions, we combined results using random-effects meta-analysis and meta-regression models to estimate summary effect sizes (as standardised mean differences (SMDs)) and to investigate associations between summary effect sizes and selected study, intervention, or participant characteristics. We rated the certainty of evidence for each outcome using GRADE. MAIN RESULTS We included 24 studies, with the majority (20/24) giving concerns about risk of bias. All of the included studies investigated food products; none investigated alcohol or tobacco. The majority were conducted in laboratory settings (14/24), with adult participants (17/24), and used between-participants designs (19/24). All studies were conducted in high-income countries, predominantly in the USA (14/24).Six studies investigated availability interventions, of which two changed the absolute number of different options available, and four altered the relative proportion of less-healthy (to healthier) options. Most studies (4/6) manipulated snack foods or drinks. For selection outcomes, meta-analysis of three comparisons from three studies (n = 154) found that exposure to fewer options resulted in a large reduction in selection of the targeted food(s): SMD -1.13 (95% confidence interval (CI) -1.90 to -0.37) (low certainty evidence). For consumption outcomes, meta-analysis of three comparisons from two studies (n = 150) found that exposure to fewer options resulted in a moderate reduction in consumption of those foods, but with considerable uncertainty: SMD -0.55 (95% CI -1.27 to 0.18) (low certainty evidence).Eighteen studies investigated proximity interventions. Most (14/18) changed the distance at which a snack food or drink was placed from the participants, whilst four studies changed the order of meal components encountered along a line. 
For selection outcomes, only one study with one comparison (n = 41) was identified, which found that food placed farther away resulted in a moderate reduction in its selection: SMD -0.65 (95% CI -1.29 to -0.01) (very low certainty evidence). For consumption outcomes, meta-analysis of 15 comparisons from 12 studies (n = 1098) found that exposure to food placed farther away resulted in a moderate reduction in its consumption: SMD -0.60 (95% CI -0.84 to -0.36) (low certainty evidence). Meta-regression analyses indicated that this effect was greater: the farther away the product was placed; when only the targeted product(s) was available; when participants were of low deprivation status; and when the study was at high risk of bias. AUTHORS' CONCLUSIONS The current evidence suggests that changing the number of available food options or altering the positioning of foods could contribute to meaningful changes in behaviour, justifying policy actions to promote such changes within food environments. However, the certainty of this evidence as assessed by GRADE is low or very low. To enable more certain and generalisable conclusions about these potentially important effects, further research is warranted in real-world settings, intervening across a wider range of foods - as well as alcohol and tobacco products - and over sustained time periods.
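The pooled effects quoted above are random-effects summaries of standardised mean differences. A minimal DerSimonian-Laird computation shows the mechanics; the three effect sizes and standard errors below are invented for illustration, not taken from the review.

```python
# Sketch: DerSimonian-Laird random-effects pooling of standardised mean
# differences (SMDs). Effect sizes are made up, not the review's data.
import math

smd = [-0.9, -1.4, -1.1]          # per-study SMDs
se = [0.30, 0.35, 0.40]           # their standard errors

w_fixed = [1 / s**2 for s in se]  # inverse-variance (fixed-effect) weights
mean_fe = sum(w * d for w, d in zip(w_fixed, smd)) / sum(w_fixed)

# Cochran's Q and the DerSimonian-Laird estimate of between-study variance
q = sum(w * (d - mean_fe)**2 for w, d in zip(w_fixed, smd))
c = sum(w_fixed) - sum(w**2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (q - (len(smd) - 1)) / c)

# Random-effects weights add tau^2 to each study's variance
w_re = [1 / (s**2 + tau2) for s in se]
mean_re = sum(w * d for w, d in zip(w_re, smd)) / sum(w_re)
se_re = math.sqrt(1 / sum(w_re))
ci = (mean_re - 1.96 * se_re, mean_re + 1.96 * se_re)
print(f"SMD {mean_re:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```

When the studies are homogeneous, tau-squared collapses to zero and the random-effects summary coincides with the fixed-effect one; heterogeneity widens the interval, which is one reason the review's pooled CIs are wide.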
Affiliation(s)
- Gareth J Hollands
- University of Cambridge, Behaviour and Health Research Unit, Forvie Site, Robinson Way, Cambridge, UK, CB2 0SR
- Patrice Carter
- University College London, Centre for Outcomes Research and Effectiveness, 1-19 Torrington Place, London, UK, WC1E 7HB
- Sumayya Anwer
- University of Bristol, Population Health Sciences, Bristol Medical School, Canynge Hall, 39 Whatley Road, Bristol, UK, BS8 2PS
- Sarah E King
- University of Cambridge, Behaviour and Health Research Unit, Forvie Site, Robinson Way, Cambridge, UK, CB2 0SR
- Susan A Jebb
- University of Oxford, Nuffield Department of Primary Care Health Sciences, Radcliffe Observatory Quarter, Woodstock Road, Oxford, Oxfordshire, UK, OX2 6GG
- David Ogilvie
- University of Cambridge, MRC Epidemiology Unit, Box 285, Cambridge Biomedical Campus, Cambridge, UK, CB2 0QQ
- Ian Shemilt
- University College London, EPPI-Centre, 10 Woburn Square, London, UK, WC1H 0NR
- Julian P T Higgins
- University of Bristol, Population Health Sciences, Bristol Medical School, Canynge Hall, 39 Whatley Road, Bristol, UK, BS8 2PS
- Theresa M Marteau
- University of Cambridge, Behaviour and Health Research Unit, Forvie Site, Robinson Way, Cambridge, UK, CB2 0SR
|
38
|
Hollands GJ, Carter P, Anwer S, King SE, Jebb SA, Ogilvie D, Shemilt I, Higgins JPT, Marteau TM. Altering the availability or proximity of food, alcohol, and tobacco products to change their selection and consumption. Cochrane Database Syst Rev 2019; 8:CD012573. [PMID: 31452193 PMCID: PMC6710643 DOI: 10.1002/14651858.cd012573.pub2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
|
39
|
Bashir R, Surian D, Dunn AG. The risk of conclusion change in systematic review updates can be estimated by learning from a database of published examples. J Clin Epidemiol 2019; 110:42-49. [PMID: 30849512 DOI: 10.1016/j.jclinepi.2019.02.015] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2018] [Revised: 01/25/2019] [Accepted: 02/26/2019] [Indexed: 01/11/2023]
Abstract
OBJECTIVES To determine which systematic review characteristics are needed to estimate the risk of conclusion change in systematic review updates. STUDY DESIGN AND SETTING We applied classification trees (a machine learning method) to model the risk of conclusion change in systematic review updates, using pairs of systematic reviews and their updates as samples. The classifiers were constructed using a set of features extracted from the systematic reviews and the relevant trials added in published updates. Model performance was measured by recall, precision, and area under the receiver operating characteristic curve (AUC). RESULTS We identified 63 pairs of systematic reviews and updates, of which 20 (32%) exhibited a change in conclusion in their updates. A classifier using information about new trials exhibited the highest performance (AUC: 0.71; recall: 0.75; precision: 0.43), compared with a classifier that used fewer features (AUC: 0.65; recall: 0.75; precision: 0.39). CONCLUSION When estimating the risk of conclusion change in systematic review updates, information about the sizes of trials that will be added in the update is most useful. Future tools aimed at signaling conclusion-change risk would benefit from complementary tools that automate the screening of relevant trials.
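The modelling setup can be sketched with scikit-learn's classification trees and the same three metrics. Everything below is a synthetic stand-in, not the 63 review-update pairs, and `max_depth` is an assumed setting.

```python
# Sketch of the modelling setup: a classification tree predicting whether a
# review update changes its conclusion, scored out-of-fold by AUC, recall,
# and precision. The feature matrix is synthetic; the paper's features
# describe the review and the trials added in the update.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score, recall_score, precision_score

rng = np.random.default_rng(0)
n = 63                                 # pairs of reviews and their updates
X = rng.normal(size=(n, 4))            # e.g. counts/sizes of added trials
y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0.8).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
proba = cross_val_predict(tree, X, y, cv=5, method="predict_proba")[:, 1]
pred = (proba > 0.5).astype(int)
print("AUC:", round(roc_auc_score(y, proba), 2))
print("recall:", round(recall_score(y, pred), 2))
print("precision:", round(precision_score(y, pred, zero_division=0), 2))
```

Scoring on cross-validated predictions, rather than on the training fit, matters with only 63 samples, since a tree can memorise a dataset this small.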
Affiliation(s)
- Rabia Bashir
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, New South Wales 2109, Australia
- Didi Surian
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, New South Wales 2109, Australia
- Adam G Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, New South Wales 2109, Australia; Computational Health Informatics Program, Boston Children's Hospital, Boston, MA 02115, USA
|
40
|
Bashir R, Dunn AG. Software engineering principles address current problems in the systematic review ecosystem. J Clin Epidemiol 2019; 109:136-141. [PMID: 30582972 DOI: 10.1016/j.jclinepi.2018.12.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2018] [Revised: 11/04/2018] [Accepted: 12/17/2018] [Indexed: 12/19/2022]
Abstract
Systematic reviewers are unable to produce systematic reviews fast enough to keep up with the availability of new trial evidence, while at the same time overproducing systematic reviews that are unlikely to change practice because they are redundant or biased. Although the transparency and completeness of trial reporting have improved with changes in policy and new technologies, systematic reviews have not yet benefited from the same level of effort. We found that new methods and tools used to automate aspects of systematic review processes have focused on improving the efficiency of individual systematic reviews rather than the efficiency of the entire ecosystem of systematic review production. We use software engineering principles to review challenges and opportunities for improving the interoperability, integrity, efficiency, and maintainability of that ecosystem. We conclude by recommending ways to improve access to structured systematic review results. Major opportunities for improving systematic reviews will come from new tools and changes in policy focused on doing the right systematic reviews, rather than just doing more of them faster.
Affiliation(s)
- Rabia Bashir
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Adam G Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
|
41
|
Schmitz T, Bukowski M, Koschmieder S, Schmitz-Rode T, Farkas R. Potential Technologies Review: A hybrid information retrieval framework to accelerate demand-pull innovation in biomedical engineering. Res Synth Methods 2019; 10:420-439. [PMID: 30995361 DOI: 10.1002/jrsm.1350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2018] [Revised: 02/01/2019] [Accepted: 04/11/2019] [Indexed: 11/11/2022]
Affiliation(s)
- Tom Schmitz
- Science Management, Institute of Applied Medical Engineering, RWTH Aachen University, Aachen, Germany
- Mark Bukowski
- Science Management, Institute of Applied Medical Engineering, RWTH Aachen University, Aachen, Germany
- Steffen Koschmieder
- Department of Hematology, Oncology, Hemostaseology, and Stem Cell Transplantation, RWTH Aachen University, Aachen, Germany
- Thomas Schmitz-Rode
- Institute of Applied Medical Engineering, RWTH Aachen University, Aachen, Germany
- Robert Farkas
- Science Management, Institute of Applied Medical Engineering, RWTH Aachen University, Aachen, Germany
|
42
|
Bannach-Brown A, Przybyła P, Thomas J, Rice ASC, Ananiadou S, Liao J, Macleod MR. Machine learning algorithms for systematic review: reducing workload in a preclinical review of animal studies and reducing human screening error. Syst Rev 2019; 8:23. [PMID: 30646959 PMCID: PMC6334440 DOI: 10.1186/s13643-019-0942-7] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/01/2018] [Accepted: 01/03/2019] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Here, we outline a method of applying existing machine learning (ML) approaches to aid citation screening in an on-going broad and shallow systematic review of preclinical animal studies. The aim is to achieve a high-performing algorithm comparable to human screening that can reduce human resources required for carrying out this step of a systematic review. METHODS We applied ML approaches to a broad systematic review of animal models of depression at the citation screening stage. We tested two independently developed ML approaches which used different classification models and feature sets. We recorded the performance of the ML approaches on an unseen validation set of papers using sensitivity, specificity and accuracy. We aimed to achieve 95% sensitivity and to maximise specificity. The classification model providing the most accurate predictions was applied to the remaining unseen records in the dataset and will be used in the next stage of the preclinical biomedical sciences systematic review. We used a cross-validation technique to assign ML inclusion likelihood scores to the human screened records, to identify potential errors made during the human screening process (error analysis). RESULTS ML approaches reached 98.7% sensitivity based on learning from a training set of 5749 records, with an inclusion prevalence of 13.2%. The highest level of specificity reached was 86%. Performance was assessed on an independent validation dataset. Human errors in the training and validation sets were successfully identified using the assigned inclusion likelihood from the ML model to highlight discrepancies. Training the ML algorithm on the corrected dataset improved the specificity of the algorithm without compromising sensitivity. Error analysis correction leads to a 3% improvement in sensitivity and specificity, which increases precision and accuracy of the ML algorithm. 
CONCLUSIONS This work has confirmed the performance and application of ML algorithms for screening in systematic reviews of preclinical animal studies. It has highlighted the novel use of ML algorithms to identify human error. This needs to be confirmed in other reviews with different inclusion prevalence levels, but represents a promising approach to integrating human decisions and automation in systematic review methodology.
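The error-analysis step, assigning each human-labelled record a cross-validated inclusion likelihood and flagging strong disagreements, can be sketched as follows. The data, classifier, and disagreement threshold are all illustrative assumptions, not the paper's settings.

```python
# Sketch of ML-assisted error analysis: cross-validated inclusion
# likelihoods flag human screening decisions the model strongly disagrees
# with. Features, labels, and the 0.9 threshold are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))        # e.g. text features of the records
y = (X[:, 0] > 1.0).astype(int)       # human include/exclude decisions
y[:3] = 1 - y[:3]                     # simulate a few human screening errors

# Out-of-fold probabilities, so each record is scored by a model that
# never saw its (possibly erroneous) label during training.
proba = cross_val_predict(LogisticRegression(), X, y,
                          cv=5, method="predict_proba")[:, 1]

# Records whose label strongly disagrees with the model's likelihood are
# queued for human re-checking, not automatically relabelled.
suspects = np.where(np.abs(y - proba) > 0.9)[0]
```

Using out-of-fold probabilities is the key design choice: a model trained on a record's own (possibly wrong) label would tend to agree with the error rather than expose it.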
Affiliation(s)
- Alexandra Bannach-Brown
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Translational Neuropsychiatry Unit, Aarhus University, Aarhus, Denmark
- Present address: Centre for Research in Evidence-Based Practice, Bond University, Gold Coast, Australia
- Piotr Przybyła
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, England
- James Thomas
- EPPI-Centre, Department of Social Science, University College London, London, England
- Andrew S. C. Rice
- Pain Research, Department of Surgery and Cancer, Imperial College, London, England
- Sophia Ananiadou
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, England
- Jing Liao
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
|
43
|
Bannach-Brown A, Przybyła P, Thomas J, Rice ASC, Ananiadou S, Liao J, Macleod MR. Machine learning algorithms for systematic review: reducing workload in a preclinical review of animal studies and reducing human screening error. Syst Rev 2019. [PMID: 30646959 DOI: 10.1186/s13643-019-0942-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/12/2023] Open
|
44
|
Martin P, Surian D, Bashir R, Bourgeois FT, Dunn AG. Trial2rev: Combining machine learning and crowd-sourcing to create a shared space for updating systematic reviews. JAMIA Open 2019; 2:15-22. [PMID: 31984340 PMCID: PMC6951914 DOI: 10.1093/jamiaopen/ooy062] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2018] [Revised: 12/05/2018] [Accepted: 12/07/2018] [Indexed: 01/15/2023] Open
Abstract
Objectives Systematic reviews of clinical trials could be updated faster by automatically monitoring relevant trials as they are registered, completed, and reported. Our aim was to provide a public interface to a database of curated links between systematic reviews and trial registrations. Materials and Methods We developed the server-side system components in Python, connected them to a PostgreSQL database, and implemented the web-based user interface using Javascript, HTML, and CSS. All code is available on GitHub under an open source MIT license and registered users can access and download all available data. Results The trial2rev system is a web-based interface to a database that collates and augments information from multiple sources including bibliographic databases, the ClinicalTrials.gov registry, and the actions of registered users. Users interact with the system by browsing, searching, or adding systematic reviews, verifying links to trials included in the review, and adding or voting on trials that they would expect to include in an update of the systematic review. The system can trigger the actions of software agents that add or vote on included and relevant trials, in response to user interactions or by scheduling updates from external resources. Discussion and Conclusion We designed a publicly-accessible resource to help systematic reviewers make decisions about systematic review updates. Where previous approaches have sought to reactively filter published reports of trials for inclusion in systematic reviews, our approach is to proactively monitor for relevant trials as they are registered and completed.
Affiliation(s)
- Paige Martin
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Didi Surian
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Rabia Bashir
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Florence T Bourgeois
- Computational Health Informatics Program, Children's Hospital Boston, Boston, Massachusetts, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Adam G Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
|
45
|
Przybyła P, Brockmeier AJ, Kontonatsios G, Le Pogam M, McNaught J, von Elm E, Nolan K, Ananiadou S. Prioritising references for systematic reviews with RobotAnalyst: A user study. Res Synth Methods 2018; 9:470-488. [PMID: 29956486 PMCID: PMC6175382 DOI: 10.1002/jrsm.1311] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2017] [Revised: 04/12/2018] [Accepted: 06/16/2018] [Indexed: 11/07/2022]
Abstract
Screening references is a time-consuming step necessary for systematic reviews and guideline development. Previous studies have shown that human effort can be reduced by using machine learning software to prioritise large reference collections such that most of the relevant references are identified before screening is completed. We describe and evaluate RobotAnalyst, a Web-based software system that combines text-mining and machine learning algorithms for organising references by their content and actively prioritising them based on a relevancy classification model trained and updated throughout the process. We report an evaluation over 22 reference collections (most are related to public health topics) screened using RobotAnalyst with a total of 43 610 abstract-level decisions. The number of references that needed to be screened to identify 95% of the abstract-level inclusions for the evidence review was reduced on 19 of the 22 collections. Significant gains over random sampling were achieved for all reviews conducted with active prioritisation, as compared with only two of five when prioritisation was not used. RobotAnalyst's descriptive clustering and topic modelling functionalities were also evaluated by public health analysts. Descriptive clustering provided more coherent organisation than topic modelling, and the content of the clusters was apparent to the users across a varying number of clusters. This is the first large-scale study using technology-assisted screening to perform new reviews, and the positive results provide empirical evidence that RobotAnalyst can accelerate the identification of relevant studies. The results also highlight the issue of user complacency and the need for a stopping criterion to realise the work savings.
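The work-saving evaluation described here, how far down a prioritised list one must screen to recover 95% of the inclusions, can be sketched directly. The ranking below is a toy example in which one "elusive" inclusion sits near the bottom of the list.

```python
# Sketch of the screening-burden metric: given labels in ranked order
# (1 = relevant), how many records must be screened to reach a target
# recall, and what fraction of screening is saved versus reading all.
def records_to_recall(ranked_labels, target=0.95):
    total = sum(ranked_labels)
    needed = target * total
    found = 0
    for i, lab in enumerate(ranked_labels, start=1):
        found += lab
        if found >= needed:
            return i
    return len(ranked_labels)

# Toy ranking: most inclusions near the top, one straggler near the end.
ranked = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
n95 = records_to_recall(ranked)
saving = 1 - n95 / len(ranked)
```

With this toy list, 95% recall requires all 6 inclusions, so the straggler at position 19 forces screening 19 of 20 records; this is why a stopping criterion, and hard-to-find relevant papers, dominate the realised savings.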
Affiliation(s)
- Piotr Przybyła
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Austin J. Brockmeier
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Georgios Kontonatsios
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Marie‐Annick Le Pogam
- Cochrane Switzerland, Institute of Social and Preventive Medicine, Lausanne University Hospital, Lausanne, Switzerland
- John McNaught
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Erik von Elm
- Cochrane Switzerland, Institute of Social and Preventive Medicine, Lausanne University Hospital, Lausanne, Switzerland
- Kay Nolan
- National Institute for Health and Care Excellence, Manchester, UK
- Sophia Ananiadou
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
46
Yamada T, Kamata R, Ishinohachi K, Shojima N, Ananiadou S, Noma H, Yamauchi T, Kadowaki T. Biosimilar vs originator insulins: Systematic review and meta-analysis. Diabetes Obes Metab 2018. [PMID: 29536603 DOI: 10.1111/dom.13291]
Abstract
Biosimilar insulins have expanded the treatment options for diabetes. We compared the clinical efficacy and safety of biosimilar insulins with those of originator insulins by conducting a meta-analysis. A random-effects meta-analysis was performed on randomized controlled trials comparing biosimilar and originator insulins in adults with diabetes. Studies were obtained by searching electronic databases up to December 2017. Ten trials, in a total of 4935 patients, were assessed (2 trials each on LY2963016, MK-1293, Mylan's insulin glargine and SAR342434, and 1 trial each on FFP-112 and Basalog). The meta-analysis found no differences between long-acting biosimilar and originator insulins with regard to reduction in glycated haemoglobin at 24 weeks (0.04%, 95% confidence interval [CI] -0.01, 0.08; P for efficacy = .14, I2 = 0%) or at 52 weeks (0.03%, 95% CI -0.04, 0.1), or reduction in fasting plasma glucose (0.08 mmol/L, 95% CI -0.36, 0.53), hypoglycaemia (odds ratio 0.99, 95% CI 0.96, 1.03), mortality, injection site reactions, insulin antibodies and allergic reactions. Analyses stratified by type of diabetes and prior insulin use yielded similar findings. Similarly, no significant differences were found between short-acting biosimilar and originator insulins. In summary, our meta-analysis showed no significant differences in clinical efficacy and safety, including immune reactions, between biosimilar and originator insulins. Biosimilar insulins can increase access to modern insulin therapy and reduce medical costs.
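The random-effects pooling named above can be sketched with the classic DerSimonian and Laird estimator, a common choice for this model (the abstract does not state which estimator the authors used, and the effect sizes and variances below are invented, not the paper's data):

```python
import math

def dersimonian_laird(effects, variances):
    """Pool per-study effects under a random-effects model (DerSimonian-Laird)."""
    w = [1 / v for v in variances]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q heterogeneity statistic around the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                       # between-study variance
    # Re-weight with the between-study variance added to each study's variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2

# Hypothetical HbA1c mean differences (%) and their variances
effects = [0.05, 0.02, 0.04]
variances = [0.001, 0.002, 0.0015]
pooled, se, tau2 = dersimonian_laird(effects, variances)
print(round(pooled, 3), tau2)  # here Q < df, so tau2 = 0 and the pooled
                               # estimate equals the fixed-effect one
```

When the between-study variance estimate is zero, as in this toy case, the random-effects result collapses to the fixed-effect inverse-variance average.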
Affiliation(s)
- Tomohide Yamada
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Ryuichi Kamata
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Kotomi Ishinohachi
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Nobuhiro Shojima
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Sophia Ananiadou
- National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, UK
- Hisashi Noma
- Department of Data Science, Institute of Statistical Mathematics, Tokyo, Japan
- Toshimasa Yamauchi
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Takashi Kadowaki
- Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, University of Tokyo, Tokyo, Japan
47
Surian D, Dunn AG, Orenstein L, Bashir R, Coiera E, Bourgeois FT. A shared latent space matrix factorisation method for recommending new trial evidence for systematic review updates. J Biomed Inform 2018; 79:32-40. [PMID: 29410356 DOI: 10.1016/j.jbi.2018.01.008]
Abstract
BACKGROUND Clinical trial registries can be used to monitor the production of trial evidence and signal when systematic reviews become out of date. However, this use has been limited to date due to the extensive manual review required to search for and screen relevant trial registrations. Our aim was to evaluate a new method that could partially automate the identification of trial registrations that may be relevant for systematic review updates. MATERIALS AND METHODS We identified 179 systematic reviews of drug interventions for type 2 diabetes, which included 537 clinical trials that had registrations in ClinicalTrials.gov. Text from the trial registrations was used as features directly, or transformed using Latent Dirichlet Allocation (LDA) or Principal Component Analysis (PCA). We tested a novel matrix factorisation approach that uses a shared latent space to learn how to rank relevant trial registrations for each systematic review, comparing the performance to document similarity to rank relevant trial registrations. The two approaches were tested on a holdout set of the newest trials from the set of type 2 diabetes systematic reviews and an unseen set of 141 clinical trial registrations from 17 updated systematic reviews published in the Cochrane Database of Systematic Reviews. The performance was measured by the number of relevant registrations found after examining 100 candidates (recall@100) and the median rank of relevant registrations in the ranked candidate lists. RESULTS The matrix factorisation approach outperformed the document similarity approach with a median rank of 59 (of 128,392 candidate registrations in ClinicalTrials.gov) and recall@100 of 60.9% using LDA feature representation, compared to a median rank of 138 and recall@100 of 42.8% in the document similarity baseline. In the second set of systematic reviews and their updates, the highest performing approach used document similarity and gave a median rank of 67 (recall@100 of 62.9%).
CONCLUSIONS A shared latent space matrix factorisation method was useful for ranking trial registrations to reduce the manual workload associated with finding relevant trials for systematic review updates. The results suggest that the approach could be used as part of a semi-automated pipeline for monitoring potentially new evidence for inclusion in a review update.
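The two evaluation measures used above, recall@k and the median rank of the relevant items in a ranked candidate list, are straightforward to compute. A minimal sketch with an invented ranking (trial IDs and relevance labels are made up for illustration):

```python
import statistics

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear among the top-k candidates."""
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

def median_rank(ranked, relevant):
    """Median 1-based position of the relevant items in the ranked list."""
    return statistics.median(ranked.index(r) + 1 for r in relevant)

# A toy ranked candidate list for one review; t1, t2 and t6 are the
# registrations that actually belong in the review update.
ranked = ["t4", "t1", "t9", "t2", "t7", "t5", "t3", "t8", "t6"]
relevant = ["t1", "t2", "t6"]

print(round(recall_at_k(ranked, relevant, 5), 2))  # 0.67: 2 of 3 in the top 5
print(median_rank(ranked, relevant))               # 4: ranks are 2, 4 and 9
```

A lower median rank and a higher recall@k both mean less manual screening before the relevant registrations are found, which is exactly the workload saving the paper reports.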
Affiliation(s)
- Didi Surian
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Adam G Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Liat Orenstein
- Computational Health Informatics Program, Boston Children's Hospital, Boston, United States
- Rabia Bashir
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Enrico Coiera
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Florence T Bourgeois
- Computational Health Informatics Program, Boston Children's Hospital, Boston, United States; Department of Pediatrics, Harvard Medical School, Boston, United States
48
Unreported links between trial registrations and published articles were identified using document similarity measures in a cross-sectional analysis of ClinicalTrials.gov. J Clin Epidemiol 2017; 95:94-101. [PMID: 29277557 DOI: 10.1016/j.jclinepi.2017.12.007]
Abstract
OBJECTIVES Trial registries can be used to measure reporting biases and support systematic reviews, but 45% of registrations do not provide a link to the article reporting on the trial. We evaluated the use of document similarity methods to identify unreported links between ClinicalTrials.gov and PubMed. STUDY DESIGN AND SETTING We extracted terms and concepts from a data set of 72,469 ClinicalTrials.gov registrations and 276,307 PubMed articles and tested methods for ranking articles across 16,005 reported links and 90 manually identified unreported links. Performance was measured by the median rank of matching articles and the proportion of unreported links that could be found by screening ranked candidate articles in order. RESULTS The best-performing concept-based representation produced a median rank of 3 (interquartile range [IQR] 1-21) for reported links and 3 (IQR 1-19) for the manually identified unreported links, and term-based representations produced a median rank of 2 (IQR 1-20) for reported links and 2 (IQR 1-12) in unreported links. The matching article was ranked first for 40% of registrations, and screening 50 candidate articles per registration identified 86% of the unreported links. CONCLUSION Leveraging the growth in the corpus of reported links between ClinicalTrials.gov and PubMed, we found that document similarity methods can assist in the identification of unreported links between trial registrations and corresponding articles.
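The term-based linking task reduces to ranking candidate articles by similarity to a registration's text. A minimal, library-free sketch with invented registration and article texts (the paper's actual representations also include extracted concepts and far larger corpora):

```python
import math
from collections import Counter

def bow(text):
    """Term-based bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# One registration and three candidate PubMed articles (all made up)
registration = "randomised trial of metformin for type 2 diabetes glycaemic control"
articles = {
    "pmid:111": "effect of metformin on glycaemic control in type 2 diabetes a randomised trial",
    "pmid:222": "hip fracture incidence in elderly cohorts a registry study",
    "pmid:333": "statin therapy and cardiovascular outcomes meta analysis",
}

reg_vec = bow(registration)
ranked = sorted(articles, key=lambda pid: cosine(bow(articles[pid]), reg_vec),
                reverse=True)
print(ranked[0])  # pmid:111, the true match, ranks first
```

Screening the ranked list top-down is what the paper's "screening 50 candidate articles per registration" evaluation simulates; a low rank for the true match means the unreported link is found quickly.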
49
Howard J, Piacentino J, MacMahon K, Schulte P. Using systematic review in occupational safety and health. Am J Ind Med 2017; 60:921-929. [PMID: 28944489 DOI: 10.1002/ajim.22771]
Abstract
Evaluation of scientific evidence is critical in developing recommendations to reduce risk. Healthcare was the first scientific field to employ a systematic review approach for synthesizing research findings to support evidence-based decision-making and it is still the largest producer and consumer of systematic reviews. Systematic reviews in the field of occupational safety and health are being conducted, but more widespread use and adoption would strengthen assessments. In 2016, NIOSH asked RAND to develop a framework for applying the traditional systematic review elements to the field of occupational safety and health. This paper describes how essential systematic review elements can be adapted for use in occupational systematic reviews to enhance their scientific quality, objectivity, transparency, reliability, utility, and acceptability.
Affiliation(s)
- John Howard
- National Institute for Occupational Safety and Health, Washington, District of Columbia
- John Piacentino
- National Institute for Occupational Safety and Health, Washington, District of Columbia
- Kathleen MacMahon
- National Institute for Occupational Safety and Health, Washington, District of Columbia
- Paul Schulte
- National Institute for Occupational Safety and Health, Washington, District of Columbia
50
Risk of bias reporting in the recent animal focal cerebral ischaemia literature. Clin Sci (Lond) 2017; 131:2525-2532. [PMID: 29026002 PMCID: PMC5869854 DOI: 10.1042/cs20160722]
Abstract
Background: Findings from in vivo research may be less reliable where studies do not report measures to reduce risks of bias. The experimental stroke community has been at the forefront of implementing changes to improve reporting, but it is not known whether these efforts are associated with continuous improvements. Our aims here were firstly to validate an automated tool to assess risks of bias in published works, and secondly to assess the reporting of measures taken to reduce the risk of bias within recent literature for two experimental models of stroke. Methods: We developed and used text analytic approaches to automatically ascertain reporting of measures to reduce risk of bias from full-text articles describing animal experiments inducing middle cerebral artery occlusion (MCAO) or modelling lacunar stroke. Results: Compared with previous assessments, there were improvements in the reporting of measures taken to reduce risks of bias in the MCAO literature but not in the lacunar stroke literature. Accuracy of automated annotation of risk of bias in the MCAO literature was 86% (randomization), 94% (blinding) and 100% (sample size calculation); and in the lacunar stroke literature accuracy was 67% (randomization), 91% (blinding) and 96% (sample size calculation). Discussion: There remains substantial opportunity for improvement in the reporting of animal research modelling stroke, particularly in the lacunar stroke literature. Further, automated tools perform sufficiently well to identify whether studies report blinded assessment of outcome, but improvements are required in the tools to ascertain whether randomization and a sample size calculation were reported.
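A crude version of the automated annotation task described above can be sketched with pattern matching: flag whether a methods text reports randomisation, blinded outcome assessment, or a sample size calculation. This is purely illustrative; the authors' tool uses trained text-analytic models rather than keyword rules, and the patterns and example text below are invented.

```python
import re

# Hypothetical surface patterns for the three risk-of-bias items
PATTERNS = {
    "randomization": r"\brandomi[sz]\w*",
    "blinding": r"\bblind\w*",
    "sample_size": r"\bsample size (calculation|was calculated)|\bpower (calculation|analysis)",
}

def annotate(text):
    """Return True/False per risk-of-bias item for a methods passage."""
    lowered = text.lower()
    return {item: bool(re.search(pat, lowered)) for item, pat in PATTERNS.items()}

methods = ("Animals were randomised to treatment groups and infarct volume "
           "was measured by an assessor blinded to group allocation.")
print(annotate(methods))
# {'randomization': True, 'blinding': True, 'sample_size': False}
```

Comparing such automatic flags against manual annotation per item is how the accuracy figures in the abstract (e.g. 86% for randomization in the MCAO literature) are obtained.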