1
Kon MWR, Rojas-Carabali W, Cifuentes-Gonzalez C, Agrawal R. Meta-mistake: are fragile meta-analyses in ophthalmology worth the high cost? Eye (Lond) 2024. [PMID: 39251888] [DOI: 10.1038/s41433-024-03331-7]
Affiliation(s)
- Mattias Wei Ren Kon
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- William Rojas-Carabali
- Department of Ophthalmology, Tan Tock Seng Hospital, Singapore, Singapore
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Rupesh Agrawal
- Department of Ophthalmology, Tan Tock Seng Hospital, Singapore, Singapore
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
2
Komoda DS, Cardoso MMDA, Fernandes BD, Visacri MB, Correa CRS. Artificial intelligence applied in human health technology assessment: a scoping review protocol. JBI Evid Synth 2024. [PMID: 39224910] [DOI: 10.11124/jbies-23-00377]
Abstract
OBJECTIVE This scoping review aims to map studies that applied artificial intelligence (AI) tools to perform health technology assessment tasks in human health care. The review also aims to understand specific processes in which the AI tools were applied and to comprehend the technical characteristics of these tools. INTRODUCTION Health technology assessment is a complex, time-consuming, and labor-intensive endeavor. The development of automation techniques using AI has opened up new avenues for accelerating such assessments in human health settings. This could potentially aid health technology assessment researchers and decision-makers to deliver higher quality evidence. INCLUSION CRITERIA This review will consider studies that assess the use of AI tools in any process of health technology assessment in human health. However, publications in which AI is a means of clinical aid, such as diagnostics or surgery, will be excluded. METHODS A search for relevant articles will be conducted in databases such as CINAHL (EBSCOhost), Embase (Ovid), MEDLINE (PubMed), Science Direct, Computer and Applied Sciences Complete (EBSCOhost), LILACS, Scopus, and Web of Science Core Collection. A search for gray literature will be conducted in GreyLit.Org, ProQuest Dissertations and Theses, Google Scholar, and the Google search engine. No language filters will be applied. Screening, selection, and data extraction will be performed by 2 independent reviewers. The results will be presented in graphic and tabular format, accompanied by a narrative summary. DETAILS OF THIS REVIEW CAN BE FOUND IN OPEN SCIENCE FRAMEWORK osf.io/3rm8g.
Affiliation(s)
- Denis Satoshi Komoda
- Department of Collective Health, Faculty of Medical Sciences, University of Campinas, Campinas, SP, Brazil
- Marilia Mastrocolla de Almeida Cardoso
- Brazilian Centre for Evidence-based Healthcare: A JBI Centre of Excellence, São Paulo, SP, Brazil
- Health Technology Assessment Center, Hospital das Clinicas of Medical School (FMB) of São Paulo State University (Unesp), Botucatu, SP, Brazil
- Marília Berlofa Visacri
- Department of Pharmacy, Faculty of Pharmaceutical Sciences, University of São Paulo, São Paulo, SP, Brazil
3
Forbes C, Greenwood H, Carter M, Clark J. Automation of duplicate record detection for systematic reviews: Deduplicator. Syst Rev 2024;13:206. [PMID: 39095913] [PMCID: PMC11295717] [DOI: 10.1186/s13643-024-02619-9]
Abstract
BACKGROUND This study describes the algorithm and investigates the efficacy of a novel systematic review automation tool, the Deduplicator, for removing duplicate records from a multi-database systematic review search. METHODS We constructed and tested the efficacy of the Deduplicator tool by using 10 previous Cochrane systematic review search results to compare the Deduplicator's 'balanced' algorithm to a semi-manual EndNote method. Two researchers each performed deduplication on the 10 libraries of search results. For five of those libraries, one researcher used the Deduplicator, while the other performed semi-manual deduplication with EndNote. They then switched methods for the remaining five libraries. In addition to this analysis, comparison between the three different Deduplicator algorithms ('balanced', 'focused' and 'relaxed') was performed on two datasets of previously deduplicated search results. RESULTS Before deduplication, the mean library size for the 10 systematic reviews was 1962 records. When using the Deduplicator, the mean time to deduplicate was 5 min per 1000 records compared to 15 min with EndNote. The mean error rate with Deduplicator was 1.8 errors per 1000 records in comparison to 3.1 with EndNote. Evaluation of the different Deduplicator algorithms found that the 'balanced' algorithm had the highest mean F1 score of 0.9647. The 'focused' algorithm had the highest mean accuracy of 0.9798 and the highest recall of 0.9757. The 'relaxed' algorithm had the highest mean precision of 0.9896. CONCLUSIONS This demonstrates that using the Deduplicator for duplicate record detection reduces the time taken to deduplicate, while maintaining or improving accuracy compared to using a semi-manual EndNote method. However, further research should be performed comparing more deduplication methods to establish the relative performance of the Deduplicator against other deduplication methods.
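The abstract above does not spell out how its matching algorithms work. As an illustrative sketch only (not the Deduplicator's actual algorithm), duplicate citation records are commonly flagged by normalizing a few fields and comparing string similarity; the record contents and threshold here are hypothetical:

```python
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and strip punctuation so case/formatting differences don't block a match."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()

def is_duplicate(rec_a, rec_b, threshold=0.9):
    """Flag two records as likely duplicates when the publication years match
    and the normalized titles are highly similar."""
    if rec_a.get("year") != rec_b.get("year"):
        return False
    similarity = SequenceMatcher(
        None, normalize(rec_a["title"]), normalize(rec_b["title"])
    ).ratio()
    return similarity >= threshold

records = [
    {"title": "Automation of duplicate record detection", "year": 2024},
    {"title": "Automation of Duplicate Record Detection.", "year": 2024},
    {"title": "A different study entirely", "year": 2024},
]
print(is_duplicate(records[0], records[1]))  # True: same title up to case/punctuation
print(is_duplicate(records[0], records[2]))  # False
```

Tuning the threshold trades precision against recall, which is presumably what distinguishes 'relaxed', 'balanced' and 'focused' style settings.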
Affiliation(s)
- Connor Forbes
- Institute for Evidence-Based Healthcare, Bond University, Gold Coast, Australia
- Hannah Greenwood
- Institute for Evidence-Based Healthcare, Bond University, Gold Coast, Australia
- Matt Carter
- Institute for Evidence-Based Healthcare, Bond University, Gold Coast, Australia
- Justin Clark
- Institute for Evidence-Based Healthcare, Bond University, Gold Coast, Australia
4
Khraisha Q, Put S, Kappenberg J, Warraitch A, Hadfield K. Can large language models replace humans in systematic reviews? Evaluating GPT-4's efficacy in screening and extracting data from peer-reviewed and grey literature in multiple languages. Res Synth Methods 2024;15:616-626. [PMID: 38484744] [DOI: 10.1002/jrsm.1715]
Abstract
Systematic reviews are vital for guiding practice, research and policy, although they are often slow and labour-intensive. Large language models (LLMs) could speed up and automate systematic reviews, but their performance in such tasks has yet to be comprehensively evaluated against humans, and no study has tested Generative Pre-Trained Transformer (GPT)-4, the biggest LLM so far. This pre-registered study uses a "human-out-of-the-loop" approach to evaluate GPT-4's capability in title/abstract screening, full-text review and data extraction across various literature types and languages. Although GPT-4 had accuracy on par with human performance in some tasks, results were skewed by chance agreement and dataset imbalance. Adjusting for these caused performance scores to drop across all stages: for data extraction, performance was moderate, and for screening, it ranged from none in highly balanced literature datasets (~1:1) to moderate in those datasets where the ratio of inclusion to exclusion in studies was imbalanced (~1:3). When screening full-text literature using highly reliable prompts, GPT-4's performance was more robust, reaching "human-like" levels. Although our findings indicate that, currently, substantial caution should be exercised if LLMs are being used to conduct systematic reviews, they also offer preliminary evidence that, for certain review tasks delivered under specific conditions, LLMs can rival human performance.
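The "chance agreement" adjustment mentioned above is typically done with a chance-corrected statistic such as Cohen's kappa. A minimal sketch (not the paper's exact analysis; the include/exclude decisions below are invented) shows how raw agreement can look high on an imbalanced screening set while the chance-corrected score is modest:

```python
def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters' include/exclude decisions."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both raters labelled independently at their observed rates
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Imbalanced screening set: high raw agreement, much of it from chance
human = ["exclude"] * 18 + ["include", "include"]
model = ["exclude"] * 17 + ["include", "exclude", "include"]

raw = sum(a == b for a, b in zip(human, model)) / len(human)
print(round(raw, 2), round(cohens_kappa(human, model), 2))  # 0.9 raw, 0.44 kappa
```

With 90% of records excluded by both raters, 90% raw agreement corresponds to a kappa of only about 0.44, which is why the adjusted scores in the study drop.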
Affiliation(s)
- Qusai Khraisha
- Trinity Centre for Global Health, Trinity College Dublin, Dublin, Ireland
- School of Psychology, Trinity College Dublin, Dublin, Ireland
- Sophie Put
- Department of Education, York University, York, UK
- Azza Warraitch
- Trinity Centre for Global Health, Trinity College Dublin, Dublin, Ireland
- School of Psychology, Trinity College Dublin, Dublin, Ireland
- Kristin Hadfield
- Trinity Centre for Global Health, Trinity College Dublin, Dublin, Ireland
- School of Psychology, Trinity College Dublin, Dublin, Ireland
5
Braud SC, Treger D, Lizardi JJ, Boghosian T, El Abd R, Arakelians A, Jabori SK, Thaller SR. The Top 100 Most-Cited Publications in Clinical Craniofacial Research. J Craniofac Surg 2024;35:1372-1378. [PMID: 38709050] [DOI: 10.1097/scs.0000000000010185]
Abstract
INTRODUCTION Craniosynostosis is a birth defect defined as premature closure of sutures leading to possible neurological deficits and cosmetic deformities. Most of the current literature to date focuses on craniosynostosis etiology by analyzing genetics. This paper is a bibliometric analysis of the most influential works related to the clinical management of craniosynostosis to help guide clinicians in their decision-making. METHODS AND MATERIALS Clarivate Web of Science database was used to identify the top 100 most-cited articles addressing the clinical management of craniosynostosis. A bibliometric review was performed to analyze publication metrics and track research trends. RESULTS The 100 most-cited publications pertaining to craniosynostosis management were cited a cumulative 12,779 times. The highest cited article was Shillito and colleagues' "Craniosynostosis: A Review Of 519 Surgical Patients" with 352 citations. The oldest clinical craniosynostosis article dates back to 1948, and the most recent was published in 2016. The year with the most clinical-focused publications was 2011. The most prolific author was Renier, D. The United States produced 56 of the 100 articles. Most articles (n=52) were level 3 evidence. DISCUSSION This bibliometric evaluation of craniosynostosis provides insight into the most impactful literature on this topic. The highest cited articles retrospectively analyze large sample sizes, outline proper evaluation, discuss intervention timelines, and highlight specific treatment plans for this birth defect. By filtering through existing literature, this analysis can guide clinicians on the management of craniosynostosis to maximize patient outcomes.
Affiliation(s)
- Savannah C Braud
- Florida Atlantic University Schmidt College of Medicine, Boca Raton, FL
- Dylan Treger
- Department of Education, The University of Miami Leonard M. Miller School of Medicine, Miami, FL
- Juan J Lizardi
- Department of Education, The University of Miami Leonard M. Miller School of Medicine, Miami, FL
- Rawan El Abd
- Division of Plastic and Reconstructive Surgery, McGill University Health Centre, Montreal, QC, Canada
- Aris Arakelians
- Department of Education, The University of Miami Leonard M. Miller School of Medicine, Miami, FL
- Sinan K Jabori
- Division of Plastic Surgery, University of Miami Hospital, Dewitt Daughtry Department of Surgery, Miami, FL
- Seth R Thaller
- Division of Plastic Surgery, University of Miami Hospital, Dewitt Daughtry Department of Surgery, Miami, FL
6
Cuker A, Kunkle R, Bercovitz RS, Byrne M, Djulbegovic B, Haberichter SL, Holter-Chakrabarty J, Lottenberg R, Pai M, Rezende SM, Seftel MD, Silverstein RL, Terrell DR, Cheung MC. Distinguishing ASH clinical practice guidelines from other forms of ASH clinical advice. Blood Adv 2024;8:2960-2963. [PMID: 38593461] [PMCID: PMC11302374] [DOI: 10.1182/bloodadvances.2023011102]
Abstract
The American Society of Hematology (ASH) develops a variety of resources that provide guidance to clinicians on the diagnosis and management of blood diseases. These resources include clinical practice guidelines (CPGs) and other forms of clinical advice. Although both ASH CPGs and other forms of clinical advice provide recommendations, they differ with respect to the methods underpinning their development, the principal type of recommendations they offer, their transparency and concordance with published evidence, and the time and resources required for their development. It is crucial that end users be aware of the differences between CPGs and other forms of clinical advice and that producers and publishers of these resources use clear and unambiguous terminology to facilitate their distinction. The objective of this article is to highlight the similarities and differences between ASH CPGs and other forms of ASH clinical advice and discuss the implications of these differences for end users.
Affiliation(s)
- Adam Cuker
- Department of Medicine and Department of Pathology & Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Rachel S. Bercovitz
- Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL
- Benjamin Djulbegovic
- Division of Hematology/Oncology, Department of Medicine, Medical University of South Carolina, Charleston, SC
- Sandra L. Haberichter
- Versiti Diagnostic Labs and Blood Research Institute, Milwaukee, WI
- Department of Pediatrics, Medical College of Wisconsin, Milwaukee, WI
- Jennifer Holter-Chakrabarty
- Department of Medicine, BMT and Cellular Therapy, Stephenson Cancer Center, The University of Oklahoma, Oklahoma City, OK
- Menaka Pai
- Department of Medicine, McMaster University, Hamilton, Canada
- Suely M. Rezende
- Department of Internal Medicine, Faculty of Medicine, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Matthew D. Seftel
- Canadian Blood Services, The University of British Columbia, Vancouver, Canada
- Roy L. Silverstein
- Department of Medicine, Medical College of Wisconsin and Versiti Blood Research Institute, Milwaukee, WI
- Deirdra R. Terrell
- Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK
- Matthew C. Cheung
- Sunnybrook Health Sciences Centre, University of Toronto, Toronto, Canada
7
Du J, Soysal E, Wang D, He L, Lin B, Wang J, Manion FJ, Li Y, Wu E, Yao L. Machine learning models for abstract screening task - A systematic literature review application for health economics and outcome research. BMC Med Res Methodol 2024;24:108. [PMID: 38724903] [PMCID: PMC11080200] [DOI: 10.1186/s12874-024-02224-3]
Abstract
OBJECTIVE Systematic literature reviews (SLRs) are critical for life-science research. However, the manual selection and retrieval of relevant publications can be a time-consuming process. This study aims to (1) develop two disease-specific annotated corpora, one for human papillomavirus (HPV) associated diseases and the other for pneumococcal-associated pediatric diseases (PAPD), and (2) optimize machine- and deep-learning models to facilitate automation of the SLR abstract screening. METHODS This study constructed two disease-specific SLR screening corpora for HPV and PAPD, which contained citation metadata and corresponding abstracts. Performance was evaluated using precision, recall, accuracy, and F1-score of multiple combinations of machine- and deep-learning algorithms and features such as keywords and MeSH terms. RESULTS AND CONCLUSIONS The HPV corpus contained 1697 entries, with 538 relevant and 1159 irrelevant articles. The PAPD corpus included 2865 entries, with 711 relevant and 2154 irrelevant articles. Adding features beyond title and abstract improved the performance (measured in accuracy) of machine learning models by 3% for the HPV corpus and 2% for the PAPD corpus. Transformer-based deep learning models consistently outperformed conventional machine learning algorithms, highlighting the strength of domain-specific pre-trained language models for SLR abstract screening. This study provides a foundation for the development of more intelligent SLR systems.
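The four metrics named in the abstract above follow directly from the confusion counts of a relevant-vs-irrelevant screening classifier. A brief sketch; the counts are invented for illustration (only their total, 1697, and the relevant count, 538, echo the HPV corpus):

```python
def screening_metrics(tp, fp, fn, tn):
    """Precision, recall, accuracy and F1 for relevant-vs-irrelevant screening."""
    precision = tp / (tp + fp)               # how many flagged abstracts were relevant
    recall = tp / (tp + fn)                  # how many relevant abstracts were found
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}

# Hypothetical screen of 1697 abstracts (538 truly relevant)
metrics = screening_metrics(tp=480, fp=120, fn=58, tn=1039)
print({k: round(v, 3) for k, v in metrics.items()})
# → {'precision': 0.8, 'recall': 0.892, 'accuracy': 0.895, 'f1': 0.844}
```

Because irrelevant articles dominate both corpora (roughly 2:1 and 3:1), accuracy alone can look strong even when recall on the relevant class is weak, which is why all four metrics are reported.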
Affiliation(s)
- Ekin Soysal
- Intelligent Medical Objects, Houston, TX, USA
- McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Long He
- Intelligent Medical Objects, Houston, TX, USA
- Bin Lin
- Intelligent Medical Objects, Houston, TX, USA
- Jingqi Wang
- Intelligent Medical Objects, Houston, TX, USA
- Yeran Li
- Merck & Co., Inc, Rahway, NJ, USA
- Elise Wu
- Merck & Co., Inc, Rahway, NJ, USA
8
De Silva DTN, Moore BR, Strunk T, Petrovski M, Varis V, Chai K, Ng L, Batty K. Development of a pharmaceutical science systematic review process using a semi-automated machine learning tool: Intravenous drug compatibility in the neonatal intensive care setting. Pharmacol Res Perspect 2024;12:e1170. [PMID: 38204432] [PMCID: PMC10782215] [DOI: 10.1002/prp2.1170]
Abstract
Our objective was to establish and test a machine learning-based screening process that would be applicable to systematic reviews in pharmaceutical sciences. We used the SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type) model, a broad search strategy, and a machine learning tool (Research Screener) to identify relevant references related to y-site compatibility of 95 intravenous drugs used in neonatal intensive care settings. Two independent reviewers conducted pilot studies, including manual screening and evaluation of Research Screener, and used the kappa-coefficient for inter-reviewer reliability. After initial deduplication of the search strategy results, 27 597 references were available for screening. Research Screener excluded 1735 references, including 451 duplicate titles and 1269 reports with no abstract/title, which were manually screened. The remainder (25 862) were subject to the machine learning screening process. All eligible articles for the systematic review were extracted from <10% of the references available for screening. Moderate inter-reviewer reliability was achieved, with kappa-coefficient ≥0.75. Overall, 324 references were subject to full-text reading and 118 were deemed relevant for the systematic review. Our study showed that a broad search strategy to optimize the literature captured for systematic reviews can be efficiently screened by the semi-automated machine learning tool, Research Screener.
Affiliation(s)
- Brioni R. Moore
- Curtin Medical School, Curtin University, Perth, Western Australia, Australia
- Curtin Health Innovation Research Institute, Curtin University, Perth, Western Australia, Australia
- Medical School, The University of Western Australia, Crawley, Western Australia, Australia
- Wesfarmers Centre for Vaccines and Infectious Diseases, Telethon Kids Institute, Nedlands, Western Australia, Australia
- Tobias Strunk
- Medical School, The University of Western Australia, Crawley, Western Australia, Australia
- Wesfarmers Centre for Vaccines and Infectious Diseases, Telethon Kids Institute, Nedlands, Western Australia, Australia
- Neonatal Directorate, King Edward Memorial Hospital, Child and Adolescent Health Service, Subiaco, Western Australia, Australia
- Michael Petrovski
- Pharmacy Department, King Edward Memorial Hospital, Women and Newborn Health Service, Subiaco, Western Australia, Australia
- Vanessa Varis
- University Library, Curtin University, Perth, Western Australia, Australia
- Kevin Chai
- School of Population Health, Curtin University, Perth, Western Australia, Australia
- Leo Ng
- Curtin School of Allied Health, Curtin University, Perth, Western Australia, Australia
- School of Health Sciences, Swinburne University of Technology, Hawthorn, Victoria, Australia
- Kevin T. Batty
- Curtin Medical School, Curtin University, Perth, Western Australia, Australia
- Curtin Health Innovation Research Institute, Curtin University, Perth, Western Australia, Australia
9
Rajit D, Johnson A, Callander E, Teede H, Enticott J. Learning health systems and evidence ecosystems: a perspective on the future of evidence-based medicine and evidence-based guideline development. Health Res Policy Syst 2024;22:4. [PMID: 38178086] [PMCID: PMC10768258] [DOI: 10.1186/s12961-023-01095-2]
Abstract
Despite forming the cornerstone of modern clinical practice for decades, implementation of evidence-based medicine at scale remains a crucial challenge for health systems. As a result, there has been a growing need for conceptual models to better contextualise and pragmatize the use of evidence-based medicine, particularly in tandem with patient-centred care. In this commentary, we highlight the emergence of the learning health system as one such model and analyse its potential role in pragmatizing both evidence-based medicine and patient-centred care. We apply the learning health system lens to contextualise the key activity of evidence-based guideline development and implementation, and highlight how current inefficiencies and bottlenecks in the evidence synthesis phase of evidence-based guideline development threaten downstream adherence. Lastly, we introduce the evidence ecosystem as a complementary model to learning health systems, and propose how innovative developments from the evidence ecosystem may be integrated with learning health systems to better enable health impact at speed and scale.
Affiliation(s)
- D Rajit
- Monash Centre for Health Research and Implementation, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Level 1, 43-51 Kanooka Grove, Melbourne, VIC, 3168, Australia
- A Johnson
- Monash Partners Academic Health Sciences Centre, Melbourne, VIC, Australia
- E Callander
- Monash Centre for Health Research and Implementation, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Level 1, 43-51 Kanooka Grove, Melbourne, VIC, 3168, Australia
- Monash Partners Academic Health Sciences Centre, Melbourne, VIC, Australia
- School of Public Health, Faculty of Health, University of Technology Sydney, Sydney, NSW, Australia
- H Teede
- Monash Centre for Health Research and Implementation, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Level 1, 43-51 Kanooka Grove, Melbourne, VIC, 3168, Australia
- Monash Partners Academic Health Sciences Centre, Melbourne, VIC, Australia
- Monash Health Endocrinology and Diabetes Departments, Melbourne, Australia
- J Enticott
- Monash Centre for Health Research and Implementation, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Level 1, 43-51 Kanooka Grove, Melbourne, VIC, 3168, Australia
- Monash Partners Academic Health Sciences Centre, Melbourne, VIC, Australia
10
Panayi A, Ward K, Benhadji-Schaff A, Ibanez-Lopez AS, Xia A, Barzilay R. Evaluation of a prototype machine learning tool to semi-automate data extraction for systematic literature reviews. Syst Rev 2023;12:187. [PMID: 37803451] [PMCID: PMC10557215] [DOI: 10.1186/s13643-023-02351-w]
Abstract
BACKGROUND Evidence-based medicine requires synthesis of research through rigorous and time-intensive systematic literature reviews (SLRs), with significant resource expenditure for data extraction from scientific publications. Machine learning may enable the timely completion of SLRs and reduce errors by automating data identification and extraction. METHODS We evaluated the use of machine learning to extract data from publications related to SLRs in oncology (SLR 1) and Fabry disease (SLR 2). SLR 1 predominantly contained interventional studies and SLR 2 observational studies. Predefined key terms and data were manually annotated to train and test bidirectional encoder representations from transformers (BERT) and bidirectional long-short-term memory machine learning models. Using human annotation as a reference, we assessed the ability of the models to identify biomedical terms of interest (entities) and their relations. We also pretrained BERT on a corpus of 100,000 open access clinical publications and/or enhanced context-dependent entity classification with a conditional random field (CRF) model. Performance was measured using the F1 score, a metric that combines precision and recall. We defined successful matches as partial overlap of entities of the same type. RESULTS For entity recognition, the pretrained BERT+CRF model had the best performance, with an F1 score of 73% in SLR 1 and 70% in SLR 2. Entity types identified with the highest accuracy were metrics for progression-free survival (SLR 1, F1 score 88%) or for patient age (SLR 2, F1 score 82%). Treatment arm dosage was identified less successfully (F1 scores 60% [SLR 1] and 49% [SLR 2]). The best-performing model for relation extraction, pretrained BERT relation classification, exhibited F1 scores higher than 90% in cases with at least 80 relation examples for a pair of related entity types. 
CONCLUSIONS The performance of BERT is enhanced by pretraining with biomedical literature and by combining with a CRF model. With refinement, machine learning may assist with manual data extraction for SLRs.
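The "partial overlap of entities of the same type" criterion used in the evaluation above can be made concrete. A simplified sketch, assuming entities are represented as (type, start, end) character spans; the entity types and offsets are hypothetical, and this is not the authors' exact implementation:

```python
def spans_overlap(pred, gold):
    """A predicted entity partially matches a gold entity when the types
    agree and the character spans share at least one offset."""
    p_type, p_start, p_end = pred
    g_type, g_start, g_end = gold
    return p_type == g_type and p_start < g_end and g_start < p_end

def f1_partial(predicted, gold):
    """F1 where any same-type partial span overlap counts as a match."""
    matched_pred = sum(any(spans_overlap(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(spans_overlap(p, g) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [("AGE", 10, 22), ("DOSE", 40, 52)]
pred = [("AGE", 12, 25), ("DOSE", 60, 70)]  # AGE overlaps gold; DOSE span misses
print(round(f1_partial(pred, gold), 2))  # 0.5: one of two entities matched each way
```

Partial-overlap matching is more forgiving than exact-span matching, so the reported F1 scores should be read against that definition.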
Affiliation(s)
- Antonia Panayi
- Takeda Pharmaceuticals International AG, Thurgauerstrasse 130, 8152, Glattpark-Opfikon, Zurich, Switzerland
- Andrew Xia
- Takeda Pharmaceuticals International AG, Thurgauerstrasse 130, 8152, Glattpark-Opfikon, Zurich, Switzerland
11
Petticrew M, Glover RE, Volmink J, Blanchard L, Cott É, Knai C, Maani N, Thomas J, Tompson A, van Schalkwyk MCI, Welch V. The Commercial Determinants of Health and Evidence Synthesis (CODES): methodological guidance for systematic reviews and other evidence syntheses. Syst Rev 2023;12:165. [PMID: 37710334] [PMCID: PMC10503085] [DOI: 10.1186/s13643-023-02323-0]
Abstract
BACKGROUND The field of the commercial determinants of health (CDOH) refers to the commercial products, pathways and practices that may affect health. The field is growing rapidly, as evidenced by the WHO programme on the economic and commercial determinants of health and a rise in researcher and funder interest. Systematic reviews (SRs) and evidence synthesis more generally will be crucial tools in the evolution of CDOH as a field. Such reviews can draw on existing methodological guidance, though there are areas where existing methods are likely to differ, and there is no overarching guidance on the conduct of CDOH-focussed systematic reviews, or guidance on the specific methodological and conceptual challenges. METHODS/RESULTS CODES provides guidance on the conduct of systematic reviews focussed on CDOH, from shaping the review question with input from stakeholders, to disseminating the review. Existing guidance was used to identify key stages and to provide a structure for the guidance. The writing group included experience in systematic reviews and other forms of evidence synthesis, and in equity and CDOH research (both primary research and systematic reviews). CONCLUSIONS This guidance highlights the special methodological and other considerations for CDOH reviews, including equity considerations, and pointers to areas for future methodological and guideline development. It should contribute to the reliability and utility of CDOH reviews and help stimulate the production of reviews in this growing field.
Affiliation(s)
- Mark Petticrew
- Faculty of Public Health and Policy, LSHTM, London, WC1H 9SH, UK
- Rebecca E Glover
- Faculty of Public Health and Policy, LSHTM, London, WC1H 9SH, UK
- Jimmy Volmink
- Department of Global Health, Faculty of Medicine and Health Sciences, Stellenbosch University, Stellenbosch, South Africa
- Cécile Knai
- Faculty of Public Health and Policy, LSHTM, London, WC1H 9SH, UK
- Nason Maani
- Global Health Policy Unit, School of Social and Political Science, University of Edinburgh, Edinburgh, EH8 9LD, UK
- James Thomas
- UCL Institute of Education, University College London, 20 Bedford Way, London, WC1H 0AL, UK
- Alice Tompson
- Faculty of Public Health and Policy, LSHTM, London, WC1H 9SH, UK
- Vivian Welch
- Bruyère Research Institute, University of Ottawa, Ottawa, Canada
12
Gebrye T, Fatoye F, Mbada C, Hakimi Z. A scoping review on quality assessment tools used in systematic reviews and meta-analysis of real-world studies. Rheumatol Int 2023;43:1573-1581. [PMID: 37326665] [PMCID: PMC10348931] [DOI: 10.1007/s00296-023-05354-x]
Abstract
Risk of bias tools is important in identifying inherent methodical flaws and for generating evidence in studies involving systematic reviews (SRs) and meta-analyses (MAs), hence the need for sensitive and study-specific tools. This study aimed to review quality assessment (QA) tools used in SRs and MAs involving real-world data. Electronic databases involving PubMed, Allied and Complementary Medicine Database, Cumulated Index to Nursing and Allied Health Literature, and MEDLINE were searched for SRs and MAs involving real-world data. Search was delimited to articles published in English, and between inception to 20 of November 2022 following the SRs and MAs extension for scoping checklist. Sixteen articles on real-world data published between 2016 and 2021 that reported their methodological quality met the inclusion criteria. Seven of these articles were observational studies, while the others were of interventional type. Overall, 16 QA tools were identified. Except one, all the QA tools employed in SRs and MAs involving real-world data are generic, and only three of these were validated. Generic QA tools are mostly used for real-world data SRs and MAs, while no validated and reliable specific tool currently exist. Thus, there is need for a standardized and specific QA tool of SRs and MAs for real-world data.
Affiliation(s)
- Tadesse Gebrye
- Department of Health Professions, Faculty of Health, Psychology, and Social Care, Manchester Metropolitan University, Brooks Building, Birley Fields Campus, 53 Bonsall Street, Manchester M15 6GX, UK
- Francis Fatoye
- Department of Health Professions, Faculty of Health, Psychology, and Social Care, Manchester Metropolitan University, Brooks Building, Birley Fields Campus, 53 Bonsall Street, Manchester M15 6GX, UK
- Lifestyle Diseases, Faculty of Health Sciences, North-West University, Mahikeng, South Africa
- Chidozie Mbada
- Department of Health Professions, Faculty of Health, Psychology, and Social Care, Manchester Metropolitan University, Brooks Building, Birley Fields Campus, 53 Bonsall Street, Manchester M15 6GX, UK
13
Qureshi R, Shaughnessy D, Gill KAR, Robinson KA, Li T, Agai E. Are ChatGPT and large language models "the answer" to bringing us closer to systematic review automation? Syst Rev 2023;12:72. [PMID: 37120563] [PMCID: PMC10148473] [DOI: 10.1186/s13643-023-02243-z]
Abstract
In this commentary, we discuss ChatGPT and our perspectives on its utility for systematic reviews (SRs), based on the appropriateness and applicability of its responses to SR-related prompts. The advancement of artificial intelligence (AI)-assisted technologies leaves many wondering about the current capabilities, limitations, and opportunities for integrating AI into scientific endeavors. Large language models (LLMs), such as ChatGPT, developed by OpenAI, have recently gained widespread attention for their ability to respond to various prompts in a natural-sounding way. SRs utilize secondary data and often require many months and substantial financial resources to complete, making them attractive grounds for developing AI-assistive technologies. On February 6, 2023, PICO Portal developers hosted a webinar to explore ChatGPT's responses to tasks related to SR methodology. Our experience exploring ChatGPT's responses suggests that while ChatGPT and LLMs show some promise for aiding SR-related tasks, the technology is in its infancy and needs considerable development for such applications. Furthermore, non-content experts should use these tools with great caution: much of the output appears valid at a high level, yet much is erroneous and in need of active vetting.
Affiliation(s)
- Riaz Qureshi
- University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- PICO Portal, New York, NY, USA
- Kayden A R Gill
- PICO Portal, New York, NY, USA
- University of Pittsburgh, Pittsburgh, PA, USA
- Karen A Robinson
- PICO Portal, New York, NY, USA
- Johns Hopkins University, Baltimore, MD, USA
- Tianjing Li
- University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- PICO Portal, New York, NY, USA
14
|
Trepanier L, Hébert C, Stamoulos C, Reyes A, MacIntosh H, Beauchamp S, Larivée S, Dagenais C, Drapeau M. The quality of four psychology practice guidelines using the Appraisal of Guidelines for Research and Evaluation (AGREE) II. J Eval Clin Pract 2022;28:1138-1146. [PMID: 35599434] [DOI: 10.1111/jep.13699]
Abstract
RATIONALE, AIMS AND OBJECTIVES Clinical practice guidelines (CPGs) have been shown to improve healthcare services and clinical outcomes. However, they are useful resources only to the degree that they are developed according to the most rigorous standards. Multiple studies have demonstrated significant variability between CPGs with regard to specific indicators of quality. The Ordre des psychologues du Québec (OPQ), the college of psychologists of Quebec, has published several CPGs intended to provide empirically supported guidance for psychologists in the areas of assessment, diagnosis, general functioning, treatment, and other decision-making support. The aim of this study was to evaluate the quality of these CPGs. METHODS The Appraisal of Guidelines for Research and Evaluation II (AGREE II) instrument was used to assess the quality of the CPGs. RESULTS Our results show that although there have been some modest improvements in the quality of the CPGs over time, there are important methodological inadequacies in all CPGs evaluated. CONCLUSIONS The findings of this study demonstrate the need for more methodological rigour in CPG development; as such, recommendations to improve CPG quality are discussed.
Affiliation(s)
- Lyane Trepanier
- Department of Counselling Psychology, McGill University, Montreal, Quebec, Canada
- Catherine Hébert
- Department of Counselling Psychology, McGill University, Montreal, Quebec, Canada
- Andrea Reyes
- Department of Counselling Psychology, McGill University, Montreal, Quebec, Canada
- Sylvie Beauchamp
- Department of Counselling Psychology, McGill University, Montreal, Quebec, Canada
- Direction des affaires universitaires, de l'enseignement et de la recherche, Montreal West Island Integrated University Health and Social Services Center, Montreal, Quebec, Canada
- Serge Larivée
- School of Psychoeducation, University of Montreal, Montreal, Quebec, Canada
- Christian Dagenais
- Department of Psychology, University of Montreal, Montreal, Quebec, Canada
- Martin Drapeau
- Department of Counselling Psychology, McGill University, Montreal, Quebec, Canada
- Department of Psychiatry, McGill University, Montreal, Quebec, Canada
15
|
Uthman OA, Court R, Enderby J, Al-Khudairy L, Nduka C, Mistry H, Melendez-Torres GJ, Taylor-Phillips S, Clarke A. Increasing comprehensiveness and reducing workload in a systematic review of complex interventions using automated machine learning. Health Technol Assess 2022. [PMID: 36562494] [PMCID: PMC10068584] [DOI: 10.3310/udir6682]
Abstract
BACKGROUND As part of our ongoing systematic review of complex interventions for the primary prevention of cardiovascular diseases, we have developed and evaluated automated machine-learning classifiers for title and abstract screening. The aim was to develop a high-performing algorithm comparable to human screening. METHODS We followed a three-phase process to develop and test an automated machine learning-based classifier for screening potential studies on interventions for primary prevention of cardiovascular disease. We labelled a total of 16,611 articles during the first phase of the project. In the second phase, we used the labelled articles to develop a machine learning-based classifier, then examined the performance of the classifiers in correctly labelling the papers. We evaluated the performance of five deep-learning models [i.e., parallel convolutional neural network (CNN), stacked CNN, parallel-stacked CNN, recurrent neural network (RNN), and CNN-RNN]. The models were evaluated using recall, precision, and work saved over sampling at no less than 95% recall. RESULTS Of the 16,611 labelled articles, 676 (4.0%) were tagged as 'relevant' and 15,935 (96%) as 'irrelevant'. Recall ranged from 51.9% to 96.6%, precision from 64.6% to 99.1%, and work saved over sampling from 8.9% to as high as 92.1%. The best-performing model was the parallel CNN, yielding 96.4% recall, 99.1% precision, and a potential workload reduction of 89.9%. FUTURE WORK AND LIMITATIONS We used words from the title and abstract only. More work is needed to examine possible changes in performance, such as adding the full document text as a feature. The approach might also not transfer to other complex systematic reviews on different topics.
CONCLUSION Our study shows that machine learning has the potential to significantly aid the labour-intensive screening of abstracts in systematic reviews of complex interventions. Future research should concentrate on enhancing the classifier system and determining how it can be integrated into the systematic review workflow. FUNDING This project was funded by the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme and will be published in Health Technology Assessment. See the NIHR Journals Library website for further project information.
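The evaluation metrics named in this abstract (recall, precision, and work saved over sampling, WSS) have standard definitions; a minimal Python sketch with toy labels (the function name and example numbers are illustrative, not from the study):

```python
def screening_metrics(y_true, y_pred):
    """Recall, precision, and work saved over sampling (WSS).

    y_true / y_pred are lists of 0/1 labels (1 = relevant article).
    WSS = (TN + FN) / N - (1 - recall): work saved relative to
    screening a random sample achieving the same recall.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    wss = (tn + fn) / len(y_true) - (1 - recall)
    return recall, precision, wss

# Toy example: 10 articles, 3 relevant; the classifier flags 3 articles,
# catching 2 of the 3 relevant ones.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
recall, precision, wss = screening_metrics(y_true, y_pred)
```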
Affiliation(s)
- Rachel Court
- Warwick Medical School, University of Warwick, Coventry, UK
- Jodie Enderby
- Warwick Medical School, University of Warwick, Coventry, UK
- Chidozie Nduka
- Warwick Medical School, University of Warwick, Coventry, UK
- Hema Mistry
- Warwick Medical School, University of Warwick, Coventry, UK
- G J Melendez-Torres
- Peninsula Technology Assessment Group (PenTAG), College of Medicine and Health, University of Exeter, Exeter, UK
- Aileen Clarke
- Warwick Medical School, University of Warwick, Coventry, UK
16
|
Cowie K, Rahmatullah A, Hardy N, Holub K, Kallmes K. Web-Based Software Tools for Systematic Literature Review in Medicine: Systematic Search and Feature Analysis. JMIR Med Inform 2022;10:e33219. [PMID: 35499859] [PMCID: PMC9112080] [DOI: 10.2196/33219]
Abstract
BACKGROUND Systematic reviews (SRs) are central to evaluating therapies but have high costs in terms of both time and money. Many software tools exist to assist with SRs, but most do not support the full process, and the transparency and replicability of SRs depend on performing and presenting evidence according to established best practices. OBJECTIVE This study aims to provide a basis for comparing and selecting among web-based software tools that support SRs by conducting a feature-by-feature comparison of SR tools. METHODS We searched for SR tools by reviewing any such tool listed in the SR Toolbox, previous reviews of SR tools, and qualitative Google searching. We included all SR tools that were currently functional and required no coding, and excluded reference managers, desktop applications, and statistical software. The list of features to assess was populated by combining all features assessed in 4 previous reviews of SR tools; we also added 5 features (manual addition, screening automation, dual extraction, living review, and public outputs) that were independently noted as best practices or as enhancements of transparency and replicability. Then, 2 reviewers assigned binary present or absent assessments to all SR tools with respect to all features, and a third reviewer adjudicated all disagreements. RESULTS Of the 53 SR tools found, 55% (29/53) were excluded, leaving 45% (24/53) for assessment. In total, 30 features were assessed across 6 classes, and the interobserver agreement was 86.46%. Giotto Compliance (27/30, 90%), DistillerSR (26/30, 87%), and Nested Knowledge (26/30, 87%) support the most features, followed by EPPI-Reviewer Web (25/30, 83%), LitStream (23/30, 77%), JBI SUMARI (21/30, 70%), and SRDB.PRO (VTS Software) (21/30, 70%).
Fewer than half of all the features assessed are supported by 7 tools: RobotAnalyst (National Centre for Text Mining), SRDR (Agency for Healthcare Research and Quality), SyRF (Systematic Review Facility), Data Abstraction Assistant (Center for Evidence Synthesis in Health), SR Accelerator (Institute for Evidence-Based Healthcare), RobotReviewer (RobotReviewer), and COVID-NMA (COVID-NMA). Notably, of the 24 tools, only 10 (42%) support direct search, only 7 (29%) offer dual extraction, and only 13 (54%) offer living/updatable reviews. CONCLUSIONS DistillerSR, Nested Knowledge, and EPPI-Reviewer Web each offer a high density of SR-focused web-based tools. Through transparent comparison and discussion of SR tool functionality, the medical community can both choose among existing software offerings and note the areas where growth is needed, most notably in support for living reviews.
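The dual-assessment workflow described in the methods (two reviewers make binary judgements, a third adjudicates disagreements, and simple percent agreement is reported) can be sketched in a few lines of Python; the feature names and values here are hypothetical, not the study's data:

```python
from typing import Dict

def percent_agreement(r1: Dict[str, bool], r2: Dict[str, bool]) -> float:
    """Share of shared feature judgements on which two reviewers agree."""
    features = r1.keys() & r2.keys()
    agree = sum(1 for f in features if r1[f] == r2[f])
    return agree / len(features)

def adjudicate(r1, r2, r3):
    """Third reviewer settles each disagreement between reviewers 1 and 2."""
    return {f: r1[f] if r1[f] == r2[f] else r3[f] for f in r1}

# Hypothetical present/absent calls for one tool.
r1 = {"direct_search": True, "dual_extraction": False, "living_review": True}
r2 = {"direct_search": True, "dual_extraction": True,  "living_review": True}
r3 = {"direct_search": True, "dual_extraction": False, "living_review": False}

pa = percent_agreement(r1, r2)   # 2 of 3 judgements agree
final = adjudicate(r1, r2, r3)   # the one disagreement goes to reviewer 3
```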
Affiliation(s)
- Karl Holub
- Nested Knowledge, Saint Paul, MN, United States
17
|
Lunny C, Reid EK, Neelakant T, Chen A, Zhang JH, Shinger G, Stevens A, Tasnim S, Sadeghipouya S, Adams S, Zheng YW, Lin L, Yang PH, Dosanjh M, Ngsee P, Ellis U, Shea BJ, Wright JM. A new taxonomy was developed for overlap across 'overviews of systematic reviews': A meta-research study of research waste. Res Synth Methods 2022;13:315-329. [PMID: 34927388] [PMCID: PMC9303867] [DOI: 10.1002/jrsm.1542]
Abstract
Multiple 'overviews of reviews' conducted on the same topic ("overlapping overviews") represent a waste of research resources and can confuse clinicians choosing amongst competing treatments. We aimed to assess the frequency and characteristics of overlapping overviews. MEDLINE, Epistemonikos, and the Cochrane Database of Systematic Reviews were searched for overviews that synthesized reviews of health interventions and conducted systematic searches. Overlap was defined as duplication of PICO eligibility criteria that was not reported as an update or a replication. We categorized overview topics according to 22 WHO ICD-10 medical classifications, overviews as broad or narrow in scope, and overlap as identical, nearly identical, partial, or subsumed. Subsumption was defined as a broad overview subsuming the populations, interventions, and at least one outcome of another overview. Of 541 overviews included, 169 (31%) overlapped across similar PICO criteria, falling within 13 WHO ICD-10 medical classifications and 62 topics. Of the 169 overlapping overviews, 148 (88%) were broad in scope. Fifteen overviews (9%) were classified as having nearly identical overlap, 123 (73%) partial overlap, and 31 (18%) as subsumed by others. One third of overviews overlapped in content, and a majority covered broad topic areas. A multiplicity of overviews on the same topic adds to the ongoing waste of research resources, time, and effort across medical disciplines. Authors of overviews can use this study and its sample of overviews to identify gaps in the evidence for future analysis, as well as topics that are already studied and need not be duplicated.
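The overlap taxonomy described here (identical, nearly identical, partial, subsumed) maps naturally onto set relations between two overviews' PICO elements. A simplified Python sketch of that idea, not the authors' actual coding scheme; the PICO terms and the Jaccard threshold for "nearly identical" are illustrative assumptions:

```python
def classify_overlap(pico_a: set, pico_b: set) -> str:
    """Rough overlap category for two overviews' PICO element sets.

    A simplification of the taxonomy in the study: identical criteria,
    one set subsuming the other, near-identical, partial, or none.
    The 0.8 Jaccard cut-off is an illustrative assumption.
    """
    if pico_a == pico_b:
        return "identical"
    shared = pico_a & pico_b
    if not shared:
        return "none"
    if pico_b <= pico_a or pico_a <= pico_b:
        return "subsumed"  # one overview's PICO fully contains the other's
    jaccard = len(shared) / len(pico_a | pico_b)
    if jaccard >= 0.8:
        return "nearly identical"
    return "partial"

a = {"adults", "exercise", "usual care", "blood pressure", "quality of life"}
b = {"adults", "exercise", "usual care", "blood pressure"}
assert classify_overlap(a, b) == "subsumed"
```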
Affiliation(s)
- Carole Lunny
- Cochrane Hypertension Review Group, Therapeutics Initiative, Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Trish Neelakant
- Cochrane Hypertension Review Group, Therapeutics Initiative, Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Royal College of Surgeons, Ireland
- Alyssa Chen
- Cochrane Hypertension Review Group, Therapeutics Initiative, Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Jia He Zhang
- Cochrane Hypertension Review Group, Therapeutics Initiative, Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Gavindeep Shinger
- Faculty of Pharmaceutical Science, University of British Columbia, Vancouver, British Columbia, Canada
- Adrienne Stevens
- Michael G. DeGroote Cochrane Canada Centre, Department of Health Research Methods, Evidence, and Impact, McMaster University, Ontario, Canada
- Sara Tasnim
- Cochrane Hypertension Review Group, Therapeutics Initiative, Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Shadi Sadeghipouya
- Faculty of Pharmaceutical Science, University of British Columbia, Vancouver, British Columbia, Canada
- Stephen Adams
- Cochrane Hypertension Review Group, Therapeutics Initiative, Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Yi Wen Zheng
- Faculty of Pharmaceutical Science, University of British Columbia, Vancouver, British Columbia, Canada
- Lester Lin
- Faculty of Pharmaceutical Science, University of British Columbia, Vancouver, British Columbia, Canada
- Pei Hsuan Yang
- Faculty of Pharmaceutical Science, University of British Columbia, Vancouver, British Columbia, Canada
- Manpreet Dosanjh
- Faculty of Pharmaceutical Science, University of British Columbia, Vancouver, British Columbia, Canada
- Peter Ngsee
- Faculty of Pharmaceutical Science, University of British Columbia, Vancouver, British Columbia, Canada
- Ursula Ellis
- Woodward Library, University of British Columbia, Vancouver, British Columbia, Canada
- Beverley J. Shea
- Clinical Epidemiology Program, Ottawa Hospital Research Institute, University of Ottawa, Ontario, Canada
- James M. Wright
- Cochrane Hypertension Review Group, Therapeutics Initiative, Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
18
|
Chai KEK, Lines RLJ, Gucciardi DF, Ng L. Research Screener: a machine learning tool to semi-automate abstract screening for systematic reviews. Syst Rev 2021;10:93. [PMID: 33795003] [PMCID: PMC8017894] [DOI: 10.1186/s13643-021-01635-3]
Abstract
BACKGROUND Systematic reviews and meta-analyses provide the highest level of evidence to help inform policy and practice, yet their rigorous nature comes with significant time and economic demands. The screening of titles and abstracts is the most time-consuming part of the review process, with analysts required to review thousands of articles manually, taking on average 33 days. New technologies aimed at streamlining the screening process have yielded promising initial findings, yet current approaches have limitations, and barriers to the widespread use of these tools remain. In this paper, we introduce and report initial evidence on the utility of Research Screener, a semi-automated machine learning tool to facilitate abstract screening. METHODS Three sets of analyses (simulation, interactive, and sensitivity) were conducted to provide evidence of the utility of the tool through both simulated and real-world examples. RESULTS Research Screener delivered a workload saving of between 60% and 96% across nine systematic reviews and two scoping reviews. Findings from the real-world interactive analysis demonstrated a time saving of 12.53 days compared with manual screening, which equates to a financial saving of USD 2444. Conservatively, our results suggest that analysts who scan 50% of the total pool of articles identified via a systematic search are highly likely to have identified 100% of eligible papers. CONCLUSIONS In light of these findings, Research Screener can reduce the burden for researchers wishing to conduct a comprehensive systematic review without reducing the scientific rigour for which they strive.
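The workload-saving claim above rests on a simple simulation idea: rank the pool by predicted relevance, screen top-down, and measure how far down the ranking the last relevant record sits. A minimal sketch of that calculation, assuming a toy ranking rather than any data from the study:

```python
def fraction_screened_for_full_recall(ranked_relevance):
    """Fraction of a ranked pool that must be screened top-down
    before every relevant record (label 1) has been seen."""
    last_relevant = max(i for i, rel in enumerate(ranked_relevance) if rel == 1)
    return (last_relevant + 1) / len(ranked_relevance)

# Toy ranking from a relevance model: relevant records pushed near the top.
ranking = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
frac = fraction_screened_for_full_recall(ranking)
saving = 1 - frac  # share of the pool the analyst never has to screen
```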
Affiliation(s)
- Kevin E K Chai
- Curtin Institute for Computation, Curtin University, Perth, Australia
- School of Population Health, Curtin University, Perth, Australia
- Robin L J Lines
- School of Allied Health, Curtin University, Perth, Australia
- Leo Ng
- School of Allied Health, Curtin University, Perth, Australia
19
|
Rapid reviews: A critical perspective. Z Evid Fortbild Qual Gesundhwes 2020;158-159:22-27. [PMID: 33229254] [DOI: 10.1016/j.zefq.2020.09.005]
Abstract
BACKGROUND The high scientific uncertainty of many far-reaching and serious political decisions during the "coronavirus crisis" underlines the enormous importance of evidence syntheses that are quickly available and, at the same time, reliable. As systematic reviews can only insufficiently fulfil these requirements due to the time they take, abbreviated evidence syntheses in the form of rapid reviews are becoming increasingly popular. PURPOSE This commentary aims to enhance methodological discussion and research about abbreviated evidence syntheses. METHODS A selective literature search and evaluation focussing on research dealing with rapid reviews. RESULTS In rapid reviews, a wide variety of methods can be used to speed up literature search and evaluation while maintaining the principles of methodological quality and transparent reporting. But do rapid reviews currently deliver what they promise? We discuss the increasing trend towards rapid reviews and give the currently available evidence on the topic some critical reflection. From this discussion, we finally derive demands that go beyond the topic of rapid reviews alone.
20
|
Alharbi A, Stevenson M. Refining Boolean queries to identify relevant studies for systematic review updates. J Am Med Inform Assoc 2020;27:1658-1666. [PMID: 33067630] [PMCID: PMC7750994] [DOI: 10.1093/jamia/ocaa148]
Abstract
OBJECTIVE Systematic reviews are important in health care but are expensive to produce and maintain. The authors explore the use of automated transformations of Boolean queries to improve the identification of relevant studies for updates to systematic reviews. MATERIALS AND METHODS A set of query transformations, including operator substitution, query expansion, and query reduction, were used to iteratively modify the Boolean query used for the original systematic review. The most effective transformation at each stage is identified using information about the studies included and excluded from the original review. A dataset consisting of 22 systematic reviews was used for evaluation. Updated queries were evaluated using the included and excluded studies from the updated version of the review. Recall and precision were used as evaluation measures. RESULTS The updated queries were more effective than the ones used for the original review, in terms of both precision and recall. The overall number of documents retrieved was reduced by more than half, while the number of relevant documents found increased by 10.3%. CONCLUSIONS Identification of relevant studies for updates to systematic reviews can be carried out more effectively by using information about the included and excluded studies from the original review to produce improved Boolean queries. These updated queries reduce the overall number of documents retrieved while also increasing the number of relevant documents identified, thereby representing a considerable reduction in effort required by systematic reviewers.
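The iterative refinement described here (apply candidate transformations to the Boolean query, then keep whichever scores best against the original review's included and excluded studies) can be sketched with a toy query-reduction transformation. This is a highly simplified illustration of the idea, not the authors' implementation; documents are modelled as term sets and queries as ANDed OR-groups, and all terms are invented:

```python
# A Boolean query in conjunctive form: each inner set is an OR group
# of synonyms, and the groups are ANDed together.

def matches(query, doc_terms):
    """A document matches if every OR group shares a term with it."""
    return all(group & doc_terms for group in query)

def evaluate(query, included, excluded):
    """Recall over known included studies; precision over all retrieved."""
    tp = sum(matches(query, d) for d in included)
    fp = sum(matches(query, d) for d in excluded)
    recall = tp / len(included)
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

def reduce_query(query):
    """Query-reduction transformation: drop one OR group at a time."""
    for i in range(len(query)):
        yield query[:i] + query[i + 1:]

included = [{"statin", "adults"}, {"statin", "elderly"}]
excluded = [{"statin", "mice"}]
query = [{"statin"}, {"adults"}]  # "statin AND adults"

# Pick the candidate with the best (recall, precision); the original query
# misses the "elderly" study, so dropping the population group wins.
best = max(reduce_query(query), key=lambda q: evaluate(q, included, excluded))
```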
Affiliation(s)
- Amal Alharbi
- Computer Science Department, University of Sheffield, Sheffield, United Kingdom
- Mark Stevenson
- Computer Science Department, University of Sheffield, Sheffield, United Kingdom
21
|
Tran BX, Nghiem S, Afoakwah C, Ha GH, Doan LP, Nguyen TP, Le TT, Latkin CA, Ho CSH, Ho RCM. Global mapping of interventions to improve the quality of life of patients with cardiovascular diseases during 1990-2018. Health Qual Life Outcomes 2020;18:254. [PMID: 32727479] [PMCID: PMC7391613] [DOI: 10.1186/s12955-020-01507-9]
Abstract
BACKGROUND Cardiovascular diseases (CVDs) are global health problems that place a substantial burden on patients and society. Assessing the quality of life (QOL) of CVD patients is critical for evaluating the effectiveness of CVD treatments and for identifying potential areas for enhancing health outcomes. Using a combination of a bibliometric approach and content analysis, publication trends and common topics regarding interventions to improve the QOL of CVD patients were identified and characterized to inform priority setting and policy development. METHODS Bibliographic data of publications from 1990 to 2018 on interventions to improve the QOL of CVD patients were retrieved from Web of Science. Network graphs illustrating term co-occurrence clusters were created with the VOSviewer software. A Latent Dirichlet Allocation approach was adopted to classify papers into major research topics. RESULTS A total of 6457 papers were analyzed. We found a substantial increase in the number of publications, citations, and download counts in the last 5 years, and a rise in the number of papers on interventions to improve quality of life among patients with CVD during 1990-2018. Conventional therapies (surgery and medication) and psychological and behavioral interventions were common research topics, whereas papers evaluating economic effectiveness were comparatively few. CONCLUSIONS The research areas identified emphasize the importance of interdisciplinary and inter-sectoral approaches in both evaluation and intervention. Future research should focus on the economic evaluation of interventions as well as on interventions to reduce mental health issues among people with CVD.
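The term co-occurrence networks mentioned in the methods start from a simple count: how often two index terms appear in the same record. A minimal stdlib sketch of that counting step (the records and terms are invented; the full VOSviewer clustering and the LDA topic model are beyond a short example):

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(records):
    """Count how often pairs of terms appear in the same record:
    the raw edge weights for a co-occurrence network."""
    pairs = Counter()
    for terms in records:
        # sort so each unordered pair gets one canonical key
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical per-record term lists.
records = [
    ["surgery", "quality of life", "cardiovascular"],
    ["medication", "quality of life", "cardiovascular"],
    ["surgery", "cardiovascular"],
]
edges = cooccurrence_counts(records)
# ("cardiovascular", "surgery") co-occurs in 2 records
```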
Affiliation(s)
- Bach Xuan Tran
- Department of Health Economics, Institute for Preventive Medicine and Public Health, Hanoi Medical University, No. 1 Ton That Tung Street, Dong Da, Hanoi, Vietnam
- Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Son Nghiem
- Centre for Applied Health Economics (CAHE), Griffith University, Brisbane, Australia
- Clifford Afoakwah
- Centre for Applied Health Economics (CAHE), Griffith University, Brisbane, Australia
- Giang Hai Ha
- Institute for Global Health Innovations, Duy Tan University, Da Nang, Vietnam
- Linh Phuong Doan
- Center of Excellence in Evidence-based Medicine, Nguyen Tat Thanh University, Ho Chi Minh City, Vietnam
- Thao Phuong Nguyen
- Center of Excellence in Behavioral Medicine, Nguyen Tat Thanh University, Ho Chi Minh City, Vietnam
- Tuan Thanh Le
- Echo-lab, Vietnam National Heart Institute, Bach Mai Hospital, Hanoi, Vietnam
- Carl A. Latkin
- Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Cyrus S. H. Ho
- Department of Psychological Medicine, National University Hospital, Singapore, Singapore
- Roger C. M. Ho
- Center of Excellence in Behavioral Medicine, Nguyen Tat Thanh University, Ho Chi Minh City, Vietnam
- Department of Psychological Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Institute for Health Innovation and Technology (iHealthtech), National University of Singapore, Singapore, Singapore