1
Lokker C, Abdelkader W, Bagheri E, Parrish R, Cotoi C, Navarro T, Germini F, Linkins LA, Haynes RB, Chu L, Afzal M, Iorio A. Boosting efficiency in a clinical literature surveillance system with LightGBM. PLOS Digital Health 2024; 3:e0000299. [PMID: 39312500 PMCID: PMC11419392 DOI: 10.1371/journal.pdig.0000299] [Received: 06/18/2023] [Accepted: 08/14/2024] [Indexed: 09/25/2024]
Abstract
Given the suboptimal performance of Boolean searching to identify methodologically sound and clinically relevant studies in large bibliographic databases, exploring machine learning (ML) to efficiently classify studies is warranted. To boost the efficiency of a literature surveillance program, we used a large internationally recognized dataset of articles tagged for methodological rigor and applied an automated ML approach to train and test binary classification models to predict the probability of clinical research articles being of high methodologic quality. We trained over 12,000 models on a dataset of titles and abstracts of 97,805 articles indexed in PubMed from 2012-2018 which were manually appraised for rigor by highly trained research associates and rated for clinical relevancy by practicing clinicians. As the dataset is unbalanced, with more articles that do not meet the criteria for rigor, we used the unbalanced dataset and over- and under-sampled datasets. Models that maintained sensitivity for high rigor at 99% and maximized specificity were selected and tested in a retrospective set of 30,424 articles from 2020 and validated prospectively in a blinded study of 5253 articles. The final selected algorithm, combining a LightGBM (gradient boosting machine) model trained in each dataset, maintained high sensitivity and achieved 57% specificity in the retrospective validation test and 53% in the prospective study. The number of articles needed to read to find one that met appraisal criteria was 3.68 (95% CI 3.52 to 3.85) in the prospective study, compared with 4.63 (95% CI 4.50 to 4.77) when relying only on Boolean searching. Gradient-boosting ML models reduced the work required to classify high quality clinical research studies by 45%, improving the efficiency of literature surveillance and subsequent dissemination to clinicians and other evidence users.
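The model-selection rule described above (hold sensitivity for high rigor at 99%, then maximize specificity) reduces to picking a probability cutoff on a validation set, and the "number of articles needed to read" is the reciprocal of precision at that cutoff. A minimal sketch on toy scores, not the authors' code; `pick_cutoff` and `number_needed_to_read` are illustrative names:

```python
import math

def pick_cutoff(scores, labels, min_sens=0.99):
    """Highest cutoff whose sensitivity is still >= min_sens.

    scores: predicted probabilities of high methodological rigor;
    labels: 1 = article passed critical appraisal.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    # We must keep at least ceil(min_sens * n_pos) true positives at or above
    # the cutoff, so the cutoff is the score of the last one we must accept.
    keep = math.ceil(min_sens * len(positives))
    return positives[keep - 1]

def number_needed_to_read(scores, labels, cutoff):
    """Flagged articles read per true high-quality article found (1 / precision)."""
    flagged = [y for s, y in zip(scores, labels) if s >= cutoff]
    return len(flagged) / sum(flagged)

# Toy example: three truly high-quality articles and two that failed appraisal.
scores = [0.9, 0.8, 0.7, 0.95, 0.6]
labels = [1, 1, 1, 0, 0]
cutoff = pick_cutoff(scores, labels)                  # -> 0.7
nnr = number_needed_to_read(scores, labels, cutoff)   # -> 4/3
```

With a higher cutoff the classifier rejects more non-rigorous articles (better specificity and a lower NNR) at the cost of missed positives, which is why the authors fix sensitivity first and only then optimize specificity.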
Affiliation(s)
- Cynthia Lokker
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Wael Abdelkader
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Elham Bagheri
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Rick Parrish
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Chris Cotoi
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Tamara Navarro
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Federico Germini
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Lori-Ann Linkins
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- R. Brian Haynes
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Lingyang Chu
- Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada
- Muhammad Afzal
- School of Computing and Digital Technology, Birmingham City University, Birmingham, United Kingdom
- Alfonso Iorio
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
2
Jin L, Li C, Zhu Z, Zou S, Sun X. Mining LDA topics on construction engineering change risks based on graded evidence. PLoS One 2024; 19:e0303424. [PMID: 38900821 PMCID: PMC11189256 DOI: 10.1371/journal.pone.0303424] [Received: 11/10/2023] [Accepted: 04/25/2024] [Indexed: 06/22/2024]
Abstract
Engineering change (EC) risk may negatively impact project schedule, cost, quality, and stakeholder satisfaction. However, existing methods for managing EC risk have shortcomings in evidence selection and do not adequately consider the quality and reliability of the evidence associated with EC risks. Evidence grading plays a crucial role in ensuring the reliability of decisions related to EC risks and can provide essential scientific and reliability support for decision-making. To explore the potential risks associated with architectural engineering changes (ECs) and identify the most significant ones, this study proposed a methodology that combines evidence grading theory with Latent Dirichlet Allocation (LDA) topic analysis. Initially, evidence-based grading theory served as the basis for creating a grading table of evidence sources related to EC risk; specifically, we categorized the evidence sources into three levels based on their credibility. Subsequently, we selected evidence with higher credibility levels for textual analysis using the LDA topic model. This involved analyzing regulations, industry standards, and judgment documents related to EC, ultimately identifying the themes associated with EC risks. In addition, by combining EC risk topics with relevant literature, we identified factors influencing EC risks, and we designed an expert survey questionnaire to determine the key risks and important risk topics. The results show that by synthesizing information from both Class A and Class B evidence, five prominent risk themes were identified: contract, technology, funds, personnel, and other hazards. Among these, technical risk scored highest, indicating that it is the most important; its key risks are engineering design defects, errors, and omissions.
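The topic-mining step can be illustrated with a minimal collapsed-Gibbs LDA sampler. This is a dependency-free toy sketch, not the pipeline used in the study (which analyzed graded regulations, standards, and judgment documents); `lda_topics` and its hyperparameters are illustrative assumptions:

```python
import random
from collections import defaultdict

def lda_topics(docs, n_topics, alpha=0.1, beta=0.01, iters=100, seed=0):
    """Tiny collapsed-Gibbs LDA; returns the top words of each topic."""
    rng = random.Random(seed)
    vocab_size = len({w for d in docs for w in d})
    doc_topic = [[0] * n_topics for _ in docs]                # n_dk counts
    topic_word = [defaultdict(int) for _ in range(n_topics)]  # n_kw counts
    topic_total = [0] * n_topics                              # n_k counts
    assign = []
    for di, doc in enumerate(docs):               # random initialization
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[di][k] += 1; topic_word[k][w] += 1; topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for di, doc in enumerate(docs):
            for wi, w in enumerate(doc):
                k = assign[di][wi]                # remove token's assignment
                doc_topic[di][k] -= 1; topic_word[k][w] -= 1; topic_total[k] -= 1
                # Sample from the full conditional p(z = k | everything else).
                weights = [(doc_topic[di][j] + alpha)
                           * (topic_word[j][w] + beta)
                           / (topic_total[j] + vocab_size * beta)
                           for j in range(n_topics)]
                k = rng.choices(range(n_topics), weights)[0]
                assign[di][wi] = k
                doc_topic[di][k] += 1; topic_word[k][w] += 1; topic_total[k] += 1
    return [sorted(tw, key=tw.get, reverse=True)[:3] for tw in topic_word]

# Toy corpus: contract-themed vs design-defect-themed "documents".
docs = [["contract", "clause", "contract"], ["design", "defect", "design"],
        ["contract", "clause"], ["design", "defect"]]
topics = lda_topics(docs, n_topics=2)
```

In the study, each discovered topic (e.g. contract, technology, funds) is then matched against the literature to enumerate concrete risk factors.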
Affiliation(s)
- Lianghai Jin
- Hubei Key Laboratory of Hydro-Power Construction and Management, China Three Gorges University, Yichang, Hubei, China
- College of Hydraulic and Environmental Engineering, China Three Gorges University, Yichang, Hubei, China
- Construction Law Appraisal Center, China Three Gorges University, Yichang, Hubei, China
- Safety Production Standardization Review Center, China Three Gorges University, Yichang, Hubei, China
- Chenxi Li
- Hubei Key Laboratory of Hydro-Power Construction and Management, China Three Gorges University, Yichang, Hubei, China
- College of Hydraulic and Environmental Engineering, China Three Gorges University, Yichang, Hubei, China
- Zhongrong Zhu
- Hubei Key Laboratory of Hydro-Power Construction and Management, China Three Gorges University, Yichang, Hubei, China
- College of Hydraulic and Environmental Engineering, China Three Gorges University, Yichang, Hubei, China
- Safety Production Standardization Review Center, China Three Gorges University, Yichang, Hubei, China
- Songxiang Zou
- Hubei Key Laboratory of Hydro-Power Construction and Management, China Three Gorges University, Yichang, Hubei, China
- College of Hydraulic and Environmental Engineering, China Three Gorges University, Yichang, Hubei, China
- Xushu Sun
- Hubei Key Laboratory of Hydro-Power Construction and Management, China Three Gorges University, Yichang, Hubei, China
- College of Hydraulic and Environmental Engineering, China Three Gorges University, Yichang, Hubei, China
3
Lokker C, Bagheri E, Abdelkader W, Parrish R, Afzal M, Navarro T, Cotoi C, Germini F, Linkins L, Haynes RB, Chu L, Iorio A. Deep Learning to Refine the Identification of High-Quality Clinical Research Articles from the Biomedical Literature: Performance Evaluation. J Biomed Inform 2023; 142:104384. [PMID: 37164244 DOI: 10.1016/j.jbi.2023.104384] [Received: 11/15/2022] [Revised: 04/24/2023] [Accepted: 05/03/2023] [Indexed: 05/12/2023]
Abstract
BACKGROUND Identifying practice-ready evidence-based journal articles in medicine is a challenge due to the sheer volume of biomedical research publications. Newer approaches to support evidence discovery apply deep learning techniques to improve the efficiency and accuracy of classifying sound evidence. OBJECTIVE To determine how well deep learning models using variants of Bidirectional Encoder Representations from Transformers (BERT) identify high-quality evidence with high clinical relevance from the biomedical literature for consideration in clinical practice. METHODS We fine-tuned variations of BERT models (BERTBASE, BioBERT, BlueBERT, and PubMedBERT) and compared their performance in classifying articles based on methodological quality criteria. The dataset used for fine-tuning included titles and abstracts of >160,000 PubMed records from 2012-2020 that were relevant to human health and had been manually labeled against established critical appraisal criteria for methodological rigor. The data was randomly divided into 80:10:10 sets for training, validation, and testing. In addition to the full unbalanced set, the training data was randomly undersampled into four balanced datasets to assess performance and select the best performing model. For each of the four sets, one model that maintained sensitivity (recall) at ≥99% was selected, and the four models were ensembled. The best performing model was evaluated in a prospective, blinded test and applied to an established reference standard, the Clinical Hedges dataset. RESULTS In training, three of the four selected best performing models were trained using BioBERTBASE. The ensembled model did not boost performance compared with the best individual model, so a solo BioBERT-based model (named DL-PLUS) was selected for further testing as it was computationally more efficient. The model had high recall (>99%) and 60% to 77% specificity in a prospective evaluation conducted with blinded research associates, and saved >60% of the work required to identify high-quality articles. CONCLUSIONS Deep learning using pretrained language models and a large dataset of classified articles produced models with improved specificity while maintaining >99% recall. The resulting DL-PLUS model identifies high-quality, clinically relevant articles from PubMed at the time of publication. The model improves the efficiency of a literature surveillance program, which allows for faster dissemination of appraised research.
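The balanced-dataset construction mentioned in METHODS (randomly undersampling the majority class of the unbalanced corpus) might look like the sketch below; `undersample` is a hypothetical helper, not the authors' code:

```python
import random

def undersample(records, labels, seed=0):
    """One balanced training set: randomly downsample the majority class
    (here, the many articles that did not meet rigor criteria) to match
    the minority class size. Different seeds yield different balanced sets."""
    rng = random.Random(seed)
    pos = [r for r, y in zip(records, labels) if y == 1]
    neg = [r for r, y in zip(records, labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = rng.sample(majority, len(minority))
    data = [(r, 1) for r in (pos if minority is pos else kept)] + \
           [(r, 0) for r in (neg if minority is neg else kept)]
    rng.shuffle(data)
    return data
```

Repeating this with four seeds gives the four balanced datasets the abstract describes, each used to train and select one high-recall candidate model.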
Affiliation(s)
- Cynthia Lokker
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.
- Elham Bagheri
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Wael Abdelkader
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Rick Parrish
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Muhammad Afzal
- Department of Computing, Birmingham City University, Birmingham, UK
- Tamara Navarro
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Chris Cotoi
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Federico Germini
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Lori Linkins
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- R Brian Haynes
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Lingyang Chu
- Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada
- Alfonso Iorio
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Department of Medicine, McMaster University, Hamilton, Ontario, Canada
4
Šuster S, Baldwin T, Lau JH, Jimeno Yepes A, Martinez Iraola D, Otmakhova Y, Verspoor K. Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study. J Med Internet Res 2023; 25:e35568. [PMID: 36722350 PMCID: PMC10131699 DOI: 10.2196/35568] [Received: 12/12/2021] [Revised: 01/18/2023] [Accepted: 01/31/2023] [Indexed: 02/01/2023]
Abstract
BACKGROUND Assessment of the quality of medical evidence available on the web is a critical step in the preparation of systematic reviews. Existing tools that automate parts of this task validate the quality of individual studies but not of entire bodies of evidence and focus on a restricted set of quality criteria. OBJECTIVE We proposed a quality assessment task that provides an overall quality rating for each body of evidence (BoE), as well as finer-grained justification for different quality criteria according to the Grading of Recommendation, Assessment, Development, and Evaluation formalization framework. For this purpose, we constructed a new data set and developed a machine learning baseline system (EvidenceGRADEr). METHODS We algorithmically extracted quality-related data from all summaries of findings found in the Cochrane Database of Systematic Reviews. Each BoE was defined by a set of population, intervention, comparison, and outcome criteria and assigned a quality grade (high, moderate, low, or very low) together with quality criteria (justification) that influenced that decision. Different statistical data, metadata about the review, and parts of the review text were extracted as support for grading each BoE. After pruning the resulting data set with various quality checks, we used it to train several neural-model variants. The predictions were compared against the labels originally assigned by the authors of the systematic reviews. RESULTS Our quality assessment data set, Cochrane Database of Systematic Reviews Quality of Evidence, contains 13,440 instances, or BoEs labeled for quality, originating from 2252 systematic reviews published on the internet from 2002 to 2020. 
On the basis of a 10-fold cross-validation, the best neural binary classifiers for quality criteria detected risk of bias at 0.78 F1 (P=.68; R=0.92) and imprecision at 0.75 F1 (P=.66; R=0.86), while the performance on inconsistency, indirectness, and publication bias criteria was lower (F1 in the range of 0.3-0.4). The prediction of the overall quality grade into 1 of the 4 levels resulted in 0.5 F1. When casting the task as a binary problem by merging the Grading of Recommendation, Assessment, Development, and Evaluation classes (high+moderate vs low+very low-quality evidence), we attained 0.74 F1. We also found that the results varied depending on the supporting information that is provided as an input to the models. CONCLUSIONS Different factors affect the quality of evidence in the context of systematic reviews of medical evidence. Some of these (risk of bias and imprecision) can be automated with reasonable accuracy. Other quality dimensions such as indirectness, inconsistency, and publication bias prove more challenging for machine learning, largely because they are much rarer. This technology could substantially reduce reviewer workload in the future and expedite quality assessment as part of evidence synthesis.
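The reported binary F1 scores follow directly from the stated precision/recall pairs, and the "high+moderate vs low+very low" recasting is a simple label map. A small sketch (function and variable names are illustrative, not from the paper):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Collapse the four GRADE levels into the binary problem described above.
GRADE_TO_BINARY = {"high": 1, "moderate": 1, "low": 0, "very low": 0}

def merge_grades(grades):
    return [GRADE_TO_BINARY[g] for g in grades]

f1(0.68, 0.92)  # -> ~0.78, matching the reported risk-of-bias F1
f1(0.66, 0.86)  # -> ~0.75, matching the reported imprecision F1
```

The reconstruction confirms the paper's figures are internally consistent: both headline criterion scores are exactly the harmonic means of the quoted precision and recall values.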
Affiliation(s)
- Simon Šuster
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- Timothy Baldwin
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Jey Han Lau
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- Antonio Jimeno Yepes
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- School of Computing Technologies, RMIT University, Melbourne, Australia
- Yulia Otmakhova
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- Karin Verspoor
- School of Computing Technologies, RMIT University, Melbourne, Australia
5
Abdelkader W, Navarro T, Parrish R, Cotoi C, Germini F, Iorio A, Haynes RB, Lokker C. Machine Learning Approaches to Retrieve High-Quality, Clinically Relevant Evidence From the Biomedical Literature: Systematic Review. JMIR Med Inform 2021; 9:e30401. [PMID: 34499041 PMCID: PMC8461527 DOI: 10.2196/30401] [Received: 05/13/2021] [Revised: 07/15/2021] [Accepted: 07/25/2021] [Indexed: 11/20/2022]
Abstract
BACKGROUND The rapid growth of the biomedical literature makes identifying strong evidence a time-consuming task. Applying machine learning to the process could be a viable solution that limits effort while maintaining accuracy. OBJECTIVE The goal of the research was to summarize the nature and comparative performance of machine learning approaches that have been applied to retrieve high-quality evidence for clinical consideration from the biomedical literature. METHODS We conducted a systematic review of studies that applied machine learning techniques to identify high-quality clinical articles in the biomedical literature. Multiple databases were searched to July 2020. Extracted data focused on the applied machine learning model, steps in the development of the models, and model performance. RESULTS From 3918 retrieved studies, 10 met our inclusion criteria. All followed a supervised machine learning approach and applied, from a limited range of options, a high-quality standard for the training of their model. The results show that machine learning can achieve a sensitivity of 95% while maintaining a high precision of 86%. CONCLUSIONS Machine learning approaches perform well in retrieving high-quality clinical studies. Performance may improve by applying more sophisticated approaches such as active learning and unsupervised machine learning approaches.
Affiliation(s)
- Wael Abdelkader
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Tamara Navarro
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Rick Parrish
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Chris Cotoi
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Federico Germini
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- Alfonso Iorio
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- R Brian Haynes
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- Cynthia Lokker
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
6
Afzal M, Alam F, Malik KM, Malik GM. Clinical Context-Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation. J Med Internet Res 2020; 22:e19810. [PMID: 33095174 PMCID: PMC7647812 DOI: 10.2196/19810] [Received: 05/02/2020] [Accepted: 09/24/2020] [Indexed: 01/09/2023]
Abstract
Background Automatic text summarization (ATS) enables users to retrieve meaningful evidence from big data of biomedical repositories to make complex clinical decisions. Deep neural and recurrent networks outperform traditional machine-learning techniques in areas of natural language processing and computer vision; however, they are yet to be explored in the ATS domain, particularly for medical text summarization. Objective Traditional approaches in ATS for biomedical text suffer from fundamental issues such as an inability to capture clinical context, quality of evidence, and purpose-driven selection of passages for the summary. We aimed to circumvent these limitations through achieving precise, succinct, and coherent information extraction from credible published biomedical resources, and to construct a simplified summary containing the most informative content that can offer a review particular to clinical needs. Methods In our proposed approach, we introduce a novel framework, termed Biomed-Summarizer, that provides quality-aware Patient/Problem, Intervention, Comparison, and Outcome (PICO)-based intelligent and context-enabled summarization of biomedical text. Biomed-Summarizer integrates the prognosis quality recognition model with a clinical context–aware model to locate text sequences in the body of a biomedical article for use in the final summary. First, we developed a deep neural network binary classifier for quality recognition to acquire scientifically sound studies and filter out others. Second, we developed a bidirectional long-short term memory recurrent neural network as a clinical context–aware classifier, which was trained on semantically enriched features generated using a word-embedding tokenizer for identification of meaningful sentences representing PICO text sequences. 
Third, we calculated the similarity between query and PICO text sequences using Jaccard similarity with semantic enrichments, where the semantic enrichments are obtained using medical ontologies. Last, we generated a representative summary from the high-scoring PICO sequences aggregated by study type, publication credibility, and freshness score. Results Evaluation of the prognosis quality recognition model using a large dataset of biomedical literature related to intracranial aneurysm showed an accuracy of 95.41% (2562/2686) in terms of recognizing quality articles. The clinical context–aware multiclass classifier outperformed the traditional machine-learning algorithms, including support vector machine, gradient boosted tree, linear regression, K-nearest neighbor, and naïve Bayes, by achieving 93% (16127/17341) accuracy for classifying five categories: aim, population, intervention, results, and outcome. The semantic similarity algorithm achieved a significant Pearson correlation coefficient of 0.61 (0-1 scale) on a well-known BIOSSES dataset (with 100 pair sentences) after semantic enrichment, representing an improvement of 8.9% over baseline Jaccard similarity. Finally, we found a highly positive correlation among the evaluations performed by three domain experts concerning different metrics, suggesting that the automated summarization is satisfactory. Conclusions By employing the proposed method Biomed-Summarizer, high accuracy in ATS was achieved, enabling seamless curation of research evidence from the biomedical literature to use for clinical decision-making.
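The semantically enriched Jaccard step described above can be sketched as follows, with a toy synonym map standing in for the medical ontologies used by Biomed-Summarizer; all names and the example terms are illustrative:

```python
def enrich(tokens, synonyms):
    """Expand a token set with ontology synonyms before comparison."""
    out = set(tokens)
    for t in tokens:
        out |= synonyms.get(t, set())
    return out

def jaccard(a, b):
    """Plain Jaccard similarity of two sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def semantic_jaccard(query, sentence, synonyms):
    return jaccard(enrich(set(query), synonyms), enrich(set(sentence), synonyms))

# Toy ontology: "mi" expands to its spelled-out form.
synonyms = {"mi": {"myocardial", "infarction"}}
query = ["mi", "risk"]
sentence = ["myocardial", "infarction", "risk"]
plain = jaccard(set(query), set(sentence))              # -> 0.25
enriched = semantic_jaccard(query, sentence, synonyms)  # -> 0.75
```

The enrichment lets surface-mismatched but semantically equivalent terms overlap, which is the mechanism behind the reported 8.9% improvement over baseline Jaccard on BIOSSES.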
Affiliation(s)
- Muhammad Afzal
- Department of Software, Sejong University, Seoul, Republic of Korea
- Department of Computer Science & Engineering, School of Engineering and Computer Science, Oakland University, Rochester, MI, United States
- Fakhare Alam
- Department of Computer Science & Engineering, School of Engineering and Computer Science, Oakland University, Rochester, MI, United States
- Khalid Mahmood Malik
- Department of Computer Science & Engineering, School of Engineering and Computer Science, Oakland University, Rochester, MI, United States
- Ghaus M Malik
- Department of Neurosurgery, Henry Ford Hospital, Detroit, MI, United States
7
Deep Learning Based Biomedical Literature Classification Using Criteria of Scientific Rigor. Electronics 2020. [DOI: 10.3390/electronics9081253] [Indexed: 12/15/2022]
Abstract
A major obstacle to supporting evidence-based clinical decision-making is accurately and efficiently recognizing appropriate and scientifically rigorous studies in the biomedical literature. We trained a multi-layer perceptron (MLP) model on a dataset with two textual features, title and abstract. The dataset, consisting of 7958 PubMed citations classified into two classes (scientific rigor and non-rigor), was used to train the proposed model. We compared our model with other promising machine learning models such as Support Vector Machine (SVM), Decision Tree, Random Forest, and Gradient Boosted Tree (GBT) approaches. Based on the higher cumulative score, deep learning was chosen and was tested on test datasets obtained by running a set of domain-specific queries. On the training dataset, the proposed deep learning model obtained significantly higher accuracy (97.3%) and AUC (0.993) than the competitors, but slightly lower recall (95.1%) compared with GBT. The trained model sustained its performance on the testing datasets. Unlike previous approaches, the proposed model does not require a human expert to create fresh annotated data; instead, we used studies cited in Cochrane reviews as a surrogate for quality studies in a clinical topic. We conclude that deep learning methods are beneficial for biomedical literature classification. Not only do such methods minimize the workload in feature engineering, but they also show better performance on large and noisy data.
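The AUC reported here (0.993) can be understood via the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen rigorous article is scored above a randomly chosen non-rigorous one. A minimal sketch, not the evaluation code used in the paper:

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly;
    ties count half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])  # -> 0.75 (3 of 4 pairs ordered correctly)
```

This quadratic-pair version is fine for illustration; production implementations sort once and use rank sums for O(n log n) cost.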
8
Afzal M, Hussain M, Malik KM, Lee S. Impact of Automatic Query Generation and Quality Recognition Using Deep Learning to Curate Evidence From Biomedical Literature: Empirical Study. JMIR Med Inform 2019; 7:e13430. [PMID: 31815673 PMCID: PMC6928703 DOI: 10.2196/13430] [Received: 01/17/2019] [Revised: 08/07/2019] [Accepted: 09/26/2019] [Indexed: 11/13/2022]
Abstract
BACKGROUND The quality of health care is continuously improving and is expected to improve further because of the advancement of machine learning and knowledge-based techniques, along with the innovation and availability of wearable sensors. With these advancements, health care professionals are becoming more interested in seeking scientific research evidence from external sources for decision making relevant to medical diagnosis, treatment, and prognosis. However, little work has been done to develop methods for unobtrusive and seamless curation of data from the biomedical literature. OBJECTIVE This study aimed to design a framework that intelligently brings quality publications to users, assisting medical practitioners in answering clinical questions and fulfilling their informational needs. METHODS The proposed framework consists of methods for efficient biomedical literature curation, including the automatic construction of a well-built question, the recognition of evidence quality via a proposed extended quality recognition model (E-QRM), and the ranking and summarization of the extracted evidence. RESULTS Unlike previous works, the proposed framework systematically integrates the echelons of biomedical literature curation by including methods for searching queries, content quality assessment, and ranking and summarization. Using an ensemble approach, our high-impact classifier E-QRM obtained significantly higher accuracy than the existing quality recognition model (1723/1894, 90.97% vs 1462/1894, 77.21%). CONCLUSIONS Our proposed methods and evaluation demonstrate the validity and rigorousness of the results, which can be used in different applications, including evidence-based medicine, precision medicine, and medical education.
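E-QRM is described as an ensemble, but this abstract does not specify its combination rule, so the sketch below shows plain majority voting over binary base classifiers as one common option; the actual E-QRM scheme may differ:

```python
def majority_vote(classifiers, x):
    """Combine binary base classifiers (callables returning 0 or 1)
    by simple majority vote over their predictions for input x."""
    votes = sum(clf(x) for clf in classifiers)
    return int(2 * votes > len(classifiers))

# Toy base classifiers standing in for trained quality-recognition models.
clfs = [lambda x: 1, lambda x: 1, lambda x: 0]
majority_vote(clfs, "some abstract text")  # -> 1
```

Ensembling helps when base models make partly independent errors, which is consistent with the accuracy gain reported over the single existing quality recognition model.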
Affiliation(s)
- Muhammad Afzal
- Department of Software, Sejong University, Seoul, Republic of Korea
- Department of Computer Science and Engineering, Oakland University, Rochester, MI, United States
- Maqbool Hussain
- Department of Software, Sejong University, Seoul, Republic of Korea
- Khalid Mahmood Malik
- Department of Computer Science and Engineering, Oakland University, Rochester, MI, United States
- Sungyoung Lee
- Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea