1
Ghofrani A, Taherdoost H. Biomedical data analytics for better patient outcomes. Drug Discov Today 2024; 30:104280. [PMID: 39732322 DOI: 10.1016/j.drudis.2024.104280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 12/16/2024] [Accepted: 12/20/2024] [Indexed: 12/30/2024]
Abstract
Medical professionals today have access to immense amounts of data, which enables them to make decisions that enhance patient care and treatment efficacy. This innovative strategy can improve global health care by bridging the divide between clinical practice and medical research. This paper reviews biomedical developments aimed at improving patient outcomes by addressing three main questions regarding techniques, data sources and challenges. The review includes peer-reviewed articles from 2018 to 2023, found via systematic searches in PubMed, Scopus and Google Scholar. The results show diverse disease-specific applications. Challenges such as data quality and ethics are discussed, underscoring data analytics' potential for patient-focused health care. The review concludes that successful implementation requires addressing gaps, collaboration and innovation in biomedical science and data analytics.
Affiliation(s)
- Hamed Taherdoost
- Hamta Business Corporation, Vancouver, Canada; University Canada West, Vancouver, Canada; Westcliff University, Irvine, USA; GUS Institute | Global University Systems, London, UK.
2
Araujo Gomes GJ, Beltrão FEDL, Fragoso WD, Lemos SG. Discrimination between Covid-19 positive and negative blood serum based on excitation-emission matrix fluorescence spectroscopy and chemometrics. Talanta 2024; 280:126788. [PMID: 39216418 DOI: 10.1016/j.talanta.2024.126788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 08/26/2024] [Accepted: 08/27/2024] [Indexed: 09/04/2024]
Abstract
The outbreak of the disease caused by the SARS-CoV-2 virus (Covid-19) has resulted in a global health emergency that has caused millions of deaths in recent years. The control of the pandemic was significantly impacted by the availability of inputs and qualified labor to correctly diagnose the population. The challenges faced by numerous countries in conducting this extensive diagnosis, utilizing methods such as RT-PCR, emphasize the necessity for alternative testing strategies that are less reliant on expensive raw materials and can be implemented on a larger scale. This paper proposes a methodology for classifying blood serum samples as either positive or negative for Covid-19 infection using excitation-emission matrix (EEM) fluorescence spectroscopy associated with multivariate analysis. The proposed methodology uses EEM spectra of samples diagnosed by the reference method (RT-PCR) to train and validate classification models. Two approaches were tested: the first using PARAFAC and the second by unfolding the excitation-emission matrices. The DD-SIMCA model performed best in the PARAFAC approach, with an error rate of 0.05, sensitivity of 0.98 and specificity of 0.96. The PLS-DA and PCA-DA models in the second approach effectively distinguished between classes. The PCA-DA model performed the best with an error rate of 0.06 and sensitivity and specificity of 0.94. Fluorescence spectroscopy was found to be effective in analyzing serum samples and obtaining discrimination models to determine if a patient is infected with SARS-CoV-2. The findings are encouraging and could aid in the development of an inexpensive and reliable auxiliary diagnostic method.
Affiliation(s)
- Glaucio Jefferson Araujo Gomes
- Advanced Analytical Chemistry Research Group, Department of Chemistry, Federal University of Paraíba, C.P. 5093, 58051-970, João Pessoa, PB, Brazil
- Wallace Duarte Fragoso
- Advanced Analytical Chemistry Research Group, Department of Chemistry, Federal University of Paraíba, C.P. 5093, 58051-970, João Pessoa, PB, Brazil
- Sherlan Guimarães Lemos
- Advanced Analytical Chemistry Research Group, Department of Chemistry, Federal University of Paraíba, C.P. 5093, 58051-970, João Pessoa, PB, Brazil.
3
Rayan RA, Suruliandi A, Raja SP. Modified mutual information feature selection algorithm to predict COVID-19 using clinical data. Comput Methods Biomech Biomed Engin 2024:1-21. [PMID: 39568329 DOI: 10.1080/10255842.2024.2429012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 05/11/2024] [Accepted: 11/08/2024] [Indexed: 11/22/2024]
Abstract
The COVID-19 pandemic has profoundly impacted health, emphasizing the need for timely disease detection. Blood tests have become key diagnostic tools due to the virus's effects on blood composition. Accurate COVID-19 prediction through machine learning requires selecting relevant features, as irrelevant features can lower classification accuracy. This study proposes Modified Mutual Information (MMI) for feature selection, ranking features by relevance and using backtracking to find the optimal subset. Support Vector Machines (SVM) are then used for classification. Results show that MMI with SVM achieves 95% accuracy, outperforming other methods, and demonstrates strong generalizability on various benchmark datasets.
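To make the idea of ranking features by relevance concrete, here is a minimal sketch of ordinary mutual-information ranking for discrete features. It is not the authors' Modified Mutual Information algorithm, it omits the backtracking search and the SVM classifier, and the feature names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(columns, labels):
    """Return feature names sorted by decreasing MI with the labels, plus the scores."""
    scores = {name: mutual_information(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical toy data: binarized blood-test flags and a binary test result.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "marker_a": [1, 1, 1, 1, 0, 0, 0, 0],  # identical to the label
    "marker_b": [1, 1, 1, 0, 1, 0, 0, 0],  # partly informative
    "marker_c": [1, 0, 1, 0, 1, 0, 1, 0],  # statistically independent of the label
}
ranking, scores = rank_features(columns, labels)
```

In this toy case the feature that matches the label carries one full bit of information and the independent feature carries none, so a subset search would consider them in the order `marker_a`, `marker_b`, `marker_c`.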
Affiliation(s)
- R Ame Rayan
- Department of Computer Science and Engineering, Manonmaniam Sundaranar University, Tirunelveli, India
- A Suruliandi
- Department of Computer Science and Engineering, Manonmaniam Sundaranar University, Tirunelveli, India
- S P Raja
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
4
Tutsoy O, Koç GG. Deep self-supervised machine learning algorithms with a novel feature elimination and selection approaches for blood test-based multi-dimensional health risks classification. BMC Bioinformatics 2024; 25:103. [PMID: 38459463 PMCID: PMC10921629 DOI: 10.1186/s12859-024-05729-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Accepted: 03/04/2024] [Indexed: 03/10/2024] Open
Abstract
BACKGROUND Blood tests are extensively performed for screening, diagnosis and surveillance purposes. Although it is possible to automatically evaluate raw blood test data with advanced deep self-supervised machine learning approaches, this has not yet been thoroughly investigated or implemented. RESULTS This paper proposes deep machine learning algorithms with multi-dimensional adaptive feature elimination, self-feature weighting and novel feature selection approaches. To classify the health risks based on the data processed by the deep layers, four machine learning algorithms with properties ranging from entirely model-free to gradient-driven are modified. CONCLUSIONS The results show that the proposed deep machine learning algorithms can remove unnecessary features, assign self-importance weights, select the most informative features and classify the health risks automatically from worst-case low to worst-case high values.
Affiliation(s)
- Onder Tutsoy
- Adana Alparslan Turkes Science and Technology University, Adana, Turkey.
- Gizem Gul Koç
- Adana Alparslan Turkes Science and Technology University, Adana, Turkey
5
Sagar D, Dwivedi T, Gupta A, Aggarwal P, Bhatnagar S, Mohan A, Kaur P, Gupta R. Clinical Features Predicting COVID-19 Severity Risk at the Time of Hospitalization. Cureus 2024; 16:e57336. [PMID: 38690475 PMCID: PMC11059179 DOI: 10.7759/cureus.57336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/31/2024] [Indexed: 05/02/2024] Open
Abstract
The global spread of COVID-19 has led to significant mortality and morbidity worldwide. Early identification of COVID-19 patients who are at high risk of developing severe disease can help in improved patient management, care, and treatment, as well as in the effective allocation of hospital resources. The severity prediction at the time of hospitalization can be extremely helpful in deciding the treatment of COVID-19 patients. To this end, this study presents an interpretable artificial intelligence (AI) model, named COVID-19 severity predictor (CoSP), that predicts COVID-19 severity using the clinical features at the time of hospital admission. We utilized a dataset comprising 64 demographic and laboratory features of 7,416 confirmed COVID-19 patients that were collected at the time of hospital admission. The proposed hierarchical CoSP model performs four-class COVID severity risk prediction into asymptomatic, mild, moderate, and severe categories. CoSP yielded better performance with good interpretability, as observed via Shapley analysis on COVID severity prediction compared to the other popular ML methods, with an area under the receiver operating characteristic curve (AUC-ROC) of 0.95, an area under the precision-recall curve (AUPRC) of 0.91, and a weighted F1-score of 0.83. Out of 64 initial features, 19 features were inferred as predictive of the severity of COVID-19 disease by the CoSP model. Therefore, an AI model predicting COVID-19 severity may be helpful for early intervention, optimizing resource allocation, and guiding personalized treatments, potentially enabling healthcare professionals to save lives and allocate resources effectively in the fight against the pandemic.
Affiliation(s)
- Dikshant Sagar
- Computer Science, Indraprastha Institute of Information Technology - Delhi, Delhi, IND
- Computer Science, California State University, Los Angeles, Los Angeles, USA
- Tanima Dwivedi
- Oncology, Dr. B.R.A Institute-Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, IND
- Anubha Gupta
- Centre of Excellence in Healthcare, Indraprastha Institute of Information Technology - Delhi, Delhi, IND
- Priya Aggarwal
- Electronics and Communication Engineering, Indraprastha Institute of Information Technology - Delhi, Delhi, IND
- Sushma Bhatnagar
- Onco-Anaesthesia and Palliative Medicine, Dr. B.R.A Institute-Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, IND
- Anant Mohan
- Pulmonary, Critical Care and Sleep Medicine, All India Institute of Medical Sciences, New Delhi, IND
- Punit Kaur
- Biophysics, All India Institute of Medical Sciences, New Delhi, IND
- Ritu Gupta
- Oncology, Dr. B.R.A Institute-Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, IND
6
Farahat IS, Sharafeldeen A, Ghazal M, Alghamdi NS, Mahmoud A, Connelly J, van Bogaert E, Zia H, Tahtouh T, Aladrousy W, Tolba AE, Elmougy S, El-Baz A. An AI-based novel system for predicting respiratory support in COVID-19 patients through CT imaging analysis. Sci Rep 2024; 14:851. [PMID: 38191606 PMCID: PMC10774502 DOI: 10.1038/s41598-023-51053-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 12/29/2023] [Indexed: 01/10/2024] Open
Abstract
The proposed AI-based diagnostic system aims to predict the respiratory support required for COVID-19 patients by analyzing the correlation between COVID-19 lesions and the level of respiratory support provided to the patients. Computed tomography (CT) imaging will be used to analyze the three levels of respiratory support received by the patient: Level 0 (minimum support), Level 1 (non-invasive support such as soft oxygen), and Level 2 (invasive support such as mechanical ventilation). The system will begin by segmenting the COVID-19 lesions from the CT images and creating an appearance model for each lesion using a 2D, rotation-invariant, Markov-Gibbs random field (MGRF) model. Three MGRF-based models will be created, one for each level of respiratory support. This suggests that the system will be able to differentiate between different levels of severity in COVID-19 patients. The system will decide for each patient using a neural network-based fusion system, which combines the estimates of the Gibbs energy from the three MGRF-based models. The proposed system was assessed using 307 COVID-19-infected patients, achieving an accuracy of [Formula: see text], a sensitivity of [Formula: see text], and a specificity of [Formula: see text], indicating a high level of prediction accuracy.
Affiliation(s)
- Ibrahim Shawky Farahat
- Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
- Mohammed Ghazal
- Electrical, Computer and Biomedical Engineering Department, Abu Dhabi University, Abu Dhabi, UAE
- Norah Saleh Alghamdi
- Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia
- Ali Mahmoud
- Department of Bioengineering, University of Louisville, Louisville, USA
- James Connelly
- Department of Radiology, University of Louisville, Louisville, USA
- Eric van Bogaert
- Department of Radiology, University of Louisville, Louisville, USA
- Huma Zia
- Electrical, Computer and Biomedical Engineering Department, Abu Dhabi University, Abu Dhabi, UAE
- Tania Tahtouh
- College of Health Sciences, Abu Dhabi University, Abu Dhabi, UAE
- Waleed Aladrousy
- Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
- Ahmed Elsaid Tolba
- Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
- The Higher Institute of Engineering and Automotive Technology and Energy, Kafr El Sheikh, Egypt
- Samir Elmougy
- Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
- Ayman El-Baz
- Department of Bioengineering, University of Louisville, Louisville, USA.
7
Tehrani SSM, Zarvani M, Amiri P, Ghods Z, Raoufi M, Safavi-Naini SAA, Soheili A, Gharib M, Abbasi H. Visual transformer and deep CNN prediction of high-risk COVID-19 infected patients using fusion of CT images and clinical data. BMC Med Inform Decis Mak 2023; 23:265. [PMID: 37978393 PMCID: PMC10656999 DOI: 10.1186/s12911-023-02344-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Accepted: 10/16/2023] [Indexed: 11/19/2023] Open
Abstract
BACKGROUND Despite the globally reducing hospitalization rates and the much lower risks of Covid-19 mortality, accurate diagnosis of the infection stage and prediction of outcomes are clinically of interest. Advanced current technology can facilitate automating the process and help identify those who are at higher risk of developing severe illness. This work explores and presents deep-learning-based schemes for predicting clinical outcomes in Covid-19 infected patients, using Visual Transformer and Convolutional Neural Networks (CNNs), fed with 3D data fusion of CT scan images and patients' clinical data. METHODS We report on the efficiency of Video Swin Transformers and several CNN models fed with fusion datasets and CT scans only vs. a set of conventional classifiers fed with patients' clinical data only. A relatively large clinical dataset from 380 Covid-19 diagnosed patients was used to train/test the models. RESULTS Results show that the 3D Video Swin Transformers fed with the fusion datasets of 64 sectional CT scans + 67 clinical labels outperformed all other approaches for predicting outcomes in Covid-19-infected patients (i.e., TPR = 0.95, FPR = 0.40, F0.5 score = 0.82, AUC = 0.77, Kappa = 0.6). CONCLUSIONS We demonstrate how our proposed novel 3D data fusion approach, which concatenates CT scan images with patients' clinical data, can remarkably improve the performance of the models in predicting Covid-19 infection outcomes. SIGNIFICANCE Findings indicate possibilities of predicting the severity of outcome using patients' CT images and clinical data collected at the time of admission to hospital.
Affiliation(s)
- Maral Zarvani
- Faculty of Engineering, Alzahra University, Tehran, Iran
- Paria Amiri
- University of Erlangen-Nuremberg, Bavaria, Germany
- Zahra Ghods
- Faculty of Engineering, Alzahra University, Tehran, Iran
- Masoomeh Raoufi
- Department of Radiology, School of Medicine, Imam Hossein Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Seyed Amir Ahmad Safavi-Naini
- Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Amirali Soheili
- School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hamid Abbasi
- Auckland Bioengineering Institute, University of Auckland, Auckland, 1010, New Zealand.
8
Pisano F, Cannas B, Fanni A, Pasella M, Canetto B, Giglio SR, Mocci S, Chessa L, Perra A, Littera R. Decision trees for early prediction of inadequate immune response to coronavirus infections: a pilot study on COVID-19. Front Med (Lausanne) 2023; 10:1230733. [PMID: 37601789 PMCID: PMC10433226 DOI: 10.3389/fmed.2023.1230733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 07/19/2023] [Indexed: 08/22/2023] Open
Abstract
Introduction Few artificial intelligence models exist to predict severe forms of COVID-19. Most rely on post-infection laboratory data, hindering early treatment for high-risk individuals. Methods This study developed a machine learning model to predict inherent risk of severe symptoms after contracting SARS-CoV-2. Using a Decision Tree trained on 153 Alpha variant patients, demographic, clinical and immunogenetic markers were considered. Model performance was assessed on Alpha and Delta variant datasets. Key risk factors included age, gender, absence of KIR2DS2 gene (alone or with HLA-C C1 group alleles), presence of 14-bp polymorphism in HLA-G gene, presence of KIR2DS5 gene, and presence of KIR telomeric region A/A. Results The model achieved 83.01% accuracy for Alpha variant and 78.57% for Delta variant, with True Positive Rates of 80.82% and 77.78%, and True Negative Rates of 85.00% and 79.17%, respectively. The model showed high sensitivity in identifying individuals at risk. Discussion The present study demonstrates the potential of AI algorithms, combined with demographic, epidemiologic, and immunogenetic data, in identifying individuals at high risk of severe COVID-19 and facilitating early treatment. Further studies are required for routine clinical integration.
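The relationship between accuracy, True Positive Rate and True Negative Rate can be illustrated with a short sketch. The abstract does not report a confusion matrix, so the counts below are hypothetical: they were chosen only so that they sum to 153 patients and reproduce the rates reported for the Alpha variant.

```python
def rates(tp, fn, tn, fp):
    """Accuracy, true positive rate and true negative rate from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "tpr": tp / (tp + fn),  # sensitivity: share of actual positives found
        "tnr": tn / (tn + fp),  # specificity: share of actual negatives found
    }


# Hypothetical counts (not taken from the paper): 73 severe and 80 non-severe cases.
alpha = rates(tp=59, fn=14, tn=68, fp=12)
alpha_percent = {name: round(value * 100, 2) for name, value in alpha.items()}
```

With these assumed counts the three percentages come out as 83.01, 80.82 and 85.0, which shows how a single set of counts determines all three reported figures.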
Affiliation(s)
- Fabio Pisano
- Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
- Barbara Cannas
- Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
- Alessandra Fanni
- Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
- Manuela Pasella
- Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
- Sabrina Rita Giglio
- Medical Genetics, Department of Medical Sciences and Public Health, University of Cagliari, Cagliari, Italy
- AART-ODV (Association for the Advancement of Research on Transplantation), Cagliari, Italy
- Medical Genetics, R. Binaghi Hospital, Local Public Health and Social Care Unit (ASSL) of Cagliari, Cagliari, Italy
- Centre for Research University Services (CeSAR, Centro Servizi di Ateneo per la Ricerca), University of Cagliari, Cagliari, Monserrato, Italy
- Stefano Mocci
- Medical Genetics, Department of Medical Sciences and Public Health, University of Cagliari, Cagliari, Italy
- Centre for Research University Services (CeSAR, Centro Servizi di Ateneo per la Ricerca), University of Cagliari, Cagliari, Monserrato, Italy
- Luchino Chessa
- AART-ODV (Association for the Advancement of Research on Transplantation), Cagliari, Italy
- Department of Medical Sciences and Public Health, University of Cagliari, Cagliari, Italy
- Liver Unit, Department of Internal Medicine, University Hospital of Cagliari, Cagliari, Italy
- Andrea Perra
- AART-ODV (Association for the Advancement of Research on Transplantation), Cagliari, Italy
- Unit of Oncology and Molecular Pathology, Department of Biomedical Sciences, University of Cagliari, Cagliari, Italy
- Roberto Littera
- AART-ODV (Association for the Advancement of Research on Transplantation), Cagliari, Italy
- Medical Genetics, R. Binaghi Hospital, Local Public Health and Social Care Unit (ASSL) of Cagliari, Cagliari, Italy
9
Morís DI, de Moura J, Marcos PJ, Rey EM, Novo J, Ortega M. Comprehensive analysis of clinical data for COVID-19 outcome estimation with machine learning models. Biomed Signal Process Control 2023; 84:104818. [PMID: 36915863 PMCID: PMC9995330 DOI: 10.1016/j.bspc.2023.104818] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/26/2022] [Revised: 11/22/2022] [Accepted: 03/05/2023] [Indexed: 03/11/2023]
Abstract
COVID-19 is a global threat to healthcare systems due to the rapid spread of the pathogen that causes it. In such a situation, clinicians must make important decisions in an environment where medical resources can be insufficient. Computer-aided diagnosis systems can be very useful here, not only to support clinical decisions but also to perform relevant analyses that allow clinicians to better understand the disease and the factors that identify high-risk patients. For those purposes, in this work, we use several machine learning algorithms to estimate the outcome of COVID-19 patients given their clinical information. In particular, we perform two different studies: the first estimates whether the patient is at low or high risk of death, whereas the second estimates whether the patient needs hospitalization. The results of these analyses show the most relevant features for each studied scenario, as well as the classification performance of the considered machine learning models. In particular, the XGBoost algorithm is able to estimate the need for hospitalization of a patient with an AUC-ROC of 0.8415 ± 0.0217, while it can also estimate the risk of death with an AUC-ROC of 0.7992 ± 0.0104. The results demonstrate the great potential of the proposal to identify those patients who, being at higher risk, need a greater amount of medical resources. This provides healthcare services with a tool to better manage their resources.
Affiliation(s)
- Daniel I Morís
- Centro de Investigación CITIC, Universidade da Coruña, Campus de Elviña, s/n, 15071 A Coruña, Spain; Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, Xubias de Arriba, 84, 15006 A Coruña, Spain
- Joaquim de Moura
- Centro de Investigación CITIC, Universidade da Coruña, Campus de Elviña, s/n, 15071 A Coruña, Spain; Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, Xubias de Arriba, 84, 15006 A Coruña, Spain
- Pedro J Marcos
- Dirección Asistencial y Servicio de Neumología, Complejo Hospitalario Universitario de A Coruña (CHUAC), Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, Sergas, 15006 A Coruña, Spain
- Enrique Míguez Rey
- Grupo de Investigación en Virología Clínica, Sección de Enfermedades Infecciosas, Servicio de Medicina Interna, Instituto de Investigación Biomédica de A Coruña (INIBIC), Área Sanitaria A Coruña y CEE (ASCC), SERGAS, 15006 A Coruña, Spain
- Jorge Novo
- Centro de Investigación CITIC, Universidade da Coruña, Campus de Elviña, s/n, 15071 A Coruña, Spain; Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, Xubias de Arriba, 84, 15006 A Coruña, Spain
- Marcos Ortega
- Centro de Investigación CITIC, Universidade da Coruña, Campus de Elviña, s/n, 15071 A Coruña, Spain; Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, Xubias de Arriba, 84, 15006 A Coruña, Spain
10
A machine learning and explainable artificial intelligence triage-prediction system for COVID-19. DECISION ANALYTICS JOURNAL 2023; 7:100246. [PMCID: PMC10163946 DOI: 10.1016/j.dajour.2023.100246] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/05/2023] [Revised: 04/21/2023] [Accepted: 05/02/2023] [Indexed: 06/02/2024]
Abstract
COVID-19, a respiratory disease caused by SARS-CoV-2, severely disrupted the healthcare infrastructure. Various countries have developed COVID-19 vaccines that have effectively prevented the severe symptoms caused by the virus to a certain extent. However, a small section of people continues to perish. Artificial intelligence advances have revolutionized healthcare diagnosis and prognosis infrastructure. In this study, we predict the severity of COVID-19 using heterogeneous Machine Learning and Deep Learning algorithms by considering clinical markers, vital signs, and other critical factors. This study extensively reviews various classifier architectures to predict the COVID-19 severity. We built and evaluated multiple pipelines entailing combinations of five state-of-the-art data-balancing techniques (Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic, Borderline SMOTE, SMOTE with Tomek links, and SMOTE with Edited Nearest Neighbor (ENN)) and twelve heterogeneous classifiers such as Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors, Naïve Bayes, Xgboost, Extratrees, Adaboost, Light GBM, Catboost, and 1-D Convolution Neural Network. The best-performing pipeline consists of Random Forest trained on Borderline SMOTE balanced data that produced the highest recall of 83%. We deployed Explainable Artificial Intelligence tools such as Shapley Additive Explanations and Local Interpretable Model-agnostic Explanations, ELI5, Qlattice, Anchor, and Feature Importance to demystify complex tree-based ensemble models. These tools provide valuable insights into the significance of critical features in the severity prediction of a COVID-19 patient. It was observed that changes in respiratory rate, blood pressure, lactate, and calcium values were the primary contributors to the increase in severity of a COVID-19 patient. This architecture aims to be an explainable decision-support triaging system for medical professionals in countries lacking advanced medical technology and infrastructure to reduce fatalities.
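The data-balancing step can be pictured with a minimal sketch of the basic SMOTE interpolation idea: a synthetic minority sample is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. This is a simplified illustration, not the pipeline used in the study; Borderline SMOTE additionally restricts oversampling to minority samples near the class boundary, which is omitted here. The function name, parameters and data points are hypothetical.

```python
import random


def smote_like(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Sort the remaining minority samples by squared distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # uniform in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature minority class (for example, two scaled lab values).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like(minority)
```

Because each synthetic point lies between two existing minority samples, the new points stay inside the region already occupied by the minority class rather than duplicating records exactly.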
11
Kessler R, Philipp J, Wilfer J, Kostev K. Predictive Attributes for Developing Long COVID-A Study Using Machine Learning and Real-World Data from Primary Care Physicians in Germany. J Clin Med 2023; 12:jcm12103511. [PMID: 37240616 DOI: 10.3390/jcm12103511] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/15/2023] [Revised: 04/25/2023] [Accepted: 05/15/2023] [Indexed: 05/28/2023] Open
Abstract
(1) In the present study, we used data comprising patient medical histories from a panel of primary care practices in Germany to predict post-COVID-19 conditions in patients after COVID-19 diagnosis and to evaluate the relevant factors associated with these conditions using machine learning methods. (2) Methods: Data retrieved from the IQVIA™ Disease Analyzer database were used. Patients with at least one COVID-19 diagnosis between January 2020 and July 2022 were selected for inclusion in the study. Age, sex, and the complete history of diagnoses and prescription data before COVID-19 infection at the respective primary care practice were extracted for each patient. A gradient boosting classifier (LGBM) was deployed. The prepared design matrix was randomly divided into train (80%) and test data (20%). After optimizing the hyperparameters of the LGBM classifier by maximizing the F2 score, model performance was evaluated using several test metrics. We calculated SHAP values to evaluate the importance of the individual features, but more importantly, to evaluate the direction of influence of each feature in our dataset, i.e., whether it is positively or negatively associated with a diagnosis of long COVID. (3) Results: In both the train and test data sets, the model showed a high recall (sensitivity) of 81% and 72% and a high specificity of 80% and 80%; this was offset, however, by a moderate precision of 8% and 7% and an F2-score of 0.28 and 0.25. The most common predictive features identified using SHAP included COVID-19 variant, physician practice, age, distinct number of diagnoses and therapies, sick days ratio, sex, vaccination rate, somatoform disorders, migraine, back pain, asthma, malaise and fatigue, as well as cough preparations. (4) Conclusions: The present exploratory study describes an initial investigation of the prediction of potential features increasing the risk of developing long COVID after COVID-19 infection by using the patient history from electronic medical records before COVID-19 infection in primary care practices in Germany using machine learning. Notably, we identified several predictive features for the development of long COVID in patient demographics and their medical histories.
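The F2 score mentioned in the abstract is the F-beta score with beta = 2, which weights recall more heavily than precision. As a rough worked check, the sketch below plugs the rounded precision and recall values from the abstract into the standard F-beta formula; the function name is illustrative, and because the inputs are rounded percentages the results only approximate the reported scores.

```python
def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values as reported in the abstract (train, then test).
train_f2 = f_beta(precision=0.08, recall=0.81)
test_f2 = f_beta(precision=0.07, recall=0.72)
```

The two values come out at roughly 0.29 and 0.25, close to the reported 0.28 and 0.25, which illustrates how a very low precision keeps the F2 score small even when recall is high.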
Affiliation(s)
- Roman Kessler
- Max Planck Institute for Human Cognitive and Brain Sciences, 04103 Leipzig, Germany
12
Abbasi Habashi S, Koyuncu M, Alizadehsani R. A Survey of COVID-19 Diagnosis Using Routine Blood Tests with the Aid of Artificial Intelligence Techniques. Diagnostics (Basel) 2023; 13:1749. [PMID: 37238232 PMCID: PMC10217633 DOI: 10.3390/diagnostics13101749] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/30/2023] [Revised: 04/19/2023] [Accepted: 04/29/2023] [Indexed: 05/28/2023] Open
Abstract
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), which causes the disease COVID-19, has considerably affected the global economy and healthcare systems. The virus is traditionally diagnosed with the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test. However, RT-PCR often yields false-negative and otherwise incorrect results. Recent work indicates that COVID-19 can also be diagnosed using imaging modalities such as CT scans and X-rays, as well as blood tests. Nevertheless, X-rays and CT scans cannot always be used for patient screening because of high costs, radiation doses, and an insufficient number of devices. A less expensive and faster diagnostic model is therefore needed to distinguish positive from negative COVID-19 cases. Blood tests are easily performed and cost less than RT-PCR and imaging tests. Since biochemical parameters in routine blood tests vary during COVID-19 infection, they may give physicians precise information for diagnosing COVID-19. This study reviewed newly emerging artificial intelligence (AI)-based methods for diagnosing COVID-19 from routine blood tests. We gathered information about research resources and inspected 92 articles carefully chosen from a variety of publishers, such as IEEE, Springer, Elsevier, and MDPI. These 92 studies are then classified into two tables, covering articles that use machine learning and deep learning models to diagnose COVID-19 from routine blood test datasets. In these studies, Random Forest and logistic regression are the most widely used machine learning methods, and the most widely used performance metrics are accuracy, sensitivity, specificity, and AUC. Finally, we conclude by discussing and analyzing these studies. This survey can serve as a starting point for novice researchers working on COVID-19 classification.
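The performance metrics named in this abstract (accuracy, sensitivity, specificity) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions; the class name `ConfusionCounts` and the example counts are hypothetical and are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # COVID-19 positive, predicted positive
    fp: int  # COVID-19 negative, predicted positive
    tn: int  # COVID-19 negative, predicted negative
    fn: int  # COVID-19 positive, predicted negative


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly positive cases that were detected (recall)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of truly negative cases that were correctly ruled out."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts, for illustration only.
counts = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
print(accuracy(counts), sensitivity(counts), specificity(counts))
```

Reporting sensitivity and specificity alongside accuracy matters for screening tasks because accuracy alone can look high on imbalanced data even when most positive cases are missed.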
Affiliation(s)
| | - Murat Koyuncu
- Department of Information Systems Engineering, Atilim University, 06830 Ankara, Turkey;
| | - Roohallah Alizadehsani
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, Geelong, VIC 3216, Australia
| |
|
13
|
Rahman T, Chowdhury MEH, Khandakar A, Mahbub ZB, Hossain MSA, Alhatou A, Abdalla E, Muthiyal S, Islam KF, Kashem SBA, Khan MS, Zughaier SM, Hossain M. BIO-CXRNET: a robust multimodal stacking machine learning technique for mortality risk prediction of COVID-19 patients using chest X-ray images and clinical data. Neural Comput Appl 2023; 35:1-23. [PMID: 37362565 PMCID: PMC10157130 DOI: 10.1007/s00521-023-08606-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/13/2022] [Accepted: 04/11/2023] [Indexed: 06/28/2023]
Abstract
Quick and accurate diagnosis of COVID-19 is a pressing need. This study presents a multimodal system to meet this need. The system employs a machine learning module trained on data collected from 930 COVID-19 patients hospitalized in Italy during the first wave of COVID-19 (March-June 2020). The dataset consists of twenty-five biomarkers from electronic health records and chest X-ray (CXR) images. The system can classify patients as low- or high-risk with an accuracy, sensitivity, and F1-score of 89.03%, 90.44%, and 89.03%, respectively. It exhibits 6% higher accuracy than systems that employ either CXR images or biomarker data alone. In addition, the system can calculate the mortality risk of high-risk patients using a multivariate logistic regression-based nomogram scoring technique. Interested physicians can use the presented system to predict the early mortality risk of COVID-19 patients via the web-link: Covid-severity-grading-AI. To do so, a physician needs to input the following information: CXR image file, Lactate Dehydrogenase (LDH), Oxygen Saturation (O2%), White Blood Cell Count, C-reactive protein, and Age. In this way, the study contributes to the management of COVID-19 patients by predicting early mortality risk. Supplementary Information The online version contains supplementary material available at 10.1007/s00521-023-08606-w.
Affiliation(s)
- Tawsifur Rahman
- Department of Electrical Engineering, Qatar University, P.O. Box 2713, Doha, Qatar
| | | | - Amith Khandakar
- Department of Electrical Engineering, Qatar University, P.O. Box 2713, Doha, Qatar
| | - Zaid Bin Mahbub
- Department of Physics and Mathematics, North South University, Dhaka, 1229 Bangladesh
| | | | - Abraham Alhatou
- Department of Biology, University of South Carolina (USC), Columbia, SC 29208 USA
| | - Eynas Abdalla
- Anesthesia Department, Hamad General Hospital, P.O. Box 3050, Doha, Qatar
| | - Sreekumar Muthiyal
- Department of Radiology, Hamad General Hospital, P.O. Box 3050, Doha, Qatar
| | | | - Saad Bin Abul Kashem
- Department of Computer Science, AFG College with the University of Aberdeen, Doha, Qatar
| | - Muhammad Salman Khan
- Department of Electrical Engineering, Qatar University, P.O. Box 2713, Doha, Qatar
| | - Susu M. Zughaier
- Department of Basic Medical Sciences, College of Medicine, QU Health, Qatar University, P.O. Box 2713, Doha, Qatar
| | - Maqsud Hossain
- NSU Genome Research Institute (NGRI), North South University, Dhaka, 1229 Bangladesh
| |
|
14
|
Harrou F, Dairi A, Dorbane A, Kadri F, Sun Y. Semi-Supervised KPCA-Based Monitoring Techniques for Detecting COVID-19 Infection through Blood Tests. Diagnostics (Basel) 2023; 13:1466. [PMID: 37189568 PMCID: PMC10138088 DOI: 10.3390/diagnostics13081466] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/22/2023] [Revised: 04/07/2023] [Accepted: 04/14/2023] [Indexed: 05/17/2023] Open
Abstract
This study introduces a new method for identifying COVID-19 infections from blood test data, framed as an anomaly detection problem, by combining kernel principal component analysis (KPCA) with a one-class support vector machine (OCSVM). The approach aims to differentiate healthy individuals from those infected with COVID-19 using blood test samples. The KPCA model is used to identify nonlinear patterns in the data, and the OCSVM is used to detect abnormal features. The approach is semi-supervised in that it uses unlabeled data during training and requires data only from healthy cases. The method's performance was tested using two sets of blood test samples from hospitals in Brazil and Italy. Compared to other semi-supervised models, such as KPCA-based isolation forest (iForest), local outlier factor (LOF), elliptical envelope (EE) schemes, independent component analysis (ICA), and PCA-based OCSVM, the proposed KPCA-OCSVM approach achieved enhanced discrimination performance for detecting potential COVID-19 infections. For the two COVID-19 blood test datasets considered, the proposed approach attained an AUC (area under the receiver operating characteristic curve) of 0.99, indicating a high level of accuracy in distinguishing between positive and negative samples based on the test results. The study suggests that this approach is a promising solution for detecting COVID-19 infections without labeled data.
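The AUC quoted in this abstract can be read as the probability that a randomly chosen infected sample receives a higher anomaly score than a randomly chosen healthy one. The sketch below computes it directly from that pairwise definition. It is not the authors' KPCA-OCSVM pipeline: the function name `roc_auc` and the scores are hypothetical, and the scores are assumed to come from any detector in which a higher value means "more anomalous".

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; ties contribute one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical anomaly scores, for illustration only.
infected_scores = [0.9, 0.8, 0.4]
healthy_scores = [0.1, 0.3, 0.5, 0.2]
print(roc_auc(infected_scores, healthy_scores))
```

The quadratic loop keeps the definition visible; for large datasets a rank-based computation gives the same value more efficiently.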
Affiliation(s)
- Fouzi Harrou
- Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| | - Abdelkader Dairi
- Computer Science Department, University of Science and Technology of Oran-Mohamed Boudiaf (USTO-MB), El Mnaouar, BP 1505, Oran 31000, Algeria;
| | - Abdelhakim Dorbane
- Smart Structures Laboratory (SSL), Department of Mechanical Engineering, Belhadj Bouchaib University of Ain Temouchent, Ain Temouchent 46000, Algeria
| | - Farid Kadri
- Aeroline DATA & CET, Agence 1031, Sopra Steria Group, 31770 Colomiers, France
| | - Ying Sun
- Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| |
|
15
|
Machine learning to analyse omic-data for COVID-19 diagnosis and prognosis. BMC Bioinformatics 2023; 24:7. [PMID: 36609221 PMCID: PMC9817417 DOI: 10.1186/s12859-022-05127-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/25/2022] [Accepted: 12/23/2022] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND With the global spread of COVID-19, the world has seen many patients, including many severe cases. The rapid development of machine learning (ML) has led to significant achievements in disease diagnosis and prediction. Current studies have confirmed that omics data at the host level can reflect the development process and prognosis of the disease. Since early diagnosis and effective treatment of severe COVID-19 patients remain challenging, this research aims to use omics data in different ML models for COVID-19 diagnosis and prognosis. We applied several ML models to omics data from a large number of individuals, first to predict whether patients are COVID-19 positive or negative and then to predict the severity of the disease. RESULTS On the COVID-19 diagnosis task, we obtained the best AUC of 0.99 with our multilayer perceptron model and the highest F1-score of 0.95 with our logistic regression (LR) model. For the severity prediction task, we achieved the highest accuracy of 0.76 with an LR model. Beyond classification and predictive modeling, our study found that ML models performed better on integrated multi-omics data than on single-omics data. By comparing top features from different omics datasets, we also found our model to be robust, with a wider range of applicability across diverse COVID-19-related datasets. Additionally, we found that omics-based models performed better than image- or physiological feature-based models, demonstrating the importance of omics-based datasets for future model development. CONCLUSIONS This study diagnoses COVID-19 positive cases and predicts severity levels accurately. It lowers the dependence on clinical data and professional judgment by leveraging state-of-the-art models. Our model showed wide applicability across different omics datasets and is highly transferable to other respiratory or similar diseases. Hospitals and public health care mechanisms can use it to optimize the distribution of medical resources and improve the robustness of the medical system.
|
16
|
Kistenev YV, Vrazhnov DA, Shnaider EE, Zuhayri H. Predictive models for COVID-19 detection using routine blood tests and machine learning. Heliyon 2022; 8:e11185. [PMID: 36311357 PMCID: PMC9595489 DOI: 10.1016/j.heliyon.2022.e11185] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/01/2022] [Revised: 03/25/2022] [Accepted: 10/16/2022] [Indexed: 11/06/2022] Open
Abstract
The need for accurate, fast, and inexpensive COVID-19 tests remains urgent. Standard COVID-19 tests require costly reagents and specialized laboratories with high safety requirements, and they are time-consuming. Using routine blood test data as a basis for detecting SARS-CoV-2 infection makes it possible to rely on ordinary medical facilities. However, blood tests give general information about a patient's state that is not directly associated with COVID-19. COVID-19-specific features should therefore be selected from the list of standard blood characteristics, and decision-making software based on appropriate clinical data should be created. This review describes the possibilities for developing predictive models for COVID-19 detection using routine blood tests and machine learning.
Affiliation(s)
- Yury V. Kistenev
- Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
| | - Denis A. Vrazhnov
- Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
| | - Ekaterina E. Shnaider
- Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
| | - Hala Zuhayri
- Laboratory of Laser Molecular Imaging and Machine Learning, Tomsk State University, 36 Lenin Av., 634050 Tomsk, Russia
| |
|
17
|
Araújo DC, Veloso AA, Borges KBG, Carvalho MDG. Prognosing the risk of COVID-19 death through a machine learning-based routine blood panel: A retrospective study in Brazil. Int J Med Inform 2022; 165:104835. [PMID: 35908372 PMCID: PMC9327247 DOI: 10.1016/j.ijmedinf.2022.104835] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/08/2022] [Revised: 07/17/2022] [Accepted: 07/19/2022] [Indexed: 01/08/2023]
Abstract
BACKGROUND Despite an extensive network of primary care availability, Brazil has suffered profoundly during the COVID-19 pandemic, experiencing the greatest sanitary collapse in its history. Thus, it is important to understand phenotype risk factors for SARS-CoV-2 infection severity in the Brazilian population in order to provide novel insights into the pathogenesis of the disease. OBJECTIVE This study proposes to predict the risk of COVID-19 death through machine learning, using blood biomarkers data from patients admitted to two large hospitals in Brazil. METHODS We retrospectively collected blood biomarkers data in a 24-h time window from 6,979 patients with COVID-19 confirmed by positive RT-PCR admitted to two large hospitals in Brazil, of whom 291 (4.2%) died and 6,688 (95.8%) were discharged. We then developed a large-scale exploration of risk models to predict the probability of COVID-19 severity, finally choosing the best performing model regarding the average AUROC. To improve generalizability, for each model five different testing scenarios were conducted, including two external validations. RESULTS We developed a machine learning-based panel composed of parameters extracted from the complete blood count (lymphocytes, MCV, platelets and RDW), in addition to C-Reactive Protein, which yielded an average AUROC of 0.91 ± 0.01 to predict death by COVID-19 confirmed by positive RT-PCR within a 24-h window. CONCLUSION Our study suggests that routine laboratory variables could be useful to identify COVID-19 patients under higher risk of death using machine learning. Further studies are needed for validating the model in other populations and contexts, since the natural history of SARS-CoV-2 infection and its consequences on the hematopoietic system and other organs is still quite recent.
Affiliation(s)
- Daniella Castro Araújo
- Huna, São Paulo, SP, Brazil; Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil.
| | - Adriano Alonso Veloso
- Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | | | | |
|
18
|
Matysek A, Studnicka A, Smith WM, Hutny M, Gajewski P, Filipiak KJ, Goh J, Yang G. Influence of Co-morbidities During SARS-CoV-2 Infection in an Indian Population. Front Med (Lausanne) 2022; 9:962101. [PMID: 35979209 PMCID: PMC9377050 DOI: 10.3389/fmed.2022.962101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/05/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background Since the outbreak of the COVID-19 pandemic, interindividual variability in the course of the disease has been reported, indicating that a wide range of factors influence it. The factors most often associated with increased COVID-19 severity include higher age, obesity and diabetes. The influence of cytokine storm is complex, reflecting the complexity of the immunological processes triggered by SARS-CoV-2 infection. A modern challenge such as a worldwide pandemic requires modern solutions, in this case harnessing machine learning to analyse differences in the clinical properties of the populations affected by the disease and to grade their significance, leading to the creation of a tool for assessing the individual risk of SARS-CoV-2 infection. Methods Biochemical and morphological parameter values of 5,000 patients (Curisin Healthcare, India) were gathered and used to calculate eGFR, the SII index and the N/L ratio. Spearman's rank correlation coefficient was used to assess correlations between each feature in the population and the presence of SARS-CoV-2 infection. Feature importance was evaluated by fitting a Random Forest machine learning model to the data and examining the features' predictive value. Model accuracy was measured as the F1 score. Results The parameters with the highest correlation coefficients were age, random serum glucose, serum urea, gender and serum cholesterol, whereas the strongest inverse correlations were found for alanine transaminase, red blood cell count and serum creatinine. The accuracy of the resulting model in differentiating positive from negative SARS-CoV-2 cases was 97%. The features of highest importance were age, alanine transaminase, random serum glucose and red blood cell count. Conclusion The current analysis identifies a number of parameters available for routine screening in a clinical setting. It also presents a tool built on these parameters that is useful for assessing the individual risk of developing COVID-19. A limitation of the study is the demographic specificity of the studied population, which might restrict its general applicability.
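The derived indices and the correlation measure mentioned in the Methods can be illustrated in a few lines. This is a minimal sketch under stated assumptions: the abstract does not spell out its formulas, so the commonly used definitions are assumed here (N/L ratio = neutrophils / lymphocytes; SII = platelets × neutrophils / lymphocytes), and Spearman's coefficient is computed as the Pearson correlation of tie-averaged ranks. All function names and sample values are hypothetical; eGFR is omitted because the abstract does not say which equation was used.

```python
from typing import List, Sequence


def nl_ratio(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio (both counts in the same units)."""
    return neutrophils / lymphocytes


def sii(platelets: float, neutrophils: float, lymphocytes: float) -> float:
    """Systemic immune-inflammation index, assumed definition (see lead-in)."""
    return platelets * neutrophils / lymphocytes


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical values, for illustration only.
ages = [34, 51, 68, 72, 45]
infected = [0, 0, 1, 1, 0]  # 1 = SARS-CoV-2 positive
print(nl_ratio(6.0, 2.0), sii(250.0, 6.0, 2.0), spearman(ages, infected))
```

Tie-averaged ranks matter here because a binary infection flag produces many tied values, which the shortcut formula based on squared rank differences does not handle correctly.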
Affiliation(s)
- Adrian Matysek
- Immunidex Ltd., London, United Kingdom
- Cognescence Ltd., London, United Kingdom
| | - Aneta Studnicka
- Clinical Analysis Laboratory, Silesian Centre for Heart Diseases, Zabrze, Poland
| | - Wade Menpes Smith
- Immunidex Ltd., London, United Kingdom
- Cognescence Ltd., London, United Kingdom
| | - Michał Hutny
- Faculty of Medical Sciences in Katowice, Students’ Scientific Society, Medical University of Silesia, Katowice, Poland
| | - Paweł Gajewski
- AGH University of Science and Technology, Krakow, Poland
| | | | - Jorming Goh
- Healthy Longevity Translational Research Program, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- National University Health System (NUHS), Centre for Healthy Longevity, Singapore, Singapore
| | - Guang Yang
- Cardiovascular Research Centre, Royal Brompton Hospital, London, United Kingdom
- National Heart and Lung Institute, Imperial College London, London, United Kingdom
| |
|
19
|
Effah CY, Miao R, Drokow EK, Agboyibor C, Qiao R, Wu Y, Miao L, Wang Y. Machine learning-assisted prediction of pneumonia based on non-invasive measures. Front Public Health 2022; 10:938801. [PMID: 35968461 PMCID: PMC9371749 DOI: 10.3389/fpubh.2022.938801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background Pneumonia is an infection of the lungs that is characterized by high morbidity and mortality. The use of machine learning systems to detect respiratory diseases via non-invasive measures such as physical and laboratory parameters is gaining momentum and has been proposed to decrease the diagnostic uncertainty associated with bacterial pneumonia. This study conducted several experiments using eight machine learning models to predict pneumonia based on biomarkers, laboratory parameters, and physical features. Methods We performed machine-learning analysis on 535 different patients, each with 45 features. Data normalization was performed to rescale all real-valued features. Since it is a binary problem, we categorized each patient into one class at a time. We designed three experiments to evaluate the models: (1) feature selection techniques to select appropriate features for the models, (2) experiments on the imbalanced original dataset, and (3) experiments on the SMOTE data. We then compared eight machine learning models to evaluate their effectiveness in predicting pneumonia. Results Biomarkers such as C-reactive protein and procalcitonin demonstrated the most significant discriminating power. Ensemble machine learning models such as RF (accuracy = 92.0%, precision = 91.3%, recall = 96.0%, F1-score = 93.6%) and XGBoost (accuracy = 90.8%, precision = 92.6%, recall = 92.3%, F1-score = 92.4%) achieved the highest performance on the original dataset, with AUCs of 0.96 and 0.97, respectively. On the SMOTE dataset, RF and XGBoost achieved the highest prediction results, with F1-scores of 92.0% and 91.2%, respectively. An AUC of 0.97 was also achieved for both the RF and XGBoost models. Conclusions Our models showed that in the diagnosis of pneumonia, individual clinical history, laboratory indicators, and symptoms do not have adequate discriminatory power. We can also conclude that the ensemble ML models performed better in this study.
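The "SMOTE data" in the third experiment refers to the Synthetic Minority Over-sampling Technique, which balances classes by creating new minority-class points along the line segments between a minority sample and its nearest minority neighbours. The following is a minimal sketch of that core idea only. The function name `smote_sketch`, the parameter defaults and the sample points are hypothetical, and the abstract does not state which implementation or settings the study used.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def smote_sketch(minority: Sequence[Point], n_synthetic: int,
                 k: int = 2, seed: int = 0) -> List[Point]:
    """Generate synthetic minority samples by neighbour interpolation.

    Requires at least two minority points. Each synthetic point lies on the
    segment between a randomly chosen minority point and one of its k nearest
    minority neighbours (squared Euclidean distance).
    """
    rng = random.Random(seed)
    synthetic: List[Point] = []
    for _ in range(n_synthetic):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank the other minority points by distance to the base point.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical normalized feature vectors for the minority class.
minority_points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0)]
new_points = smote_sketch(minority_points, n_synthetic=5)
print(new_points)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points never leak into the data used for evaluation.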
Affiliation(s)
| | - Ruoqi Miao
- College of Public Health, Zhengzhou University, Zhengzhou, China
| | - Emmanuel Kwateng Drokow
- Department of Radiation Oncology, Zhengzhou University People's Hospital, Henan Provincial People's Hospital, Zhengzhou, China
| | - Clement Agboyibor
- School of Pharmaceutical Sciences, Zhengzhou University, Zhengzhou, China
| | - Ruiping Qiao
- Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Yongjun Wu
- College of Public Health, Zhengzhou University, Zhengzhou, China
- *Correspondence: Yongjun Wu
| | - Lijun Miao
- Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Lijun Miao
| | - Yanbin Wang
- Center of Health Management, General Hospital of Anyang Iron and Steel Group Co., Ltd, Anyang, China
- Yanbin Wang
| |
|
20
|
Abdalrada AS, Abawajy J, Al-Quraishi T, Islam SMS. Machine learning models for prediction of co-occurrence of diabetes and cardiovascular diseases: a retrospective cohort study. J Diabetes Metab Disord 2022; 21:251-261. [PMID: 35673486 PMCID: PMC9167176 DOI: 10.1007/s40200-021-00968-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/25/2020] [Accepted: 12/29/2021] [Indexed: 12/15/2022]
Abstract
Background Diabetes mellitus (DM) and cardiovascular diseases (CVD) cause a significant healthcare burden globally and often co-exist. Current approaches often fail to identify many people with co-occurrence of DM and CVD, leading to delays in healthcare seeking and to increased complications and morbidity. In this paper, we aimed to develop and evaluate a two-stage machine learning (ML) model to predict the co-occurrence of DM and CVD. Methods We used the diabetes complications screening research initiative (DiScRi) dataset containing >200 variables from >2000 participants. In the first stage, we used two ML models (logistic regression and Evimp functions) implemented in a multivariate adaptive regression splines model to infer the significant common risk factors for DM and CVD, and applied the correlation matrix to reduce redundancy. In the second stage, we used a classification and regression algorithm to develop our model. We evaluated the prediction models using prediction accuracy, sensitivity and specificity as performance metrics. Results The common risk factors for DM and CVD co-occurrence were family history of the diseases, gender, deep breathing heart rate change, lying to standing blood pressure change, HbA1c, HDL and TC/HDL ratio. The predictive model showed that participants with HbA1c >6.45 and TC/HDL ratio >5.5 were at risk of developing both diseases (97.9% probability). In contrast, participants with HbA1c >6.45 and TC/HDL ratio ≤5.5 were more likely to have only DM (84.5% probability), and those with HbA1c ≤5.45 and HDL >1.45 were likely to be healthy (82.4% probability). Further, participants with HbA1c ≤5.45 and HDL <1.45 were at risk of only CVD (100% probability). The predictive accuracy of the ML model in detecting co-occurrence of DM and CVD was 94.09%, with sensitivity of 93.5% and specificity of 95.8%. Conclusions Our ML model can predict with high accuracy the co-occurrence of DM and CVD in people attending a screening program. This might help in the early detection of patients with DM and CVD who could benefit from preventive treatment, and could reduce the future healthcare burden.
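The four threshold rules reported in the Results can be written out as a small decision function, which makes their coverage easy to inspect. This is a sketch of the rules exactly as the abstract states them, not the authors' fitted model: the function name `screen_dm_cvd` and the example inputs are hypothetical, units are those of the abstract (which does not state them), and any input the stated rules do not cover returns `None` rather than a guess.

```python
from typing import Optional


def screen_dm_cvd(hba1c: float, hdl: float, tc_hdl_ratio: float) -> Optional[str]:
    """Apply the threshold rules quoted in the abstract.

    Returns the predicted group, or None when the abstract gives no rule
    for the input (for example HbA1c above 5.45 but not above 6.45).
    """
    if hba1c > 6.45:
        # High HbA1c: the TC/HDL ratio separates co-occurrence from DM only.
        return "DM and CVD" if tc_hdl_ratio > 5.5 else "DM only"
    if hba1c <= 5.45:
        # Low HbA1c: HDL separates healthy participants from CVD only.
        if hdl > 1.45:
            return "healthy"
        if hdl < 1.45:
            return "CVD only"
    return None  # not covered by the rules as reported


# Hypothetical participants, for illustration only.
print(screen_dm_cvd(hba1c=7.0, hdl=1.2, tc_hdl_ratio=6.0))
print(screen_dm_cvd(hba1c=6.0, hdl=1.2, tc_hdl_ratio=3.0))
```

Writing the rules this way shows that the abstract leaves the HbA1c range between 5.45 and 6.45 unassigned, so a complete screening tool would need the full tree from the paper itself.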
Affiliation(s)
- Ahmad Shaker Abdalrada
- Faculty of Computer Science and Information Technology, Wasit University, Al Kut, Iraq
- School of Information Technology, Deakin University, Melbourne, Victoria Australia
| | - Jemal Abawajy
- School of Information Technology, Deakin University, Melbourne, Victoria Australia
| | - Tahsien Al-Quraishi
- Faculty of Computer Science and Information Technology, Wasit University, Al Kut, Iraq
- School of Information Technology, Deakin University, Melbourne, Victoria Australia
| | - Sheikh Mohammed Shariful Islam
- Institute for Physical Activity and Nutrition, Deakin University, 221 Burwood Highway, Burwood, Melbourne, VIC 3125 Australia
| |
|
21
|
Alabbad DA, Almuhaideb AM, Alsunaidi SJ, Alqudaihi KS, Alamoudi FA, Alhobaishi MK, Alaqeel NA, Alshahrani MS. Machine learning model for predicting the length of stay in the intensive care unit for Covid-19 patients in the eastern province of Saudi Arabia. INFORMATICS IN MEDICINE UNLOCKED 2022; 30:100937. [PMID: 35441086 PMCID: PMC9010025 DOI: 10.1016/j.imu.2022.100937] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/21/2021] [Revised: 03/31/2022] [Accepted: 03/31/2022] [Indexed: 12/29/2022] Open
Abstract
The COVID-19 virus has spread rapidly throughout the world. Managing resources is one of the biggest challenges that healthcare providers around the world have faced during the pandemic. Allocating Intensive Care Unit (ICU) bed capacity is important because COVID-19 is a respiratory disease and some patients need to be admitted to the hospital with an urgent need for oxygen support, ventilation, and/or intensive medical care. In the battle against COVID-19, many governments utilized technology, especially Artificial Intelligence (AI), to contain the pandemic and limit its hazardous effects. In this paper, Machine Learning (ML) models were developed to help detect COVID-19 patients' need for the ICU and estimate the duration of their stay. Four ML approaches were utilized: Random Forest (RF), Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), and Ensemble models. They were trained and validated on a dataset of 895 COVID-19 patients admitted to King Fahad University hospital in the eastern province of Saudi Arabia. The experiments show that the Length of Stay (LoS) in the ICU is predicted most accurately by the RF model, which achieved an accuracy of 94.16%. Regarding the factors contributing to the length of stay in the ICU, correlation results showed that age, C-Reactive Protein (CRP), and nasal oxygen support days are the most strongly related. A search of the literature found no published work that used a Saudi Arabian dataset to predict the need for the ICU together with the number of days needed. This contribution is hoped to pave the way for hospitals and healthcare providers to manage their resources more efficiently and to help save lives.
Affiliation(s)
- Dina A Alabbad
- Department of Computer Engineering, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam, 31441, Saudi Arabia
| | - Abdullah M Almuhaideb
- Department of Networks and Communications, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam, 31441, Saudi Arabia
| | - Shikah J Alsunaidi
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam, 31441, Saudi Arabia
| | - Kawther S Alqudaihi
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam, 31441, Saudi Arabia
| | - Fatimah A Alamoudi
- Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam, 31441, Saudi Arabia
| | - Maha K Alhobaishi
- Department of Computer Engineering, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam, 31441, Saudi Arabia
| | - Naimah A Alaqeel
- Department of Computer Engineering, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam, 31441, Saudi Arabia
| | - Mohammed S Alshahrani
- Department of Emergency Medicine, College of Medicine, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam, 31441, Saudi Arabia
| |
|
22
|
Abayomi-Alli OO, Damaševičius R, Maskeliūnas R, Misra S. An Ensemble Learning Model for COVID-19 Detection from Blood Test Samples. SENSORS 2022; 22:s22062224. [PMID: 35336395 PMCID: PMC8955536 DOI: 10.3390/s22062224] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 12/31/2021] [Revised: 02/28/2022] [Accepted: 03/10/2022] [Indexed: 02/04/2023]
Abstract
Current research endeavors applying artificial intelligence (AI) methods to the diagnosis of COVID-19 have proven indispensable, with very promising results. Despite these promising results, there are still limitations in real-time detection of COVID-19 using reverse transcription polymerase chain reaction (RT-PCR) test data, such as limited datasets, imbalanced classes, a high misclassification rate of models, and the need for specialized research to identify the best features and thus improve prediction rates. This study aims to investigate and apply the ensemble learning approach to develop prediction models for effective detection of COVID-19 using routine laboratory blood test results. Hence, an ensemble machine learning-based COVID-19 detection system is presented, aiming to help clinicians diagnose this virus effectively. The experiment was conducted using custom convolutional neural network (CNN) models as a first-stage classifier and 15 supervised machine learning algorithms as a second-stage classifier: K-Nearest Neighbors, Support Vector Machine (Linear and RBF), Naive Bayes, Decision Tree, Random Forest, MultiLayer Perceptron, AdaBoost, ExtraTrees, Logistic Regression, Linear and Quadratic Discriminant Analysis (LDA/QDA), Passive, Ridge, and Stochastic Gradient Descent Classifier. Our findings show that, on the San Raffaele Hospital dataset, an ensemble learning model based on DNN and ExtraTrees achieved a mean accuracy of 99.28% and an area under curve (AUC) of 99.4%, while AdaBoost gave a mean accuracy of 99.28% and an AUC of 98.8%. A comparison of the proposed COVID-19 detection approach with other state-of-the-art approaches on the same dataset shows that the proposed method outperforms several other COVID-19 diagnostic methods.
Affiliation(s)
- Olusola O. Abayomi-Alli
- Department of Software Engineering, Kaunas University of Technology, 51368 Kaunas, Lithuania;
| | - Robertas Damaševičius
- Department of Software Engineering, Kaunas University of Technology, 51368 Kaunas, Lithuania;
| | - Rytis Maskeliūnas
- Department of Multimedia Engineering, Kaunas University of Technology, 51368 Kaunas, Lithuania;
| | - Sanjay Misra
- Department of Computer Science and Communication, Ostfold University College, 3001 Halden, Norway;
| |
|
23
|
Hatmal MM, Al-Hatamleh MAI, Olaimat AN, Mohamud R, Fawaz M, Kateeb ET, Alkhairy OK, Tayyem R, Lounis M, Al-Raeei M, Dana RK, Al-Ameer HJ, Taha MO, Bindayna KM. Reported Adverse Effects and Attitudes among Arab Populations Following COVID-19 Vaccination: A Large-Scale Multinational Study Implementing Machine Learning Tools in Predicting Post-Vaccination Adverse Effects Based on Predisposing Factors. Vaccines (Basel) 2022; 10:366. [PMID: 35334998 PMCID: PMC8955470 DOI: 10.3390/vaccines10030366] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 01/20/2022] [Revised: 02/23/2022] [Accepted: 02/24/2022] [Indexed: 02/04/2023] Open
Abstract
Background: The unprecedented global spread of coronavirus disease 2019 (COVID-19) has imposed huge challenges on healthcare facilities and impacted every aspect of life. This has led to the development of several vaccines against COVID-19 within one year. This study aimed to assess attitudes and side effects among Arab communities after receiving a COVID-19 vaccine and to use machine learning (ML) tools to predict post-vaccination side effects based on predisposing factors. Methods: An online-based multinational survey was carried out via social media platforms from 14 June to 31 August 2021, targeting individuals from 22 Arab countries who had received at least one dose of a COVID-19 vaccine. Descriptive statistics, correlation, and chi-square tests were used to analyze the data. Moreover, extensive ML tools were utilized to predict 30 post-vaccination adverse effects and their severity based on 15 predisposing factors. The importance of distinct predisposing factors in predicting particular side effects was determined using global feature importance employing gradient boost as AutoML. Results: A total of 10,064 participants from 19 Arab countries were included in this study. Around 56% were female and 59% were aged from 20 to 39 years old. A high rate of vaccine hesitancy (51%) was reported among participants. Almost 88% of the participants were vaccinated with one of three COVID-19 vaccines, including Pfizer-BioNTech (52.8%), AstraZeneca (20.7%), and Sinopharm (14.2%). About 72% of participants experienced post-vaccination side effects. This study reports statistically significant associations (p < 0.01) between various predisposing factors and post-vaccination side effects. In terms of predicting post-vaccination side effects, gradient boost, random forest, and XGBoost outperformed other ML methods. 
The most important predisposing factors for predicting certain side effects (i.e., tiredness, fever, headache, injection site pain and swelling, myalgia, and sleepiness and laziness) were revealed to be the number of doses, gender, type of vaccine, age, and hesitancy to receive a COVID-19 vaccine. Conclusions: The reported side effects following COVID-19 vaccination among Arab populations are usually non-life-threatening, such as flu-like symptoms and injection site pain. Certain predisposing factors have greater weight and importance as input data in predicting post-vaccination side effects. Based on the most significant input data, ML can also be used to predict these side effects; people with certain predicted side effects may require additional medical attention, or possibly hospitalization.
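The chi-square tests named in the Methods above can be illustrated with a minimal sketch for a single 2x2 contingency table. This is not the authors' analysis code: the function name `chi_square_2x2` and the example counts are hypothetical, and the sketch assumes a plain Pearson statistic with one degree of freedom and no continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom,
    without a continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    # Shortcut formula for 2x2 tables, equal to sum((O - E)^2 / E).
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts (NOT from the study):
    # rows = one dose / two doses, columns = fever reported / not reported.
    stat, p = chi_square_2x2(30, 70, 55, 45)
    print(f"chi2 = {stat:.3f}, p = {p:.5f}")
```

A survey with many predisposing factors and side effects would repeat such a test per factor and outcome pair, which is where a multiple-comparison correction would normally be considered.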
Affiliation(s)
- Ma’mon M. Hatmal
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, P.O. Box 330127, Zarqa 13133, Jordan
| | - Mohammad A. I. Al-Hatamleh
- Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia; (M.A.I.A.-H.); (R.M.)
| | - Amin N. Olaimat
- Department of Clinical Nutrition and Dietetics, Faculty of Applied Medical Sciences, The Hashemite University, P.O. Box 330127, Zarqa 13133, Jordan;
| | - Rohimah Mohamud
- Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia; (M.A.I.A.-H.); (R.M.)
| | - Mirna Fawaz
- Nursing Department, Faculty of Health Sciences, Beirut Arab University, Beirut 1105, Lebanon;
| | - Elham T. Kateeb
- Oral Health Research and Promotion Unit, Faculty of Dentistry, Al-Quds University, Jerusalem 51000, Palestine;
| | - Omar K. Alkhairy
- Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Ministry of National Guard Health Affairs, P.O. Box 22490, Riyadh 11426, Saudi Arabia;
- King Saud bin Abdulaziz University for Health Sciences, P.O. Box 3660, Riyadh 11481, Saudi Arabia
- King Abdullah International Medical Research Center (KAIMRC), P.O. Box 3660, Riyadh 11481, Saudi Arabia
| | - Reema Tayyem
- Department of Human Nutrition, College of Health Sciences, QU Health, Qatar University, Doha P.O. Box 2713, Qatar;
| | - Mohamed Lounis
- Department of Agro-Veterinary Science, Faculty of Natural and Life Sciences, University of Ziane Achour, BP 3117, Djelfa 17000, Algeria;
| | - Marwan Al-Raeei
- Faculty of Sciences, Damascus University, Damascus P.O. Box 30621, Syria;
| | - Rasheed K. Dana
- Faculty of Medicine, Mansoura University, Mansoura, Dakahlia 35516, Egypt;
| | - Hamzeh J. Al-Ameer
- Department of Biology and Biotechnology, Faculty of Science, American University of Madaba, P.O. Box 99, Madaba 17110, Jordan;
| | - Mutasem O. Taha
- Department of Pharmaceutical Sciences, Faculty of Pharmacy, The University of Jordan, Amman 11942, Jordan;
| | - Khalid M. Bindayna
- Department of Microbiology, Immunology and Infectious Diseases, College of Medicine and Medical Sciences, Arabian Gulf University, Manama 329, Bahrain
| |
|
24
|
Yu Z, He L, Luo W, Tse R, Pau G. Deep Learning Hybrid Models for COVID-19 Prediction. JOURNAL OF GLOBAL INFORMATION MANAGEMENT 2022. [DOI: 10.4018/jgim.302890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
COVID-19 is a highly contagious disease. Blood testing is one effective method for COVID-19 diagnosis. However, blood testing is time-consuming and constrained by a shortage of medical staff. In this paper, four deep learning hybrid models are proposed to address these issues, i.e., CNN+GRU, CNN+Bi-RNN, CNN+Bi-LSTM, and CNN+Bi-GRU. In addition, the best models from Turabieh et al. (CNN) and Alakus et al. (CNN+LSTM) are implemented. Blood test data from Hospital Israelita Albert Einstein are used to train and test all six models. The best proposed model, CNN+Bi-GRU, achieves an accuracy of 0.9415, precision of 0.9417, recall of 0.9417, F1-score of 0.9417, and AUC of 0.91, outperforming the best models from Turabieh et al. and Alakus et al. Furthermore, the proposed model can help patients get blood test results faster than traditional manual tests, and it is not subject to errors caused by fatigue. We envisage wide deployment of the proposed model in hospitals to alleviate the testing pressure on medical workers, especially in developing and underdeveloped countries.
Affiliation(s)
- Ziyue Yu
- Faculty of Applied Sciences, Macao Polytechnic University, China
| | - Lihua He
- Faculty of Applied Sciences, Macao Polytechnic University, China
| | - Wuman Luo
- Faculty of Applied Sciences, Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence
| | - Rita Tse
- Faculty of Applied Sciences, Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence
| | - Giovanni Pau
- Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
| |
|
25
|
Guest PC, Abbasifard M, Jamialahmadi T, Majeed M, Kesharwani P, Sahebkar A. Multiplex Immunoassay for Prediction of Disease Severity Associated with the Cytokine Storm in COVID-19 Cases. Methods Mol Biol 2022; 2511:245-256. [PMID: 35838965 DOI: 10.1007/978-1-0716-2395-4_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Severe cases of SARS-CoV-2 and other pathogenic virus infections are often associated with the uncontrolled release of proinflammatory cytokines, known as a "cytokine storm." We present a protocol for multiplex analysis of three cytokines, tumor necrosis factor-alpha (TNF-α), interleukin 6 (IL-6), and IL-10, which are typically elevated in cytokine storm events and may be used as a predictive biomarker profile of disease severity or disease course.
Affiliation(s)
- Paul C Guest
- Laboratory of Neuroproteomics, Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
| | - Mitra Abbasifard
- Immunology of Infectious Diseases Research Center, Research Institute of Basic Medical Sciences, Rafsanjan University of Medical Sciences, Rafsanjan, Iran.
- Department of Internal Medicine, Ali-Ibn Abi-Talib Hospital, School of Medicine, Rafsanjan University of Medical Sciences, Rafsanjan, Iran.
| | - Tannaz Jamialahmadi
- Surgical Oncology Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | | | - Prashant Kesharwani
- Department of Pharmaceutics, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi, India
| | - Amirhossein Sahebkar
- Applied Biomedical Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
- Biotechnology Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
- School of Medicine, The University of Western Australia, Perth, Australia.
- School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran.
| |
|
26
|
Qi X, Shen L, Chen J, Shi M, Shen B. Predicting the Disease Severity of Virus Infection. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1368:111-139. [DOI: 10.1007/978-981-16-8969-7_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023]
|
27
|
Singh V, Kamaleswaran R, Chalfin D, Buño-Soto A, San Roman J, Rojas-Kenney E, Molinaro R, von Sengbusch S, Hodjat P, Comaniciu D, Kamen A. A deep learning approach for predicting severity of COVID-19 patients using a parsimonious set of laboratory markers. iScience 2021; 24:103523. [PMID: 34870131 PMCID: PMC8626152 DOI: 10.1016/j.isci.2021.103523] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/03/2021] [Revised: 11/17/2021] [Accepted: 11/23/2021] [Indexed: 12/02/2022] Open
Abstract
The SARS-CoV-2 virus has caused a tremendous healthcare burden worldwide. Our focus was to develop a practical and easy-to-deploy system to predict the severe manifestation of disease in patients with COVID-19, with the aim of assisting clinicians in triage and treatment decisions. Our proposed predictive algorithm is a trained artificial intelligence-based network using 8,427 COVID-19 patient records from four healthcare systems. The model provides a severity risk score along with likelihoods of various clinical outcomes, namely ventilator use and mortality. Using patient age and nine laboratory markers, the trained model predicts the need for a ventilator with an area under the curve (AUC) of 0.78 (95% CI: 0.77–0.82) and a negative predictive value (NPV) of 0.86 (95% CI: 0.84–0.88), and predicts in-hospital 30-day mortality with an AUC of 0.85 (95% CI: 0.84–0.86) and an NPV of 0.94 (95% CI: 0.92–0.96). An algorithm using 9 laboratory markers and age may predict severity in patients with COVID-19. The model was trained and tested on a multicenter sample of 10,937 patients. The algorithm can predict ventilator use (NPV, 0.86) and mortality (NPV, 0.94). The high NPV suggests utility as an adjunct to aid in triaging patients with COVID-19.
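The two metrics reported above, NPV and AUC, can be computed from labels and risk scores as in the following minimal sketch. It does not reproduce the authors' model or data: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def negative_predictive_value(y_true, y_pred):
    """NPV = TN / (TN + FN): share of predicted negatives that are truly negative."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tn + fn == 0:
        raise ValueError("no negative predictions, NPV is undefined")
    return tn / (tn + fn)


def auroc(y_true, scores):
    """AUROC as the probability that a random positive case outscores
    a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (NOT from the study): 1 = severe outcome, 0 = no severe outcome.
    y_true = [0, 0, 0, 0, 1, 1, 0, 1]
    scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.2, 0.9]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold
    print("NPV  :", negative_predictive_value(y_true, y_pred))
    print("AUROC:", auroc(y_true, scores))
```

NPV depends on the chosen threshold and on outcome prevalence, whereas AUROC is threshold-free, which is one reason a triage-oriented study reports both.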
Affiliation(s)
- Vivek Singh
- Siemens Healthineers, Digital Technology and Innovation, 755 College Road East, Princeton, NJ 08540, USA
| | - Rishikesan Kamaleswaran
- Emory University School of Medicine WMB, 1010 Woodruff Circle, Suite 4127, Atlanta, GA 30322, USA
| | - Donald Chalfin
- Siemens Healthineers, Laboratory Diagnostics, 511 Benedict Avenue, Tarrytown, NY 10591, USA; Jefferson College of Population Health of Thomas Jefferson University, 901 Walnut Street, Philadelphia, PA 19107, USA
| | - Antonio Buño-Soto
- Department of Laboratory Medicine, Hospital Universitario La Paz, Madrid, Spain
| | - Janika San Roman
- Siemens Healthineers, Laboratory Diagnostics, 511 Benedict Avenue, Tarrytown, NY 10591, USA
| | - Edith Rojas-Kenney
- Siemens Healthineers, Laboratory Diagnostics, 511 Benedict Avenue, Tarrytown, NY 10591, USA
| | - Ross Molinaro
- Siemens Healthineers, Laboratory Diagnostics, 511 Benedict Avenue, Tarrytown, NY 10591, USA
| | - Sabine von Sengbusch
- Siemens Healthineers, Laboratory Diagnostics, 511 Benedict Avenue, Tarrytown, NY 10591, USA
| | - Parsa Hodjat
- Department of Pathology and Genomic Medicine, Houston Methodist Hospital, 6565 Fannin Street, Houston, TX 77030, USA
| | - Dorin Comaniciu
- Siemens Healthineers, Digital Technology and Innovation, 755 College Road East, Princeton, NJ 08540, USA
| | - Ali Kamen
- Siemens Healthineers, Digital Technology and Innovation, 755 College Road East, Princeton, NJ 08540, USA
| |
|
28
|
Jung C, Mamandipoor B, Fjølner J, Bruno R, Wernly B, Artigas A, Bollen Pinto B, Schefold JC, Wolff G, Kelm M, Beil M, Sviri S, van Heerden PV, Szczeklik W, Czuczwar M, Elhadi M, Joannidis M, Oeyen S, Zafeiridis T, Marsh B, Andersen FH, Moreno R, Cecconi M, Leaver S, De Lange DW, Guidet B, Flaatten H, Osmani V. Disease-course adapting machine learning prognostication models in critically ill elderly COVID-19 patients: a multi-centre cohort study with external validation. JMIR Med Inform 2021; 10:e32949. [PMID: 35099394 PMCID: PMC9015783 DOI: 10.2196/32949] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/16/2021] [Revised: 10/22/2021] [Accepted: 12/04/2021] [Indexed: 12/12/2022] Open
Abstract
Background: The COVID-19 pandemic caused by SARS-CoV-2 is challenging health care systems globally. The disease disproportionately affects the elderly population, both in terms of disease severity and mortality risk. Objective: The aim of this study was to evaluate machine learning–based prognostication models for critically ill elderly COVID-19 patients, which dynamically incorporated multifaceted clinical information on evolution of the disease. Methods: This multicenter cohort study (COVIP study) obtained patient data from 151 intensive care units (ICUs) from 26 countries. Different models based on the Sequential Organ Failure Assessment (SOFA) score, logistic regression (LR), random forest (RF), and extreme gradient boosting (XGB) were derived as baseline models that included admission variables only. We subsequently included clinical events and time-to-event as additional variables to derive the final models using the same algorithms and compared their performance with that of the baseline group. Furthermore, we derived baseline and final models on a European patient cohort, which were externally validated on a non-European cohort that included Asian, African, and US patients. Results: In total, 1432 elderly (≥70 years old) COVID-19–positive patients admitted to an ICU were included for analysis. Of these, 809 (56.49%) patients survived up to 30 days after admission. The average length of stay was 21.6 (SD 18.2) days. Final models that incorporated clinical events and time-to-event information provided superior performance (area under the receiver operating characteristic curve of 0.81; 95% CI 0.804-0.811), with respect to both the baseline models that used admission variables only and conventional ICU prediction models (SOFA score, P<.001). The average precision increased from 0.65 (95% CI 0.650-0.655) to 0.77 (95% CI 0.759-0.770). 
Conclusions: Integrating important clinical events and time-to-event information led to a superior accuracy of 30-day mortality prediction compared with models based on the admission information and conventional ICU prediction models. This study shows that machine-learning models provide additional information and may support complex decision-making in critically ill elderly COVID-19 patients. Trial Registration: ClinicalTrials.gov NCT04321265; https://clinicaltrials.gov/ct2/show/NCT04321265
Affiliation(s)
- Christian Jung
- University Hospital Duesseldorf, Moorenstraße 5, Duesseldorf, DE
| | | | - Jesper Fjølner
- Department of Intensive Care, Aarhus University Hospital, Aarhus, Denmark, Aarhus, DK
| | | | - Bernhard Wernly
- Department of Anaesthesiology, Paracelsus Medical University, Salzburg, Austria, Salzburg, AT
| | - Antonio Artigas
- Department of Intensive Care Medicine, CIBER Enfermedades Respiratorias, Corporacion Sanitaria Universitaria Parc Tauli, Autonomous University of Barcelona, Sabadell, Spain, Sabadell, ES
| | - Bernardo Bollen Pinto
- Department of Acute Medicine, Geneva University Hospitals, Geneva, Switzerland, Geneva, CH
| | - Joerg C Schefold
- Department of Intensive Care Medicine, Inselspital, Universitätsspital, University of Bern, Bern, Switzerland, Bern, CH
| | - Georg Wolff
- University Hospital Duesseldorf, Moorenstraße 5, Duesseldorf, DE
| | - Malte Kelm
- University Hospital Duesseldorf, Moorenstraße 5, Duesseldorf, DE
| | - Michael Beil
- Department of Medical Intensive Care, Hadassah University Medical Center, Jerusalem, Israel, Jerusalem, IL
| | - Sigal Sviri
- Department of Medical Intensive Care, Hadassah University Medical Center, Jerusalem, Israel, Jerusalem, IL
| | - Peter Vernon van Heerden
- Dept. of Anesthesia, Intensive Care and Pain Medicine Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Israel, Jerusalem, IL
| | - Wojciech Szczeklik
- Center for Intensive Care and Perioperative Medicine, Jagiellonian University Medical College, Krakow, Poland, Krakow, PL
| | - Miroslaw Czuczwar
- 2nd Department of Anesthesiology and Intensive Care, Medical University of Lublin, Staszica 16, 20-081, Lublin, Poland, Lublin, PL
| | - Muhammed Elhadi
- Faculty of Medicine, University of Tripoli, Tripoli, Libya, Tripoli, LY
| | - Michael Joannidis
- Division of Intensive Care and Emergency Medicine, Department of Internal Medicine, Medical University Innsbruck, Innsbruck, Austria, Innsbruck, AT
| | - Sandra Oeyen
- Department of Intensive Care 1K12IC Ghent University Hospital, Ghent, Belgium, Ghent, BE
| | | | - Brian Marsh
- Mater Misericordiae University Hospital, Dublin, Ireland, Dublin, IE
| | - Finn H Andersen
- Dep. of Anaesthesia and Intensive Care, Ålesund Hospital, Ålesund, Norway. Dep. of Circulation and Medical Imaging, Norwegian University of Science and Technology, Trondheim, Norway, Alesund, NO
| | - Rui Moreno
- Unidade de Cuidados Intensivos Neurocríticos e Trauma. Hospital de São José, Centro Hospitalar Universitário de Lisboa Central, Faculdade de Ciências Médicas de Lisboa, Nova Médical School, Lisbon, Portugal, Lisbon, PT
| | - Maurizio Cecconi
- Department of Anaesthesia IRCCS Instituto Clínico Humanitas, Humanitas University, Milan, Italy, Milan, IT
| | - Susannah Leaver
- General Intensive Care, St George's University Hospitals NHS Foundation Trust, London, United Kingdom, London, GB
| | - Dylan W De Lange
- Department of Intensive Care Medicine, University Medical Center, University Utrecht, the Netherlands, Utrecht, NL
| | - Bertrand Guidet
- Sorbonne Universités, UPMC Univ Paris 06, INSERM, UMR_S 1136, Institut Pierre Louis d'Epidémiologie et de Santé Publique, Equipe: épidémiologie hospitalière qualité et organisation des soins, F-75012, Paris, France. Assistance Publique - Hôpitaux de Paris, Paris, FR
| | - Hans Flaatten
- Department of Clinical Medicine, University of Bergen, Department of Anaesthesia and Intensive Care, Haukeland University Hospital, Bergen, Norway, Bergen, NO
| | - Venet Osmani
- Fondazione Bruno Kessler Research Institute, Trento, Italy, Trento, IT
| |
|
29
|
Ahamed KU, Islam M, Uddin A, Akhter A, Paul BK, Yousuf MA, Uddin S, Quinn JM, Moni MA. A deep learning approach using effective preprocessing techniques to detect COVID-19 from chest CT-scan and X-ray images. Comput Biol Med 2021; 139:105014. [PMID: 34781234 PMCID: PMC8566098 DOI: 10.1016/j.compbiomed.2021.105014] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 10/13/2021] [Revised: 11/01/2021] [Accepted: 11/01/2021] [Indexed: 12/16/2022]
Abstract
Coronavirus disease-19 (COVID-19) is a severe respiratory viral disease first reported in late 2019 that has spread worldwide. Although some wealthy countries have made significant progress in detecting and containing this disease, most underdeveloped countries are still struggling to identify COVID-19 cases in large populations. With the rising number of COVID-19 cases, there are often insufficient COVID-19 diagnostic kits and related resources in such countries. However, other basic diagnostic resources often do exist, which motivated us to develop Deep Learning models to assist clinicians and radiologists to provide prompt diagnostic support to the patients. In this study, we have developed a deep learning-based COVID-19 case detection model trained with a dataset consisting of chest CT scans and X-ray images. A modified ResNet50V2 architecture was employed as deep learning architecture in the proposed model. The dataset utilized to train the model was collected from various publicly available sources and included four class labels: confirmed COVID-19, normal controls and confirmed viral and bacterial pneumonia cases. The aggregated dataset was preprocessed through a sharpening filter before feeding the dataset into the proposed model. This model attained an accuracy of 96.452% for four-class cases (COVID-19/Normal/Bacterial pneumonia/Viral pneumonia), 97.242% for three-class cases (COVID-19/Normal/Bacterial pneumonia) and 98.954% for two-class cases (COVID-19/Viral pneumonia) using chest X-ray images. The model acquired a comprehensive accuracy of 99.012% for three-class cases (COVID-19/Normal/Community-acquired pneumonia) and 99.99% for two-class cases (Normal/COVID-19) using CT-scan images of the chest. This high accuracy presents a new and potentially important resource to enable radiologists to identify and rapidly diagnose COVID-19 cases with only basic but widely available equipment.
Affiliation(s)
- Khabir Uddin Ahamed
- Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh
| | - Manowarul Islam
- Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh; Corresponding author
| | - Ashraf Uddin
- Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh
| | - Arnisha Akhter
- Department of Computer Science and Engineering, Jagannath University, Dhaka, Bangladesh
| | - Bikash Kumar Paul
- Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Bangladesh
| | - Mohammad Abu Yousuf
- Institute of Information Technology, Jahangirnagar University, Dhaka, Bangladesh
| | - Shahadat Uddin
- Complex Systems Research Group, Faculty of Engineering, The University of Sydney, Darlington, NSW, 2008, Australia
| | - Julian M.W. Quinn
- Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia
| | - Mohammad Ali Moni
- Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia; Artificial Intelligence & Digital Health Data Science, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD, 4072, Australia; Corresponding author
| |
|
30
|
Dairi A, Harrou F, Sun Y. Deep Generative Learning-Based 1-SVM Detectors for Unsupervised COVID-19 Infection Detection Using Blood Tests. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT 2021; 71:2500211. [PMID: 35582656 PMCID: PMC8962827 DOI: 10.1109/tim.2021.3130675] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/28/2021] [Revised: 10/03/2021] [Accepted: 11/08/2021] [Indexed: 05/02/2023]
Abstract
A sample blood test has recently become an important tool to help identify false-positive/false-negative real-time reverse transcription polymerase chain reaction (rRT-PCR) tests. This is mainly because it is an inexpensive and handy option to detect potential COVID-19 patients. The rRT-PCR test, however, requires certified laboratories, expensive equipment, and trained personnel, and 3-4 h are needed to deliver results. Furthermore, it has relatively large false-negative rates of around 15%-20%. Consequently, an alternative and more accessible solution, quicker and less costly, is needed. This article introduces flexible and unsupervised data-driven approaches to detect COVID-19 infection based on blood test samples. In other words, we address the problem of COVID-19 infection detection using a blood test as an anomaly detection problem through an unsupervised deep hybrid model. Essentially, we amalgamate the feature extraction capability of the variational autoencoder (VAE) and the detection sensitivity of the one-class support vector machine (1SVM) algorithm. Two sets of routine blood test samples from the Albert Einstein Hospital, São Paulo, Brazil, and the San Raffaele Hospital, Milan, Italy, are used to assess the performance of the investigated deep learning models. Here, missing values have been imputed based on a random forest regressor. Compared to generative adversarial network (GAN)-, deep belief network (DBN)-, and restricted Boltzmann machine (RBM)-based 1SVM, the traditional VAE, GAN, DBN, and RBM with a softmax layer as the discriminator layer, and the standalone 1SVM, the proposed VAE-based 1SVM detector offers superior discrimination performance of potential COVID-19 infections. Results also revealed that the deep learning-driven 1SVM detection approaches provide promising detection performance compared to the conventional deep learning models.
Affiliation(s)
- Abdelkader Dairi
- Université des Sciences et de la Technologie d’Oran Mohamed-Boudiaf (USTOMB), Oran 31000, Algeria
- Laboratoire des Technologies de l’Environnement (LTE), Ecole Nationale Polytechnique Oran, Oran 31000, Algeria
| | - Fouzi Harrou
- Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| | - Ying Sun
- Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| |
|
31
|
Chowdhury UN, Faruqe MO, Mehedy M, Ahmad S, Islam MB, Shoombuatong W, Azad A, Moni MA. Effects of Bacille Calmette Guerin (BCG) vaccination during COVID-19 infection. Comput Biol Med 2021; 138:104891. [PMID: 34624759 PMCID: PMC8479467 DOI: 10.1016/j.compbiomed.2021.104891] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/15/2021] [Revised: 09/21/2021] [Accepted: 09/21/2021] [Indexed: 12/16/2022]
Abstract
The coronavirus disease 2019 (COVID-19) is caused by infection with the highly contagious severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as the novel coronavirus. In most countries, the spread of this virus has not been contained, which is driving the pandemic towards a more difficult phase. In this study, we investigated the impact of the Bacille Calmette Guerin (BCG) vaccination on the severity and mortality of COVID-19 by performing transcriptomic analyses of SARS-CoV-2 infected and BCG vaccinated samples in peripheral blood mononuclear cells (PBMC). A set of common differentially expressed genes (DEGs) was identified and used as input for functional enrichment analyses, via Gene Ontology (GO)-based functional terms and pre-annotated molecular pathway databases, and for Protein-Protein Interaction (PPI) network analysis. We further analysed the regulatory elements, possible comorbidities and putative drug candidates for COVID-19 patients who have not been BCG-vaccinated. Differential expression analyses of both BCG-vaccinated and COVID-19 infected samples identified 62 shared DEGs indicating their discordant expression pattern in their respective conditions compared to control. Next, PPI analysis of those DEGs revealed 10 hub genes, namely ITGB2, CXCL8, CXCL1, CCR2, IFNG, CCL4, PTGS2, ADORA3, TLR5 and CD33. Functional enrichment analyses found significantly enriched pathways/GO terms including cytokine activities, lysosome, IL-17 signalling pathway, TNF-signalling pathways. Moreover, a set of identified TFs, miRNAs and potential drug molecules were further investigated to assess their biological involvement in COVID-19 and their therapeutic possibilities. Findings showed significant genetic interactions between BCG vaccination and SARS-CoV-2 infection, suggesting an interesting prospect of the BCG vaccine in relation to the COVID-19 pandemic. 
We hope it may potentially trigger further research on this critical phenomenon to combat COVID-19 spread.
Affiliation(s)
- Utpala Nanda Chowdhury
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Md Omar Faruqe
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Md Mehedy
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Shamim Ahmad
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - M. Babul Islam
- Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - A.K.M. Azad
- Faculty of Science, Engineering & Technology, Swinburne University of Technology Sydney, Australia
| | - Mohammad Ali Moni
- School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD 4072, Australia; Corresponding author
| |
|
32
|
Phuong J, Hyland SL, Mooney SJ, Long DR, Takeda K, Vavilala MS, O’Hara K. Sociodemographic and clinical features predictive of SARS-CoV-2 test positivity across healthcare visit-types. PLoS One 2021; 16:e0258339. [PMID: 34648552 PMCID: PMC8516280 DOI: 10.1371/journal.pone.0258339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2021] [Accepted: 09/25/2021] [Indexed: 12/15/2022] Open
Abstract
Background: Despite increased testing efforts and the deployment of vaccines, COVID-19 cases and the death toll continue to rise at record rates. Health systems routinely collect clinical and non-clinical information in electronic health records (EHR), yet little is known about how the minimal or intermediate spectra of EHR data can be leveraged to characterize patient SARS-CoV-2 pretest probability in support of interventional strategies. Methods and findings: We modeled patient pretest probability for SARS-CoV-2 test positivity and determined which features contributed to the prediction, relative to patients triaged in inpatient, outpatient, and telehealth/drive-up visit types. Data from the University of Washington (UW) Medicine Health System, excluding UW Medicine care providers and comprising patients predominantly residing in the Seattle Puget Sound area, were used to develop a gradient-boosting decision tree (GBDT) model. Patients were included if they had at least one visit prior to initial SARS-CoV-2 RT-PCR testing between January 1, 2020 and August 7, 2020. Model performance was assessed using area-under-the-receiver-operating-characteristic (AUROC) and area-under-the-precision-recall (AUPR) curves. Feature contributions were assessed using SHapley Additive exPlanations (SHAP) values. The generalized pretest probability model using all available features achieved high overall discriminative performance (AUROC, 0.82). Performance among inpatients (AUROC, 0.86) was higher than for telehealth/drive-up testing (AUROC, 0.81) or outpatient testing (AUROC, 0.76). The two-week test positivity rate in the patient's ZIP code was the most informative feature for predicting test positivity across visit types. Geographic and sociodemographic factors were more important predictors of SARS-CoV-2 positivity than individual clinical characteristics.
Conclusions: Recent geographic and sociodemographic factors, routinely collected in EHR though not routinely considered in clinical care, are the strongest predictors of the initial SARS-CoV-2 test result. These findings were consistent across visit types, informing our understanding of individual SARS-CoV-2 risk factors with implications for deployment of testing, outreach, and population-level prevention efforts.
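To make the two evaluation metrics named in this abstract concrete, the following minimal sketch computes AUROC (in its rank-based form) and average precision (a common estimate of AUPR) in plain Python. The labels and scores are made-up toy values, not data from the study, and the function names are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at each rank where a positive case is retrieved."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Toy example (assumed values): 1 = test positive, 0 = test negative.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(round(toy_auroc, 3), round(toy_ap, 3))
```

AUROC is insensitive to class balance, whereas average precision drops when positives are rare and poorly ranked, which is one reason both are often reported for low-prevalence outcomes such as test positivity.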
Affiliation(s)
- Jimmy Phuong
  - UW Medicine Research IT, University of Washington, Seattle, WA, United States of America
- Stephen J. Mooney
  - Department of Epidemiology, University of Washington, Seattle, WA, United States of America
- Dustin R. Long
  - Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA, United States of America
- Kenji Takeda
  - Microsoft Research Cambridge, Cambridge, United Kingdom
- Monica S. Vavilala
  - Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA, United States of America
  - Department of Pediatrics, University of Washington, Seattle, WA, United States of America
- Kenton O’Hara
  - Microsoft Research Cambridge, Cambridge, United Kingdom
|
33
|
Doyle R. Machine Learning-Based Prediction of COVID-19 Mortality With Limited Attributes to Expedite Patient Prognosis and Triage: Retrospective Observational Study. JMIRX MED 2021; 2:e29392. [PMID: 34843609 PMCID: PMC8601033 DOI: 10.2196/29392] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/05/2021] [Revised: 08/16/2021] [Accepted: 09/14/2021] [Indexed: 12/12/2022]
Abstract
Background: The onset and development of the COVID-19 pandemic have placed pressure on hospital resources and staff worldwide. The integration of more streamlined predictive modeling in prognosis and triage-related decision-making can partly ease this pressure. Objective: This study assesses the performance impact of dimensionality reduction on COVID-19 mortality prediction models, demonstrating the high impact of a limited number of features and thereby limiting the need for complex variable gathering before reaching meaningful risk labelling in clinical settings. Methods: Standard machine learning classifiers were employed to predict an outcome of either death or recovery using 25 patient-level variables, spanning symptoms, comorbidities, and demographic information, from a geographically diverse sample representing 17 countries. The effects of feature reduction on the data were tested by running classifiers on a high-quality data set of 212 patients with populated entries for all 25 available features. The full data set was compared to two reduced variations with 7 features and 1 feature, respectively, extracted using univariate mutual information and chi-square testing. Classifier performance on each data set was then assessed on the basis of accuracy, sensitivity, specificity, and receiver operating characteristic–derived area under the curve metrics to quantify benefit or loss from reduction. Results: The classifiers achieved strong mortality detection on the 212-patient sample, with the highest-performing model achieving specificity of 90.7% (95% CI 89.1%-92.3%) and sensitivity of 92.0% (95% CI 91.0%-92.9%). Dimensionality reduction provided strong benefits for performance.
The baseline accuracy of a random forest classifier increased from 89.2% (95% CI 88.0%-90.4%) to 92.5% (95% CI 91.9%-93.0%) when training on 7 chi-square–extracted features and to 90.8% (95% CI 89.8%-91.7%) when training on 7 mutual information–extracted features. Reduction impact on a separate logistic classifier was mixed; however, when present, losses were marginal compared to the extent of feature reduction, altogether showing that reduction either improves performance or can reduce the variable-sourcing burden at hospital admission with little performance loss. Extreme feature reduction to a single most salient feature, often age, demonstrated large standalone explanatory power, with the best-performing model achieving an accuracy of 81.6% (95% CI 81.1%-82.1%); this demonstrates the relatively marginal improvement that additional variables bring to the tested models. Conclusions: Predictive statistical models have promising performance in early prediction of death among patients with COVID-19. Strong dimensionality reduction was shown to further improve baseline performance on selected classifiers and only marginally reduce it in others, highlighting the importance of feature reduction in future model construction and the feasibility of deprioritizing large, hard-to-source, and nonessential feature sets in real-world settings.
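The chi-square feature extraction mentioned in the Methods can be illustrated with a small sketch. It ranks binary features by the Pearson chi-square statistic of their 2x2 contingency table with the outcome and keeps the top k. This is an assumption-laden illustration: the feature names and values are invented, the helper names are hypothetical, and the study's actual implementation may compute the statistic differently.

```python
def chi_square_2x2(feature, outcome):
    """Pearson chi-square statistic for two binary (0/1) sequences of equal length."""
    n = len(feature)
    observed = [[0, 0], [0, 0]]  # observed[feature value][outcome value]
    for f, y in zip(feature, outcome):
        observed[f][y] += 1
    row_totals = [sum(observed[0]), sum(observed[1])]
    col_totals = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty margins to avoid division by zero
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def top_k_features(columns, outcome, k):
    """Return the k feature names with the largest chi-square statistic."""
    scores = {name: chi_square_2x2(values, outcome) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (assumed): outcome 1 = death, 0 = recovery.
toy_outcome = [1, 1, 1, 1, 0, 0, 0, 0]
toy_columns = {
    "age_over_65": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly aligned with outcome
    "fever":       [1, 1, 1, 0, 1, 0, 0, 0],  # partially associated
    "cough":       [1, 0, 1, 0, 1, 0, 1, 0],  # independent of outcome
}
toy_selected = top_k_features(toy_columns, toy_outcome, k=2)
print(toy_selected)
```

A univariate filter like this scores each feature in isolation, so it is cheap but cannot detect features that are informative only in combination.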
|
34
|
The prediction and analysis of COVID-19 epidemic trend by combining LSTM and Markov method. Sci Rep 2021; 11:17421. [PMID: 34465820 PMCID: PMC8408143 DOI: 10.1038/s41598-021-97037-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/16/2021] [Accepted: 08/17/2021] [Indexed: 12/20/2022]
Abstract
Coronavirus disease 2019 (COVID-19) has spread rapidly to countries around the world since the end of 2019, with a major impact on global health and on many countries. Since there is still no effective treatment, effective predictions are essential so that relevant departments can make responses and arrangements in advance. With limited data, the prediction error of an LSTM model increases over time, and the model is prone to large bias in medium- and long-term prediction. To overcome this problem, our study proposed an LSTM-Markov model, which uses a Markov model to reduce the prediction error of the LSTM model. Based on confirmed case data in the US, Britain, Brazil and Russia, we calculated the training errors of the LSTM and constructed the transition probability matrix of the Markov model from those errors. Finally, the prediction results were obtained by combining the output of the LSTM model with the prediction errors of the Markov model. The results show that, compared with the prediction results of the classical LSTM model, the average prediction error of LSTM-Markov is reduced by more than 75%, the RMSE is reduced by more than 60%, and the mean [Formula: see text] of LSTM-Markov is over 0.96. All these indicators demonstrate that the proposed LSTM-Markov model predicts COVID-19 more accurately than the LSTM model alone.
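The error-correction idea described above can be sketched as follows. The abstract does not specify how errors are binned or combined with the LSTM output, so everything here is an assumption for illustration: errors are additive residuals, states are fixed intervals, the correction is the expected midpoint of the next state, and the "model" predictions are hard-coded stand-ins rather than real LSTM output.

```python
def discretize(errors, edges):
    """Map each error to the index of the interval [edges[i], edges[i+1]) containing it.

    Values outside the outer edges are clamped to the first or last state.
    """
    states = []
    for e in errors:
        idx = 0
        while idx < len(edges) - 2 and e >= edges[idx + 1]:
            idx += 1
        states.append(idx)
    return states


def build_transition_matrix(states, n_states):
    """Estimate P[i][j] = probability of moving from state i to state j."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            matrix.append([1.0 / n_states] * n_states)  # unseen state: uniform fallback
        else:
            matrix.append([c / total for c in row])
    return matrix


def corrected_forecast(raw_forecast, last_state, matrix, midpoints):
    """Add the expected error of the next state to the raw model forecast."""
    expected_error = sum(p * m for p, m in zip(matrix[last_state], midpoints))
    return raw_forecast + expected_error


# Toy series (assumed values): observed cases and stand-in model predictions.
toy_actual = [100, 110, 125, 135, 150, 160]
toy_model = [97, 111, 120, 136, 145, 161]
toy_errors = [a - m for a, m in zip(toy_actual, toy_model)]
toy_edges = [-6, 0, 6]        # two states: under-run [-6, 0) and over-run [0, 6)
toy_midpoints = [-3.0, 3.0]   # representative error for each state
toy_states = discretize(toy_errors, toy_edges)
toy_matrix = build_transition_matrix(toy_states, n_states=2)
toy_corrected = corrected_forecast(170.0, toy_states[-1], toy_matrix, toy_midpoints)
print(toy_states, toy_matrix, toy_corrected)
```

In this toy series the errors alternate in sign, so the estimated matrix predicts a positive error after a negative one and shifts the raw forecast upward accordingly.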
|
35
|
Aktar S, Talukder A, Ahamad MM, Kamal AHM, Khan JR, Protikuzzaman M, Hossain N, Azad AKM, Quinn JMW, Summers MA, Liaw T, Eapen V, Moni MA. Machine Learning Approaches to Identify Patient Comorbidities and Symptoms That Increased Risk of Mortality in COVID-19. Diagnostics (Basel) 2021; 11:1383. [PMID: 34441317 PMCID: PMC8393412 DOI: 10.3390/diagnostics11081383] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/31/2021] [Revised: 07/12/2021] [Accepted: 07/29/2021] [Indexed: 02/06/2023]
Abstract
Providing appropriate care for people suffering from COVID-19, the disease caused by the pandemic SARS-CoV-2 virus, is a significant global challenge. Many individuals who become infected may have pre-existing conditions that may interact with COVID-19 to increase symptom severity and mortality risk. COVID-19 patient comorbidities are likely to be informative regarding the individual risk of severe illness and mortality. Determining the degree to which comorbidities are associated with severe symptoms and mortality would thus greatly assist in COVID-19 care planning and provision. To assess this, we performed a meta-analysis of published global literature and a machine learning predictive analysis using an aggregated COVID-19 global dataset. Our meta-analysis identified chronic obstructive pulmonary disease (COPD), cerebrovascular disease (CEVD), cardiovascular disease (CVD), type 2 diabetes, malignancy, and hypertension as most significantly associated with COVID-19 severity in the current published literature. Machine learning classification using novel aggregated cohort data similarly found COPD, CVD, chronic kidney disease (CKD), type 2 diabetes, malignancy, and hypertension, as well as asthma, to be the most significant features for classifying those deceased versus those who survived COVID-19. While age and gender were the most significant predictors of mortality, among symptom-comorbidity combinations, Pneumonia-Hypertension, Pneumonia-Diabetes, and Acute Respiratory Distress Syndrome (ARDS)-Hypertension showed the most significant associations with COVID-19 mortality. These results highlight the patient cohorts most likely to be at risk of COVID-19-related severe morbidity and mortality, which has implications for prioritization of hospital resources.
Affiliation(s)
- Sakifa Aktar
  - Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Ashis Talukder
  - Statistics Discipline, Khulna University, Khulna 9208, Bangladesh
- Md. Martuza Ahamad
  - Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- A. H. M. Kamal
  - Department of Computer Science and Engineering, Jatiya Kabi Kazi Nazrul Islam University, Trishal, Mymensingh 2220, Bangladesh
- Jahidur Rahman Khan
  - Health Research Institute, University of Canberra, Canberra, ACT 2617, Australia
- Md. Protikuzzaman
  - Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
- Nasif Hossain
  - School of Tropical Medicine and Global Health, Nagasaki University, Nagasaki 852-8523, Japan
- A. K. M. Azad
  - Faculty of Science, Engineering & Technology, Swinburne University of Technology Sydney, Sydney, VIC 2150, Australia
- Julian M. W. Quinn
  - The Garvan Institute of Medical Research, Healthy Ageing Theme, Darlinghurst, NSW 2010, Australia
- Mathew A. Summers
  - The Garvan Institute of Medical Research, Healthy Ageing Theme, Darlinghurst, NSW 2010, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, University of New South Wales, Sydney, NSW 2010, Australia
- Teng Liaw
  - School of Health & Rehabilitation Sciences, The University of Queensland, Brisbane, QLD 4072, Australia
- Valsamma Eapen
  - World Health Organization (WHO) Centre on eHealth, School of Public Health and Community Medicine, Faculty of Medicine, University of New South Wales, Sydney, NSW 2052, Australia
- Mohammad Ali Moni
  - The Garvan Institute of Medical Research, Healthy Ageing Theme, Darlinghurst, NSW 2010, Australia
  - School of Health & Rehabilitation Sciences, The University of Queensland, Brisbane, QLD 4072, Australia
  - World Health Organization (WHO) Centre on eHealth, School of Public Health and Community Medicine, Faculty of Medicine, University of New South Wales, Sydney, NSW 2052, Australia
  - School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, NSW 2052, Australia
|
36
|
Short-Term Prediction of COVID-19 Cases Using Machine Learning Models. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11094266] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 12/12/2022]
Abstract
The first case in Bangladesh of the novel coronavirus disease (COVID-19) was reported on 8 March 2020, with the number of confirmed cases rapidly rising to over 175,000 by July 2020. In the absence of effective treatment, an essential tool of health policy is the modeling and forecasting of the progress of the pandemic. We therefore developed a cloud-based machine learning short-term forecasting model for Bangladesh, in which several regression-based machine learning models were applied to infected case data to estimate the number of COVID-19-infected people over the following seven days. This approach can accurately forecast the daily number of infected cases by training on sample data from the prior 25 days recorded on our web application. The outcomes of these efforts could aid the development and assessment of prevention strategies and identify factors that most affect the spread of COVID-19 infection in Bangladesh.
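The windowed forecasting setup described above (train on the prior 25 days, predict the next seven) can be sketched with the simplest possible regression model. The abstract does not name the specific regression models used, so ordinary least-squares on a linear trend is a stand-in chosen for illustration; the case counts are synthetic and the function names are hypothetical.

```python
def fit_line(ys):
    """Ordinary least-squares fit of y = intercept + slope * x for x = 0, 1, ..., n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(history, window=25, horizon=7):
    """Fit on the last `window` observations and extrapolate `horizon` days ahead."""
    recent = history[-window:]
    slope, intercept = fit_line(recent)
    start = len(recent)  # first future day, in the window's own x coordinates
    return [intercept + slope * (start + h) for h in range(horizon)]


# Synthetic cumulative case counts (assumed): 30 days growing by 50 cases per day.
toy_cases = [1000 + 50 * day for day in range(30)]
toy_forecast = forecast(toy_cases)
print([round(v) for v in toy_forecast])
```

Refitting on a short sliding window lets the trend estimate track recent changes in growth, at the cost of higher variance when daily counts are noisy.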
|