1
Kaur P, Singh A, Chana I. OmicPredict: a framework for omics data prediction using ANOVA-Firefly algorithm for feature selection. Comput Methods Biomech Biomed Engin 2024; 27:1970-1983. [PMID: 37842810 DOI: 10.1080/10255842.2023.2268236] [Received: 11/30/2022] [Revised: 09/12/2023] [Accepted: 09/30/2023] [Indexed: 10/17/2023]
Abstract
High-throughput technologies and machine learning (ML), when applied to large pools of medical data such as omics data, enable efficient analysis. Recent research aims to develop and apply ML models that predict disease in good time from available omics datasets. The present work proposes a framework, 'OmicPredict', that deploys a hybrid feature selection method and a deep neural network (DNN) model to predict multiple diseases from omics data. The hybrid feature selection method combines the Analysis of Variance (ANOVA) technique with the firefly algorithm. The OmicPredict framework is applied to three case studies: Alzheimer's disease, breast cancer, and coronavirus disease 2019 (COVID-19). In the Alzheimer's disease case study, the framework predicts patients using the GSE33000 and GSE44770 datasets. In the breast cancer case study, the framework predicts human epidermal growth factor receptor 2 (HER2) subtype status using the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset. In the COVID-19 case study, the framework classifies patients using the GSE157103 dataset. The experimental results show that the DNN model achieved an Area Under Curve (AUC) score of 0.949 on the Alzheimer's (GSE33000 and GSE44770) datasets. It also achieved AUC scores of 0.987 and 0.989 on the breast cancer (METABRIC) and COVID-19 (GSE157103) datasets, respectively, outperforming Random Forest and Naïve Bayes models as well as existing research.
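The ANOVA stage of a hybrid selector like the one this abstract describes can be sketched as a filter that ranks features by their one-way ANOVA F statistic. This is a minimal illustration and not the authors' implementation: the function names `anova_f_score` and `top_features`, the toy matrix, and the cutoff `m` are assumptions, and the firefly search that would refine the filtered subset is omitted.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_features(X, labels, m):
    """Indices of the m features with the highest F score (hypothetical helper)."""
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:m]


# Toy data (made up): feature 0 separates the classes, features 1 and 2 do not.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.0],
    [0.9, 4.0, 2.0],
    [3.0, 4.5, 2.0],
    [3.1, 3.5, 2.0],
    [2.9, 4.0, 2.0],
]
labels = [0, 0, 0, 1, 1, 1]
print(top_features(X, labels, 1))
```

In a hybrid scheme, a filter of this kind would typically shrink the feature pool before a costlier search runs on the remainder; how OmicPredict couples the two stages is not stated in the abstract.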
Affiliation(s)
- Parampreet Kaur
- Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
- Ashima Singh
- Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
- Inderveer Chana
- Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
2
Morís DI, de Moura J, Marcos PJ, Míguez Rey E, Novo J, Ortega M. Efficient clinical decision-making process via AI-based multimodal data fusion: A COVID-19 case study. Heliyon 2024; 10:e38642. [PMID: 39640748 PMCID: PMC11619951 DOI: 10.1016/j.heliyon.2024.e38642] [Received: 07/30/2024] [Accepted: 09/26/2024] [Indexed: 12/07/2024]
Abstract
COVID-19 is an infectious disease that caused a global pandemic in 2020. In the critical moments of such healthcare emergencies, medical staff need to make important decisions in a context of limited resources that must be carefully managed. To this end, computer-aided diagnosis methods are extremely powerful and help staff better recognize the evidence of high-risk patients. This can be done with the support of relevant information extracted from electronic health records, lab tests, and imaging studies. In this work, we present a novel, fully automatic, and efficient method to support the clinical decision-making process in the context of COVID-19 risk estimation, using multimodal data fusion of clinical features and deep features extracted from chest X-ray images. Risk estimation is studied in two of the most relevant and critical scenarios encountered: the risk of hospitalization and the risk of mortality. This study shows which features are most important for each scenario, the ratio of clinical to imaging features in the top ranking, and the performance of the machine learning models used. The results demonstrate strong performance by the classifiers, which estimate the risk of hospitalization with an AUC-ROC of 0.8452 ± 0.0133 and the risk of death with an AUC-ROC of 0.8285 ± 0.0210 using only a subset of the original features. They also highlight the significant contribution of imaging features to hospitalization risk assessment, while clinical features become more crucial for mortality risk evaluation. Furthermore, multimodal data fusion can outperform approaches that use a single data source. Despite the model's complexity, it requires fewer features, an advantage in scenarios with limited computational resources. This streamlined, fully automated method shows promising potential to improve the clinical decision-making process and better manage medical resources, not only in the context of COVID-19 but also in other clinical scenarios.
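The reported metrics can be made concrete with a small sketch of how an AUC-ROC value and a "mean ± spread" summary are computed, alongside the simplest form of multimodal fusion. Everything here is an assumption for illustration: the abstract does not say that fusion is done by concatenation, nor that the ± values are standard deviations over repeated runs, and the names `auc_roc`, `fuse`, and `summarize` are hypothetical.

```python
from statistics import mean, stdev


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a random positive case outscores
    a random negative case, counting ties as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fuse(clinical, deep):
    """Early fusion (assumed): concatenate each patient's clinical vector
    with the feature vector extracted from their image."""
    return [c + d for c, d in zip(clinical, deep)]


def summarize(aucs):
    """Mean and sample standard deviation of per-run AUC values."""
    return mean(aucs), stdev(aucs)


# Toy example with made-up labels and scores.
print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
print(fuse([[63.0, 1.0]], [[0.12, 0.88, 0.05]]))
```

The pairwise definition is quadratic in the number of samples, which is fine for a sketch; a rank-based computation is the usual choice for large cohorts.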
Affiliation(s)
- Daniel I. Morís
- Varpa Group, Biomedical Research Institute A Coruña (INIBIC), University of A Coruña, 15006, A Coruña, Spain
- Department of Computer Science and Information Technologies, University of A Coruña, 15071, A Coruña, Spain
- Joaquim de Moura
- Varpa Group, Biomedical Research Institute A Coruña (INIBIC), University of A Coruña, 15006, A Coruña, Spain
- Department of Computer Science and Information Technologies, University of A Coruña, 15071, A Coruña, Spain
- Pedro J. Marcos
- Dirección Asistencial y Servicio de Neumología, Complejo Hospitalario Universitario de A Coruña (CHUAC), Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, Sergas, 15006 A Coruña, Spain
- Enrique Míguez Rey
- Grupo de Investigación en Virología Clínica, Sección de Enfermedades Infecciosas, Servicio de Medicina Interna, Instituto de Investigación Biomédica de A Coruña (INIBIC), Área Sanitaria A Coruña y CEE (ASCC), SERGAS, 15006 A Coruña, Spain
- Jorge Novo
- Varpa Group, Biomedical Research Institute A Coruña (INIBIC), University of A Coruña, 15006, A Coruña, Spain
- Department of Computer Science and Information Technologies, University of A Coruña, 15071, A Coruña, Spain
- Marcos Ortega
- Varpa Group, Biomedical Research Institute A Coruña (INIBIC), University of A Coruña, 15006, A Coruña, Spain
- Department of Computer Science and Information Technologies, University of A Coruña, 15071, A Coruña, Spain
3
Singh B, Jevnikar AM, Desjardins E. Artificial Intelligence, Big Data, and Regulation of Immunity: Challenges and Opportunities. Arch Immunol Ther Exp (Warsz) 2024; 72:aite-2024-0006. [PMID: 38421272 DOI: 10.2478/aite-2024-0006] [Received: 12/14/2023] [Accepted: 01/30/2024] [Indexed: 03/02/2024]
Abstract
The immune system is regulated by a complex set of genetic, molecular, and cellular interactions. Rapid advances in the study of immunity and its network of interactions have been boosted by a spectrum of "omics" technologies, which have generated amounts of data large enough to reach the status of big data (BD). With recent developments in artificial intelligence (AI), theoretical and clinical breakthroughs could emerge. Analyses of large data sets with AI tools will allow the formulation of new testable hypotheses, open new research avenues, and provide innovative strategies for regulating immunity and treating immunological diseases. This includes diagnosis and identification of rare diseases, prevention and treatment of autoimmune diseases, allergic disorders, infectious diseases, metabolomic disorders, cancer, and organ transplantation. However, ethical and regulatory challenges remain as to how these studies will be used to advance our understanding of basic immunology and how immunity might be regulated in health and disease. This will be particularly important for entities in which the complexity of simultaneous interactions and multiple cellular pathways has eluded conventional approaches to understanding and treatment. The analyses of BD by AI are likely to be complicated, as both positive and negative outcomes of regulating immunity may have important ethical ramifications that need to be considered. We suggest there is an immediate need to develop guidelines as to how the analyses of immunological BD by AI tools should guide immune-based interventions to treat various diseases, prevent infections, and maintain health within an ethical framework.
Affiliation(s)
- Bhagirath Singh
- Department of Microbiology and Immunology, University of Western Ontario, London, ON, Canada
- Robarts Research Institute, University of Western Ontario, London, ON, Canada
- Rotman Institute of Philosophy, University of Western Ontario, London, ON, Canada
- Anthony M Jevnikar
- Department of Microbiology and Immunology, University of Western Ontario, London, ON, Canada
- Department of Medicine, University of Western Ontario, London, ON, Canada
- Eric Desjardins
- Rotman Institute of Philosophy, University of Western Ontario, London, ON, Canada
- Department of Philosophy, University of Western Ontario, London, ON, Canada
4
Potamias G, Gkoublia P, Kanterakis A. The two-stage molecular scenery of SARS-CoV-2 infection with implications to disease severity: An in-silico quest. Front Immunol 2023; 14:1251067. [PMID: 38077337 PMCID: PMC10699200 DOI: 10.3389/fimmu.2023.1251067] [Received: 06/30/2023] [Accepted: 10/30/2023] [Indexed: 12/18/2023]
Abstract
Introduction: The two-stage molecular profile of the progression of SARS-CoV-2 (SCOV2) infection is explored in terms of five key biological/clinical questions: (a) does SCOV2 exhibit a two-stage infection profile? (b) do SARS-CoV-1 (SCOV1) and SCOV2 differ? (c) does SCOV2 differ from influenza (INFL) infection, and if so, how? (d) does low viral load, and (e) does the early COVID-19 host response, relate to the two-stage SCOV2 infection profile? We provide positive answers to these questions by analyzing the time-series gene-expression profiles of preserved cell lines infected with SCOV1/2, or the gene-expression profiles of infected individuals with different viral-load levels and different host-response phenotypes.
Methods: Our analytical methodology follows an in-silico quest organized around an elaborate multi-step analysis pipeline that includes: (a) utilization of fifteen gene-expression datasets from NCBI's Gene Expression Omnibus (GEO) repository; (b) thorough designation of SCOV1/2 and INFL progression stages and COVID-19 phenotypes; (c) identification of differentially expressed genes (DEGs) and enriched biological processes and pathways that contrast and differentiate between different infection stages and phenotypes; (d) employment of a graph-based clustering process for the induction of coherent groups of networked genes as the representative core molecular fingerprints that characterize the different SCOV2 progression stages and the different COVID-19 phenotypes. In addition, relying on a sensibly selected set of induced fingerprint genes and following a machine learning approach, we devised and assessed the performance of different classifier models for the differentiation of acute respiratory illness (ARI) caused by SCOV2 or other infections (diagnostic classifiers), as well as for the prediction of COVID-19 disease severity (prognostic classifiers), with quite encouraging results.
Results: The central finding of our experiments is the down-regulation of type-I interferon genes (IFN-1), interferon-induced genes (ISGs), and fundamental innate immune and defense biological processes and molecular pathways during the early SCOV2 infection stages, with the inverse holding during the later ones. It is highlighted that upregulation of these genes and pathways early after infection may prove beneficial in preventing subsequent uncontrolled hyperinflammatory and potentially lethal events.
Discussion: The basic aim of our study was to utilize, in an intuitive, efficient, and productive way, the most relevant and state-of-the-art bioinformatics methods to reveal the core molecular mechanisms that govern the progression of SCOV2 infection and the different COVID-19 phenotypes.
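Step (c) of the pipeline, identification of differentially expressed genes, is commonly done by combining an effect size with a test statistic. The sketch below uses a log2 fold change and Welch's t statistic as stand-ins; the abstract does not name the authors' DEG method, so the functions `deg_stats` and `call_degs`, the pseudocount, the thresholds, and the gene labels are all assumptions made for illustration.

```python
from math import log2, sqrt
from statistics import mean, variance


def deg_stats(case, control, pseudo=1.0):
    """Log2 fold change (with a pseudocount) and Welch's t statistic
    for one gene; each group needs at least two samples."""
    lfc = log2((mean(case) + pseudo) / (mean(control) + pseudo))
    se = sqrt(variance(case) / len(case) + variance(control) / len(control))
    t = (mean(case) - mean(control)) / se if se > 0 else 0.0
    return lfc, t


def call_degs(expr, min_abs_lfc=1.0, min_abs_t=4.0):
    """Genes whose fold change and t statistic both pass hypothetical
    thresholds. `expr` maps gene -> (case values, control values)."""
    hits = []
    for gene, (case, control) in expr.items():
        lfc, t = deg_stats(case, control)
        if abs(lfc) >= min_abs_lfc and abs(t) >= min_abs_t:
            hits.append(gene)
    return hits


# Made-up expression values for two placeholder genes.
expr = {
    "gene_a": ([15.0, 14.0, 16.0], [3.0, 4.0, 2.0]),
    "gene_b": ([10.0, 11.0, 9.0], [10.0, 9.0, 11.0]),
}
print(call_degs(expr))
```

A real analysis would convert the statistic to a p-value and correct for multiple testing across all genes; both steps are left out here to keep the sketch short.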
Affiliation(s)
- George Potamias
- Computational Biomedicine Laboratory (CBML), Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece
- Polymnia Gkoublia
- Computational Biomedicine Laboratory (CBML), Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece
- Graduate Bioinformatics Program, School of Medicine, University of Crete, Heraklion, Greece
- Alexandros Kanterakis
- Computational Biomedicine Laboratory (CBML), Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece
5
Cao L. AI and data science for smart emergency, crisis and disaster resilience. International Journal of Data Science and Analytics 2023; 15:231-246. [PMID: 37035277 PMCID: PMC10041487 DOI: 10.1007/s41060-023-00393-w] [Accepted: 03/09/2023] [Indexed: 04/07/2023]
Abstract
The uncertain world has seen increasing emergencies, crises and disasters (ECDs), such as the COVID-19 pandemic, Hurricane Ian, global financial inflation and recession, misinformation disasters, and cyberattacks. AI for smart disaster resilience (AISDR) transforms classic reactive and scripted disaster management into digital, proactive, and intelligent resilience across ECD ecosystems. This article presents a systematic overview of diverse ECDs, classic ECD management, and ECD data complexities, together with an AISDR research landscape. Translational disaster AI is essential to enable smart disaster resilience.
Affiliation(s)
- Longbing Cao
- School of Computer Science, University of Technology Sydney, Sydney, NSW 2007 Australia
6
Jeyananthan P. SARS-CoV-2 Diagnosis Using Transcriptome Data: A Machine Learning Approach. SN Computer Science 2023; 4:218. [PMID: 36844504 PMCID: PMC9936926 DOI: 10.1007/s42979-023-01703-6] [Received: 04/19/2022] [Accepted: 01/24/2023] [Indexed: 05/02/2023]
Abstract
The SARS-CoV-2 pandemic is currently a major issue worldwide. The health community is struggling to protect the public and countries from its spread, which resurges from time to time in successive waves. Even vaccination does not seem to prevent this spread. Accurate and timely identification of infected people is essential to control the spread. So far, polymerase chain reaction (PCR) and rapid antigen tests have been widely used for this identification, despite their drawbacks. False-negative cases are a particular threat in this scenario. To avoid these problems, this study uses machine learning techniques to build a classification model with higher accuracy to separate COVID-19 cases from non-COVID individuals. Transcriptome data of SARS-CoV-2 patients and controls are used in this stratification, with three different feature selection algorithms and seven classification models. Differentially expressed genes (DEGs) between these two groups are also studied and used in this classification. The results show that mutual information (or DEGs) combined with naïve Bayes (or SVM) gives the best accuracy (0.98 ± 0.04) among these methods. Supplementary Information: The online version contains supplementary material available at 10.1007/s42979-023-01703-6.
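Mutual information, one of the feature selection criteria named in this abstract, scores a gene by how much knowing its expression level reduces uncertainty about the class label. The sketch below is a hedged illustration rather than the study's code: it assumes expression is binarized at the median before scoring, and the helpers `binarize`, `mutual_information`, and `rank_genes` as well as the toy values are hypothetical. The classifier stage (naïve Bayes or SVM) is not shown.

```python
from collections import Counter
from math import log2
from statistics import median


def binarize(values):
    """Discretize a continuous feature: 1 if above the median, else 0."""
    med = median(values)
    return [1 if v > med else 0 for v in values]


def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences:
    sum over (x, y) of p(x, y) * log2(p(x, y) / (p(x) * p(y)))."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def rank_genes(expr, labels):
    """Gene names sorted by decreasing mutual information with the labels.
    `expr` maps gene -> list of expression values, one per sample."""
    scores = {g: mutual_information(binarize(v), labels) for g, v in expr.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up values: gene_a tracks the label, gene_b does not.
labels = [0, 0, 1, 1]
expr = {"gene_a": [1.0, 2.0, 3.0, 4.0], "gene_b": [1.0, 4.0, 2.0, 3.0]}
print(rank_genes(expr, labels))
```

Median binarization is the crudest possible discretization; estimators for continuous features exist and would usually be preferred on real transcriptome data.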