1
Narayanan R, DeGroat W, Mendhe D, Abdelhalim H, Ahmed Z. IntelliGenes: Interactive and user-friendly multimodal AI/ML application for biomarker discovery and predictive medicine. Biol Methods Protoc 2024; 9:bpae040. [PMID: 38884000 PMCID: PMC11176709 DOI: 10.1093/biomethods/bpae040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 05/19/2024] [Accepted: 05/28/2024] [Indexed: 06/18/2024] Open
Abstract
Artificial intelligence (AI) and machine learning (ML) have advanced in several areas of life; however, their progress in the field of multi-omics has not matched the levels attained elsewhere. Challenges include, but are not limited to, the handling and analysis of high volumes of complex multi-omics data and the expertise needed to implement and execute AI/ML approaches. In this article, we present IntelliGenes, an interactive, customizable, cross-platform, and user-friendly AI/ML application for multi-omics data exploration to discover novel biomarkers and predict rare, common, and complex diseases. The implemented methodology is based on a nexus of conventional statistical techniques and cutting-edge ML algorithms, which outperforms single algorithms and results in enhanced accuracy. The interactive and cross-platform graphical user interface of IntelliGenes is divided into three main sections: (i) Data Manager, (ii) AI/ML Analysis, and (iii) Visualization. Data Manager supports the user in loading and customizing the input data and list of existing biomarkers. AI/ML Analysis allows the user to apply default combinations of statistical and ML algorithms, as well as customize and create new AI/ML pipelines. Visualization provides options to interpret a diverse set of produced results, including performance metrics, disease predictions, and various charts. The performance of IntelliGenes has been successfully tested in various in-house and peer-reviewed studies, where it correctly classified individuals as patients and predicted disease with high accuracy. It stands apart primarily in its ease of use for nontechnical users and its emphasis on generating interpretable visualizations. We have designed and implemented IntelliGenes so that a user with or without a computational background can apply AI/ML approaches to discover novel biomarkers and predict diseases.
Affiliation(s)
- Rishabh Narayanan
  - Rutgers Institute for Health, Health Care Policy and Aging Research, The State University of New Jersey, New Brunswick, 08901, NJ, United States
- William DeGroat
  - Rutgers Institute for Health, Health Care Policy and Aging Research, The State University of New Jersey, New Brunswick, 08901, NJ, United States
- Dinesh Mendhe
  - Rutgers Institute for Health, Health Care Policy and Aging Research, The State University of New Jersey, New Brunswick, 08901, NJ, United States
- Habiba Abdelhalim
  - Rutgers Institute for Health, Health Care Policy and Aging Research, The State University of New Jersey, New Brunswick, 08901, NJ, United States
- Zeeshan Ahmed
  - Rutgers Institute for Health, Health Care Policy and Aging Research, The State University of New Jersey, New Brunswick, 08901, NJ, United States
  - Department of Medicine, Division of Cardiovascular Disease and Hypertension, Robert Wood Johnson Medical School, New Brunswick, NJ, 08901, United States

2
Lo Barco T, Garcelon N, Neuraz A, Nabbout R. Natural history of rare diseases using natural language processing of narrative unstructured electronic health records: The example of Dravet syndrome. Epilepsia 2024; 65:350-361. [PMID: 38065926 DOI: 10.1111/epi.17855] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/04/2023] [Revised: 12/07/2023] [Accepted: 12/07/2023] [Indexed: 12/31/2023]
Abstract
OBJECTIVE The increasing implementation of electronic health records allows the use of advanced text-mining methods for establishing new patient phenotypes and stratification, and for revealing outcome correlations. In this study, we aimed to explore the electronic narrative clinical reports of a cohort of patients with Dravet syndrome (DS) followed longitudinally at our center, to assess the capacity of this methodology to retrace the natural history of DS during the early years.
METHODS We used a document-based clinical data warehouse employing natural language processing to recognize the phenotype concepts in the narrative medical reports. We included patients with DS who had a medical report produced before the age of 2 years and a follow-up after the age of 3 years ("DS cohort," 56 individuals). We selected two control populations, a "general control cohort" (275 individuals) and a "neurological control cohort" (281 individuals), with similar characteristics in terms of gender, number of reports, and age at last report. To find concepts specifically associated with DS, we performed a phenome-wide association study using Cox regression, comparing the reports of the three cohorts. We then performed a qualitative analysis of the surviving concepts based on their median age at first appearance.
RESULTS A total of 76 concepts were prevalent in the reports of children with DS. Concepts appearing during the first 2 years were mostly related to the epilepsy features at the onset of DS (convulsive and prolonged seizures triggered by fever, often requiring in-hospital care). Subsequently, concepts related to new types of seizures and to drug resistance appeared. A series of non-seizure-related concepts emerged after the age of 2-3 years, referring to the nonseizure comorbidities classically associated with DS.
SIGNIFICANCE The extraction of clinical terms from narrative reports of children with DS allows outlining the known natural history of this rare disease in early childhood. This original model of "longitudinal phenotyping" could be applied to other rare and very rare conditions with poor natural history description.
Affiliation(s)
- Tommaso Lo Barco
  - Department of Pediatric Neurology, Necker-Enfants Malades Hospital, Assistance Publique-Hôpitaux de Paris, Reference Center for Rare Epilepsies, Member of European Reference Network EpiCARE, Université Paris Cité, Paris, France
- Nicolas Garcelon
  - Data Science Platform, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France
- Antoine Neuraz
  - Data Science Platform, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France
- Rima Nabbout
  - Department of Pediatric Neurology, Necker-Enfants Malades Hospital, Assistance Publique-Hôpitaux de Paris, Reference Center for Rare Epilepsies, Member of European Reference Network EpiCARE, Université Paris Cité, Paris, France
  - Translational Research for Neurological Disorders, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France

3
DeGroat W, Mendhe D, Bhusari A, Abdelhalim H, Zeeshan S, Ahmed Z. IntelliGenes: a novel machine learning pipeline for biomarker discovery and predictive analysis using multi-genomic profiles. Bioinformatics 2023; 39:btad755. [PMID: 38096588 PMCID: PMC10739559 DOI: 10.1093/bioinformatics/btad755] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/28/2023] [Revised: 12/07/2023] [Accepted: 12/13/2023] [Indexed: 12/23/2023] Open
Abstract
SUMMARY In this article, we present IntelliGenes, a novel machine learning (ML) pipeline for multi-genomics exploration to discover biomarkers significant in disease prediction with high accuracy. IntelliGenes is based on a novel approach, which consists of a nexus of conventional statistical techniques and cutting-edge ML algorithms using multi-genomic, clinical, and demographic data. IntelliGenes introduces a new metric, the Intelligent Gene (I-Gene) score, to measure the importance of individual biomarkers for prediction of complex traits. I-Gene scores can be used to generate I-Gene profiles of individuals to comprehend the intricacies of ML used in disease prediction. IntelliGenes is a user-friendly, portable, cross-platform application, compatible with Microsoft Windows, macOS, and UNIX operating systems. IntelliGenes not only holds the potential for personalized early detection of common and rare diseases in individuals, but also opens avenues for broader research using novel ML methodologies, ultimately leading to personalized interventions and novel treatment targets.
AVAILABILITY AND IMPLEMENTATION The source code of IntelliGenes is available on GitHub (https://github.com/drzeeshanahmed/intelligenes) and Code Ocean (https://codeocean.com/capsule/8638596/tree/v1).
Affiliation(s)
- William DeGroat
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, United States
- Dinesh Mendhe
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, United States
- Atharva Bhusari
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, United States
- Habiba Abdelhalim
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, United States
- Saman Zeeshan
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ 08901, United States
- Zeeshan Ahmed
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, United States
  - Department of Medicine, Robert Wood Johnson Medical School, Rutgers Health, New Brunswick, NJ 08901, United States

4
Kang CC, Lee TY, Lim WF, Yeo WWY. Opportunities and challenges of 5G network technology toward precision medicine. Clin Transl Sci 2023; 16:2078-2094. [PMID: 37702288 PMCID: PMC10651640 DOI: 10.1111/cts.13640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 08/31/2023] [Accepted: 09/01/2023] [Indexed: 09/14/2023] Open
Abstract
Moving away from traditional "one-size-fits-all" treatment to precision-based medicine has tremendously improved disease prognosis, accuracy of diagnosis, disease progression prediction, and targeted treatment. Cutting-edge 5G network technology is enabling a growing trend in precision medicine to extend its utility and value to the smart healthcare system. The 5G network technology will bring together big data, artificial intelligence, and machine learning to provide essential levels of connectivity to enable a new health ecosystem toward precision medicine. In the 5G-enabled health ecosystem, applications involve predictive and preventative measurements that enable advances in patient personalization. This review aims to discuss the opportunities, challenges, and prospects of 5G network technology in moving forward to deliver personalized treatments and patient-centric care via a precision medicine approach.
Affiliation(s)
- Chia Chao Kang
  - School of Electrical Engineering and Artificial Intelligence, Xiamen University Malaysia, Sepang, Selangor, Malaysia
- Tze Yan Lee
  - School of Liberal Arts, Science and Technology (PUScLST), Perdana University, Kuala Lumpur, Malaysia
- Wai Feng Lim
  - Sunway Medical Centre, Subang Jaya, Selangor Darul Ehsan, Malaysia
- Wendy Wai Yeng Yeo
  - School of Pharmacy, Monash University Malaysia, Bandar Sunway, Selangor Darul Ehsan, Malaysia

5
Patel KK, Venkatesan C, Abdelhalim H, Zeeshan S, Arima Y, Linna-Kuosmanen S, Ahmed Z. Genomic approaches to identify and investigate genes associated with atrial fibrillation and heart failure susceptibility. Hum Genomics 2023; 17:47. [PMID: 37270590 DOI: 10.1186/s40246-023-00498-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/13/2023] [Accepted: 05/31/2023] [Indexed: 06/05/2023] Open
Abstract
Atrial fibrillation (AF) and heart failure (HF) contribute to about 45% of all cardiovascular disease (CVD) deaths in the USA and around the globe. Due to the complex nature, progression, inherent genetic makeup, and heterogeneity of CVDs, personalized treatments are believed to be critical. To improve the deciphering of CVD mechanisms, we need to investigate well-known genes in depth and identify novel genes that are responsible for CVD development. With the advancements in sequencing technologies, genomic data have been generated at an unprecedented pace to foster translational research. Correct application of bioinformatics using genomic data holds the potential to reveal the genetic underpinnings of various health conditions. It can help in the identification of causal variants for AF, HF, and other CVDs by moving beyond the one-gene one-disease model through the integration of common and rare variant association, the expressed genome, and characterization of comorbidities and phenotypic traits derived from the clinical information. In this study, we examined and discussed various genomic approaches investigating genes associated with AF, HF, and other CVDs. We collected, reviewed, and compared high-quality scientific literature published between 2009 and 2022 and accessible through PubMed/NCBI. While selecting relevant literature, we mainly focused on identifying genomic approaches involving the integration of genomic data; analysis of common and rare genetic variants; metadata and phenotypic details; and multi-ethnic studies including individuals from ethnic minorities, and European, Asian, and American ancestries. We found 190 genes associated with AF and 26 genes linked to HF. Seven genes had implications in both AF and HF: SYNPO2L, TTN, MTSS1, SCN5A, PITX2, KLHL3, and AGAP5. We list our conclusions, which include detailed information about genes and SNPs associated with AF and HF.
Affiliation(s)
- Kush Ketan Patel
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Cynthia Venkatesan
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Habiba Abdelhalim
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Saman Zeeshan
  - Rutgers Cancer Institute of New Jersey, Rutgers University, 195 Little Albany St, New Brunswick, NJ, USA
- Yuichiro Arima
  - Developmental Cardiology Laboratory, International Research Center for Medical Sciences, Kumamoto University, 2-2-1 Honjo, Kumamoto City, Kumamoto, Japan
- Suvi Linna-Kuosmanen
  - A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211, Kuopio, Finland
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Zeeshan Ahmed
  - Department of Genetics and Genome Sciences, UConn Health, 400 Farmington Ave, Farmington, CT, USA
  - Department of Medicine, Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson St, New Brunswick, NJ, USA

6
Venkat V, Abdelhalim H, DeGroat W, Zeeshan S, Ahmed Z. Investigating genes associated with heart failure, atrial fibrillation, and other cardiovascular diseases, and predicting disease using machine learning techniques for translational research and precision medicine. Genomics 2023; 115:110584. [PMID: 36813091 DOI: 10.1016/j.ygeno.2023.110584] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/19/2022] [Revised: 02/06/2023] [Accepted: 02/11/2023] [Indexed: 02/22/2023]
Abstract
Cardiovascular disease (CVD) is the leading cause of mortality and loss of disability-adjusted life years (DALYs) globally. CVDs like Heart Failure (HF) and Atrial Fibrillation (AF) are associated with physical effects on the heart muscles. As a result of the complex nature, progression, inherent genetic makeup, and heterogeneity of CVDs, personalized treatments are believed to be critical. Appropriate application of artificial intelligence (AI) and machine learning (ML) approaches can lead to new insights into CVDs for providing better personalized treatments with predictive analysis and deep phenotyping. In this study, we focused on implementing AI/ML techniques on RNA-seq driven gene-expression data to investigate genes associated with HF, AF, and other CVDs, and predict disease with high accuracy. The study involved generating RNA-seq data derived from the serum of consented CVD patients. Next, we processed the sequenced data using our RNA-seq pipeline and applied GVViZ for gene-disease data annotation and expression analysis. To achieve our research objectives, we developed a new Findable, Accessible, Intelligent, and Reproducible (FAIR) approach that includes a five-level biostatistical evaluation, primarily based on the Random Forest (RF) algorithm. During our AI/ML analysis, we fitted, trained, and implemented our model to classify and distinguish high-risk CVD patients based on their age, gender, and race. With the successful execution of our model, we predicted the association of highly significant HF, AF, and other CVD genes with demographic variables.
Affiliation(s)
- Vignesh Venkat
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Habiba Abdelhalim
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- William DeGroat
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Saman Zeeshan
  - Rutgers Cancer Institute of New Jersey, Rutgers University, 195 Little Albany St, New Brunswick, NJ, USA
- Zeeshan Ahmed
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
  - Department of Medicine, Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson St, New Brunswick, NJ, USA

7
Ahmed Z, Zeeshan S, Lee D. Editorial: Artificial intelligence for personalized and predictive genomics data analysis. Front Genet 2023; 14:1162869. [PMID: 36936434 PMCID: PMC10020608 DOI: 10.3389/fgene.2023.1162869] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/10/2023] [Accepted: 02/27/2023] [Indexed: 03/06/2023] Open
Affiliation(s)
- Zeeshan Ahmed
  - Rutgers Institute for Health, Healthcare Policy and Aging Research, Rutgers University, New Brunswick, NJ, United States
  - Department of Medicine, Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, New Brunswick, NJ, United States
  - *Correspondence: Zeeshan Ahmed,
- Saman Zeeshan
  - Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, United States
- Donghyung Lee
  - Department of Statistics, Miami University, Oxford, OH, United States

8
Vadapalli S, Abdelhalim H, Zeeshan S, Ahmed Z. Artificial intelligence and machine learning approaches using gene expression and variant data for personalized medicine. Brief Bioinform 2022; 23:6590150. [PMID: 35595537 DOI: 10.1093/bib/bbac191] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 03/08/2022] [Revised: 04/02/2022] [Accepted: 04/26/2022] [Indexed: 12/16/2022] Open
Abstract
Precision medicine uses genetic, environmental and lifestyle factors to more accurately diagnose and treat disease in specific groups of patients, and it is considered one of the most promising medical efforts of our time. The use of genetics is arguably the most data-rich and complex component of precision medicine. The grand challenge today is the successful assimilation of genetics into precision medicine that translates across different ancestries, diverse diseases and other distinct populations, which will require clever use of artificial intelligence (AI) and machine learning (ML) methods. Our goal here was to review and compare scientific objectives, methodologies, datasets, data sources, ethics and gaps of AI/ML approaches used in genomics and precision medicine. We selected high-quality literature published within the last 5 years that was indexed and available through PubMed Central. Our scope was narrowed to articles that reported application of AI/ML algorithms for statistical and predictive analyses using whole genome and/or whole exome sequencing for gene variants, and RNA-seq and microarrays for gene expression. We did not limit our search to specific diseases or data sources. Based on the scope of our review and comparative analysis criteria, we identified 32 different AI/ML approaches applied in various genomics studies and report widely adopted AI/ML algorithms for predictive diagnostics across several diseases.
Affiliation(s)
- Sreya Vadapalli
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Habiba Abdelhalim
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Saman Zeeshan
  - Rutgers Cancer Institute of New Jersey, Rutgers University, 195 Little Albany St, New Brunswick, NJ, USA
- Zeeshan Ahmed
  - Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
  - Department of Medicine, Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson St, New Brunswick, NJ, USA