1
Asatryan B, Bleijendaal H, Wilde AAM. Toward advanced diagnosis and management of inherited arrhythmia syndromes: Harnessing the capabilities of artificial intelligence and machine learning. Heart Rhythm 2023; 20:1399-1407. [PMID: 37442407 DOI: 10.1016/j.hrthm.2023.07.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/02/2023] [Revised: 06/20/2023] [Accepted: 07/02/2023] [Indexed: 07/15/2023]
Abstract
The use of advanced computational technologies, such as artificial intelligence (AI), is now exerting a significant influence on various aspects of life, including health care and science. AI has garnered remarkable public notice with the release of deep learning models that can model anything from artwork to academic papers with minimal human intervention. Machine learning, a method that uses algorithms to extract information from raw data and represent it in a model, and deep learning, a method that uses multiple layers to progressively extract higher-level features from the raw input with minimal human intervention, are increasingly leveraged to tackle problems in the health sector, including utilization for clinical decision support in cardiovascular medicine. Inherited arrhythmia syndromes are a clinical domain where multiple unanswered questions remain despite unprecedented progress over the past 2 decades with the introduction of large panel genetic testing and the first steps in precision medicine. In particular, AI tools can help address gaps in clinical diagnosis by identifying individuals with concealed or transient phenotypes; enhance risk stratification by elevating recognition of underlying risk burden beyond widely recognized risk factors; improve prediction of response to therapy, and further prognostication. In this contemporary review, we provide a summary of the AI models developed to solve challenges in inherited arrhythmia syndromes and also outline gaps that can be filled with the development of intelligent AI models.
Affiliation(s)
- Babken Asatryan
  - Division of Cardiology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland.
- Hidde Bleijendaal
  - University of Amsterdam, Heart Center; Department of Clinical and Experimental Cardiology, Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam, The Netherlands; Department of Clinical Epidemiology, Biostatistics and Bioinformatics, University of Amsterdam, Amsterdam, The Netherlands
- Arthur A M Wilde
  - University of Amsterdam, Heart Center; Department of Clinical and Experimental Cardiology, Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam, The Netherlands; Department of Clinical Epidemiology, Biostatistics and Bioinformatics, University of Amsterdam, Amsterdam, The Netherlands; European Reference Network for Rare and Low Prevalence Complex Diseases of the Heart (ERN GUARD-Heart)
2
Mohd Faizal AS, Hon WY, Thevarajah TM, Khor SM, Chang SW. A biomarker discovery of acute myocardial infarction using feature selection and machine learning. Med Biol Eng Comput 2023; 61:2527-2541. [PMID: 37199891 PMCID: PMC10191821 DOI: 10.1007/s11517-023-02841-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/20/2022] [Accepted: 04/25/2023] [Indexed: 05/19/2023]
Abstract
Acute myocardial infarction (AMI) or heart attack is a significant global health threat and one of the leading causes of death. The evolution of machine learning has greatly revamped the risk stratification and death prediction of AMI. In this study, an integrated feature selection and machine learning approach was used to identify potential biomarkers for early detection and treatment of AMI. First, feature selection was conducted and evaluated before all classification tasks with machine learning. Full classification models (using all 62 features) and reduced classification models (using various feature selection methods ranging from 5 to 30 features) were built and evaluated using six machine learning classification algorithms. The results showed that the reduced models generally performed better (mean AUPRC via the random forest (RF) algorithm ranges from 0.8048 to 0.8260 for the recursive feature elimination (RFE) method and from 0.8301 to 0.8505 for the random forest importance (RFI) method) than the full models (mean AUPRC via RF: 0.8044). The most notable finding of this study was the identification of a five-feature model that included cardiac troponin I, HDL cholesterol, HbA1c, anion gap, and albumin, which achieved results (mean AUPRC via RF: 0.8462) comparable to those of models containing more features. These five features have been shown in previous studies to be significant risk factors for AMI or cardiovascular disease and could be used as potential biomarkers to predict the prognosis of AMI patients. From the medical point of view, using fewer features for diagnosis or prognosis could reduce cost and time for the patient, as fewer clinical and pathological tests are needed.
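The models above are compared by mean AUPRC (area under the precision-recall curve). As a minimal sketch of how that metric can be computed, the step-wise "average precision" form is shown below in plain Python. The labels, scores, and the names `average_precision`, `full_model_scores`, and `reduced_model_scores` are illustrative assumptions and are not data or code from the study.

```python
def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (average precision).

    Ties between scores are broken by input order; a fuller version would
    group tied scores together.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank samples from the highest to the lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, ap = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank  # precision at the rank of each true positive
    return ap / total_pos


# Made-up labels (1 = AMI) and scores from two hypothetical models.
labels = [1, 0, 1, 1, 0, 0]
full_model_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
reduced_model_scores = [0.9, 0.4, 0.8, 0.7, 0.3, 0.1]

ap_full = average_precision(labels, full_model_scores)
ap_reduced = average_precision(labels, reduced_model_scores)
print(f"full: {ap_full:.4f}, reduced: {ap_reduced:.4f}")
```

In practice the value would be averaged over cross-validation folds or repeated runs to obtain a mean AUPRC; the abstract does not state which resampling scheme was used.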
Affiliation(s)
- Aizatul Shafiqah Mohd Faizal
  - Bioinformatics Program, Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Wei Yin Hon
  - Bioinformatics Program, Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- T Malathi Thevarajah
  - Department of Pathology, Faculty of Medicine, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Sook Mei Khor
  - Department of Chemistry, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Siow-Wee Chang
  - Bioinformatics Program, Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia.
3
Yang S, Cao L, Zhou Y, Hu C. A Retrospective Cohort Study: Predicting 90-Day Mortality for ICU Trauma Patients with a Machine Learning Algorithm Using XGBoost Using MIMIC-III Database. J Multidiscip Healthc 2023; 16:2625-2640. [PMID: 37701177 PMCID: PMC10493110 DOI: 10.2147/jmdh.s416943] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/08/2023] [Accepted: 08/29/2023] [Indexed: 09/14/2023] Open
Abstract
Objective The aim of this study was to develop and validate a machine learning-based predictive model that predicts 90-day mortality in ICU trauma patients. Methods Data of patients with severe trauma were extracted from the Medical Information Mart for Intensive Care III (MIMIC-III) database. Mortality prediction models were generated using nine machine learning algorithms: extreme gradient boosting (XGBoost), logistic regression, random forest, AdaBoost, multilayer perceptron (MLP) neural networks, support vector machine (SVM), light gradient boosting machine (GBM), k nearest neighbors (KNN) and Gaussian naive Bayes (GNB). The performance of each model was evaluated in terms of discrimination, calibration and clinical application. Results We found that the accuracy, sensitivity, specificity, PPV, NPV and F1 score of our proposed XGBoost model were 82.8%, 79.7%, 77.6%, 51.2%, 91.5% and 0.624, respectively. Among the nine models, the XGBoost model performed best. Compared with traditional logistic regression, the XGBoost model performed well on calibration curves and decision curve analysis (DCA). Conclusion Our study shows that the XGBoost model outperforms other machine learning models in predicting 90-day mortality in trauma patients. It can be used to assist clinicians in the early identification of mortality risk factors and early intervention to reduce mortality.
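The metrics reported above all derive from a binary confusion matrix. The sketch below shows the standard definitions and checks that the reported F1 score is consistent with the reported PPV and sensitivity. The confusion-matrix counts and the function names `classification_metrics` and `f1_from_ppv_and_sensitivity` are hypothetical and chosen for illustration; only the 51.2%, 79.7%, and 0.624 figures come from the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall, true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


def f1_from_ppv_and_sensitivity(ppv, sensitivity):
    """F1 is the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical counts (not study data): 100 deaths, 400 survivors.
example = classification_metrics(tp=80, fp=70, tn=330, fn=20)
print(example)

# Consistency check using the PPV and sensitivity reported in the abstract.
f1_check = f1_from_ppv_and_sensitivity(0.512, 0.797)
print(f"F1 recomputed from reported PPV and sensitivity: {f1_check:.3f}")
```

The recomputed value agrees with the reported 0.624 to within the rounding of the published percentages.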
Affiliation(s)
- Shan Yang
  - Department of Critical Care Medicine, West China Hospital of Sichuan University, Chengdu, Sichuan, 610041, People’s Republic of China
- Lirui Cao
  - West China Hospital of Sichuan University, Chengdu, Sichuan, 610041, People’s Republic of China
- Yongfang Zhou
  - Department of Respiratory Care, West China Hospital of Sichuan University, Chengdu, Sichuan, 610041, People’s Republic of China
- Chenggong Hu
  - Department of Critical Care Medicine, West China Hospital of Sichuan University, Chengdu, Sichuan, 610041, People’s Republic of China
4
de Souza Timoteo AR, Pinheiro de Almeida IC, Yurchenko AA, de Miranda Henriques SR, de Souza Segundo P, Rajabi F, Nikolaev S, Petta TB. Brazilian XP-E siblings carrying a novel DDB2 variant developed early-onset melanoma: a case report. BMC Med Genomics 2023; 16:186. [PMID: 37573316 PMCID: PMC10422713 DOI: 10.1186/s12920-023-01622-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 07/31/2023] [Indexed: 08/14/2023] Open
Abstract
BACKGROUND Xeroderma pigmentosum group E (XP-E) is one of the least common forms of XP, a rare syndrome in which patients are prone to develop skin cancer in sun-exposed areas. XP-E patients are generally not diagnosed until they are adults due to the mild phenotype. CASE PRESENTATION We report two XP-E siblings, a 23-year-old female and a 25-year-old male, from a consanguineous Brazilian family, carrying a novel pathogenic missense variant in the DDB2 gene, NM_000107.3:c.1027G>C, associated with early-onset skin cancer and a severe phenotype, including nodular melanoma in the cornea and in the ear. CONCLUSION The assessment of genomic variant pathogenicity was a challenge since this family belongs to an underrepresented population in genomic databases. Given the scarcity of literature documenting XP-E cases and the challenges encountered in achieving an early diagnosis, this report emphasizes the imperative of sun protection measures in XP-E patients. Additionally, it highlights the detrimental impact of the COVID-19 pandemic on cancer diagnosis, leading to the manifestation of a severe phenotype in affected individuals.
Affiliation(s)
- Ana Rafaela de Souza Timoteo
  - Departamento de Biologia Celular e Genética, Universidade Federal do Rio Grande do Norte, Av. Senador Salgado Filho, s/n, Natal, 59078-970, RN, Brazil
- Andrey A Yurchenko
  - Cancer Genomics Lab, B2M, Gustave Roussy Cancer Campus, 114 rue Edouard Vaillant, Villejuif, 94805, France
- Paulo de Souza Segundo
  - Hospital Universitário Onofre Lopes, Universidade Federal do Rio Grande do Norte, Av. Nilo Peçanha, 620, 59012-300, RN, Natal, Brazil
- Fatemeh Rajabi
  - Cancer Genomics Lab, B2M, Gustave Roussy Cancer Campus, 114 rue Edouard Vaillant, Villejuif, 94805, France
- Sergey Nikolaev
  - Cancer Genomics Lab, B2M, Gustave Roussy Cancer Campus, 114 rue Edouard Vaillant, Villejuif, 94805, France
- Tirzah Braz Petta
  - Departamento de Biologia Celular e Genética, Universidade Federal do Rio Grande do Norte, Av. Senador Salgado Filho, s/n, Natal, 59078-970, RN, Brazil.
  - Keck School of Medicine, Department of Pathology, University of Southern California, HMR 315, 2011, Zonal Avenue, Los Angeles, CA, 90089-9092, USA.
5
Immadisetty K, Fang X, Ramon GS, Hartle CM, McCoy TP, Center RG, Mirshahi T, Delisle BP, Kekenes-Huskey PM. Prediction of Kv11.1 potassium channel PAS-domain variants trafficking via machine learning. J Mol Cell Cardiol 2023; 180:69-83. [PMID: 37187232 DOI: 10.1016/j.yjmcc.2023.05.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 04/28/2023] [Accepted: 05/09/2023] [Indexed: 05/17/2023]
Abstract
Congenital long QT syndrome (LQTS) is characterized by a prolonged QT-interval on an electrocardiogram (ECG). An abnormal prolongation in the QT-interval increases the risk for fatal arrhythmias. Genetic variants in several different cardiac ion channel genes, including KCNH2, are known to cause LQTS. Here, we evaluated whether structure-based molecular dynamics (MD) simulations and machine learning (ML) could improve the identification of missense variants in LQTS-linked genes. To do this, we investigated KCNH2 missense variants in the Kv11.1 channel protein shown to have wild type (WT) like or class II (trafficking-deficient) phenotypes in vitro. We focused on KCNH2 missense variants that disrupt normal Kv11.1 channel protein trafficking, as it is the most common phenotype for LQTS-associated variants. Specifically, we used computational techniques to correlate structural and dynamic changes in the Kv11.1 channel protein PAS domain (PASD) with Kv11.1 channel protein trafficking phenotypes. These simulations unveiled several molecular features, including the numbers of hydrating waters and hydrogen bonding pairs, as well as folding free energy scores, that are predictive of trafficking. We then used statistical and ML techniques (decision tree (DT), random forest (RF), and support vector machine (SVM)) to classify variants using these simulation-derived features. Together with bioinformatics data, such as sequence conservation and folding energies, we were able to predict with reasonable accuracy (≈75%) which KCNH2 variants do not traffic normally. We conclude that structure-based simulations of KCNH2 variants localized to the Kv11.1 channel PASD led to an improvement in classification accuracy. Therefore, this approach should be considered to complement the classification of variants of unknown significance (VUS) in the Kv11.1 channel PASD.
Affiliation(s)
- Xuan Fang
  - Stritch School of Medicine, Loyola University Chicago, Maywood, IL, USA
6
Tarozzi M, Baiardi S, Sala C, Bartoletti-Stella A, Parchi P, Capellari S, Castellani G. Genomic, transcriptomic and RNA editing analysis of human MM1 and VV2 sporadic Creutzfeldt-Jakob disease. Acta Neuropathol Commun 2022; 10:181. [PMID: 36517866 PMCID: PMC9749175 DOI: 10.1186/s40478-022-01483-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Accepted: 11/20/2022] [Indexed: 12/15/2022] Open
Abstract
Creutzfeldt-Jakob disease (CJD) is characterized by a broad phenotypic spectrum regarding symptoms, progression, and molecular features. Current sporadic CJD (sCJD) classification recognizes six main clinical-pathological phenotypes. This work investigates the molecular basis of the phenotypic heterogeneity of prion diseases through a multi-omics analysis of the two most common sCJD subtypes: MM1 and VV2. We performed DNA target sequencing on 118 genes on a cohort of 48 CJD patients and full exome RNA sequencing on post-mortem frontal cortex tissue on a subset of this cohort. DNA target sequencing identified multiple potential genetic contributors to the disease onset and phenotype, both in terms of coding, damaging-predicted variants, and enriched groups of SNPs in the whole cohort and the two subtypes. The results highlight a different functional impairment, with VV2 associated with higher impairment of the pathways related to dopamine secretion, regulation of calcium release and GABA signaling, showing some similarities with Parkinson's disease both on a genomic and a transcriptomic level. MM1 showed a gene expression profile with several traits shared with different neurodegenerative diseases, without an apparent distinctive characteristic or similarities with a specific disease. In addition, integrating genomic and transcriptomic data led to the discovery of several sites of ADAR-mediated RNA editing events, confirming and expanding previous findings in animal models. On the transcriptomic level, this work represents the first application of RNA sequencing on CJD human brain samples. Here, a good clustering of the transcriptomic profiles of the two subtypes was achieved, together with the finding of several differently impaired pathways between the two subtypes.
The results add to the understanding of the molecular features associated with sporadic CJD and its most common subtypes, revealing strain-specific genetic signatures and functional similarities between VV2 and Parkinson's disease and providing preliminary evidence of RNA editing modifications in human sCJD.
Affiliation(s)
- Martina Tarozzi
  - Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, 40139 Bologna, Italy
- Simone Baiardi
  - Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, 40139 Bologna, Italy
  - Programma di Neuropatologia delle Malattie, Neurodegenerative, IRCCS Istituto delle Scienze Neurologiche di Bologna, 40139 Bologna, Italy
- Claudia Sala
  - Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, 40139 Bologna, Italy
- Anna Bartoletti-Stella
  - Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, 40139 Bologna, Italy
- Piero Parchi
  - Programma di Neuropatologia delle Malattie, Neurodegenerative, IRCCS Istituto delle Scienze Neurologiche di Bologna, 40139 Bologna, Italy
  - Department of Biomedical and Neuromotor Sciences, University of Bologna, 40139 Bologna, Italy
- Sabina Capellari
  - Programma di Neuropatologia delle Malattie, Neurodegenerative, IRCCS Istituto delle Scienze Neurologiche di Bologna, 40139 Bologna, Italy
  - Department of Biomedical and Neuromotor Sciences, University of Bologna, 40139 Bologna, Italy
- Gastone Castellani
  - Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, 40139 Bologna, Italy
7
Liu Y, Yeung WSB, Chiu PCN, Cao D. Computational approaches for predicting variant impact: An overview from resources, principles to applications. Front Genet 2022; 13:981005. [PMID: 36246661 PMCID: PMC9559863 DOI: 10.3389/fgene.2022.981005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
One objective of human genetics is to unveil the variants that contribute to human diseases. With the rapid development and wide use of next-generation sequencing (NGS), massive genomic sequence data have been created, making personal genetic information available. Conventional experimental evidence is critical in establishing the relationship between sequence variants and phenotype but is obtained with low efficiency. Due to the lack of comprehensive databases and resources that present clinical and experimental evidence on the genotype-phenotype relationship, as well as the accumulation of variants found by NGS, many computational tools that can predict the impact of variants on phenotype have been developed to bridge the gap. In this review, we present a brief introduction and discussion about the computational approaches for variant impact prediction. Following an innovative manner, we mainly focus on approaches for non-synonymous single-nucleotide variant (nsSNV) impact prediction and categorize them into six classes. Their underlying rationale and constraints, together with the concerns and remedies raised from comparative studies, are discussed. We also present how the predictive approaches are employed in different research. Although diverse constraints exist, the computational predictive approaches are indispensable in exploring the genotype-phenotype relationship.
Affiliation(s)
- Ye Liu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- William S. B. Yeung
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Philip C. N. Chiu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Dandan Cao
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
8
He J, Li J, Jiang S, Cheng W, Jiang J, Xu Y, Yang J, Zhou X, Chai C, Wu C. Application of machine learning algorithms in predicting HIV infection among men who have sex with men: Model development and validation. Front Public Health 2022; 10:967681. [PMID: 36091522 PMCID: PMC9452878 DOI: 10.3389/fpubh.2022.967681] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/13/2022] [Accepted: 08/02/2022] [Indexed: 01/25/2023] Open
Abstract
Background The continuous growth of HIV incidence among men who have sex with men (MSM), as well as the low rate of HIV testing of MSM in China, demonstrate a need for innovative strategies to improve the implementation of HIV prevention. Machine learning algorithms are increasingly used in disease diagnosis prediction. We aimed to develop and validate machine learning models for predicting HIV infection among MSM that can identify individuals at increased risk of HIV acquisition for transmission-reduction interventions. Methods We extracted data from MSM sentinel surveillance in Zhejiang province from 2018 to 2020. Univariate logistic regression was used to select significant variables in 2018-2019 data (P < 0.05). After data processing and feature selection, we divided the model development data into two groups by stratified random sampling: training data (70%) and testing data (30%). The Synthetic Minority Oversampling Technique (SMOTE) was applied to address class imbalance. Model performance was evaluated using accuracy, precision, recall, F-measure, and the area under the receiver operating characteristic curve (AUC). Then, we explored three commonly used machine learning algorithms to compare with logistic regression (LR): decision tree (DT), support vector machines (SVM), and random forest (RF). Finally, the four models were validated prospectively with 2020 data from Zhejiang province. Results A total of 6,346 MSM were included in model development data, 372 of whom were diagnosed with HIV. In feature selection, 12 variables were selected as model predicting indicators. Compared with LR, the DT, SVM, and RF algorithms improved classification performance on SMOTE-processed data, with AUCs of 0.778 (LR), 0.856 (DT), 0.887 (SVM), and 0.942 (RF). RF was the best-performing algorithm (accuracy = 0.871, precision = 0.960, recall = 0.775, F-measure = 0.858, and AUC = 0.942). The RF model still performed well on prospective validation (AUC = 0.846). Conclusion Machine learning models are substantially better than the conventional LR model, and RF should be considered in prediction tools for HIV infection in Chinese MSM. Further studies are needed to optimize and promote these algorithms and evaluate their impact on HIV prevention among MSM.
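The SMOTE step mentioned in the Methods creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The following is a minimal plain-Python sketch of that interpolation idea only; the function name `smote_sketch`, its parameters, and the toy points are assumptions made for illustration, and the study presumably relied on an established implementation rather than code like this.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Find the k nearest minority neighbours of the chosen base point
        # (squared Euclidean distance, excluding the base point itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two numeric features (made-up values).
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority_points, n_new=6)
for point in new_points:
    print(point)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class.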
Affiliation(s)
- Jiajin He
  - School of Public Health, Zhejiang University School of Medicine, Hangzhou, China
- Jinhua Li
  - School of Software Technology, Zhejiang University, Ningbo, China
- Siqing Jiang
  - School of Public Health, Zhejiang University School of Medicine, Hangzhou, China
- Wei Cheng
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Jun Jiang
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Yun Xu
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Jiezhe Yang
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Xin Zhou
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Chengliang Chai
  - Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
- Chao Wu
  - School of Public Affairs, Zhejiang University, Hangzhou, China
9
Effectiveness of Artificial Intelligence for Personalized Medicine in Neoplasms: A Systematic Review. BioMed Research International 2022; 2022:7842566. [PMID: 35434134 PMCID: PMC9010213 DOI: 10.1155/2022/7842566] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/23/2021] [Revised: 01/29/2022] [Accepted: 03/06/2022] [Indexed: 02/07/2023]
Abstract
Purpose Artificial intelligence (AI) techniques are used in precision medicine to explore novel genotype and phenotype data. The main aims of precision medicine include early diagnosis, screening, and a personalized treatment regime for a patient based on genetic-oriented features and characteristics. The main objective of this study was to review AI techniques and their effectiveness in neoplasm precision medicine. Materials and Methods A comprehensive search was performed in Medline (through PubMed), Scopus, ISI Web of Science, IEEE Xplore, Embase, and Cochrane databases from inception to December 29, 2021, in order to identify the studies that used AI methods for cancer precision medicine and evaluate outcomes of the models. Results Sixty-three studies were included in this systematic review. The main AI approaches in 17 papers (26.9%) were linear and nonlinear categories (random forest or decision trees), and in 21 citations, rule-based systems and deep learning models were used. Notably, 62% of the articles were done in the United States and China. R was the most frequently used software, and breast and lung cancer were the most frequently studied neoplasms. Of the 63 papers, 34 used genomic data (such as gene expression, somatic mutation data, phenotype data, and proteomics) together with drug-response functional data as input to AI methods; in 16 papers (25.3%), drug-response functional data were used to personalize treatment. The maximum values of the assessment indicators such as accuracy, sensitivity, specificity, precision, recall, and area under the curve (AUC) in included studies were 0.99, 1.00, 0.96, 0.98, 0.99, and 0.9929, respectively. Conclusion The findings showed that in many cases, the use of artificial intelligence methods had effective application in personalized medicine.
10
AmazonForest: In Silico Metaprediction of Pathogenic Variants. Biology 2022; 11:biology11040538. [PMID: 35453737 PMCID: PMC9024711 DOI: 10.3390/biology11040538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2022] [Revised: 02/19/2022] [Accepted: 03/02/2022] [Indexed: 11/17/2022]
Abstract
Simple Summary ClinVar is a valuable platform that stores a large set of relevant genetic associations with complex phenotypes. However, the functional impact of a partial set of such associations remains misinterpreted, due to the presence of variants with uncertain significance or with conflicting pathogenicity interpretations. To fill this gap, we present AmazonForest: a metaprediction model based on Random Forest for pathogenicity prediction. AmazonForest was used to reclassify a set of ∼101,000 variants that were predicted as having high pathogenic probability. AmazonForest is available as a web tool with a simple web interface, and also as an R object for pathogenicity predictions. Abstract ClinVar is a web platform that stores ∼789,000 genetic associations with complex diseases. A partial set of these cataloged genetic associations has challenged clinicians and geneticists, often leading to conflicting interpretations or uncertain clinical impact significance. In this study, we addressed the (re)classification of genetic variants by AmazonForest, which is a random-forest-based pathogenicity metaprediction model that works by combining functional impact data from eight prediction tools. We evaluated the performance of representation learning algorithms such as autoencoders to propose a better strategy. All metaprediction models were trained with ClinVar data, and genetic variants were annotated with eight functional impact predictors cataloged with SnpEff/SnpSift. AmazonForest implements the best random forest model with a one hot data-encoding strategy, which shows an Area Under ROC Curve of ≥0.93. AmazonForest was employed for pathogenicity prediction of a set of ∼101,000 genetic variants of uncertain significance or conflict of interpretation. Our findings revealed ∼24,000 variants with high pathogenic probability (RFprob≥0.9). 
In addition, we show results for Alzheimer’s Disease as a demonstration of its application in clinical interpretation of genetic variants in complex diseases. Lastly, AmazonForest is available as a web tool and R object that can be loaded to perform pathogenicity predictions.
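Two steps in the abstract above lend themselves to a small illustration: one-hot encoding the categorical outputs of the eight prediction tools, and keeping only variants whose predicted pathogenic probability reaches the 0.9 threshold. The sketch below is a hedged plain-Python illustration of those two ideas. The category labels, variant identifiers, probabilities, and the helper names `one_hot` and `flag_high_probability` are all hypothetical; it does not reproduce AmazonForest or its actual feature encoding.

```python
# Hypothetical call labels a prediction tool might emit for one variant.
CATEGORIES = ("benign", "damaging", "missing")


def one_hot(calls, categories=CATEGORIES):
    """Encode one categorical call per tool as a flat 0/1 feature vector."""
    vector = []
    for call in calls:
        if call not in categories:
            raise ValueError(f"unknown call: {call}")
        vector.extend(1 if call == c else 0 for c in categories)
    return vector


def flag_high_probability(probabilities, threshold=0.9):
    """Return the variant ids whose pathogenic probability is >= threshold."""
    return [vid for vid, prob in probabilities.items() if prob >= threshold]


# One made-up variant annotated by eight tools.
calls = ["damaging", "damaging", "benign", "missing",
         "damaging", "benign", "damaging", "damaging"]
encoded = one_hot(calls)
print(len(encoded), encoded)

# Made-up model outputs standing in for random forest probabilities.
model_output = {"variant_a": 0.95, "variant_b": 0.42, "variant_c": 0.90}
flagged = flag_high_probability(model_output)
print(flagged)
```

With eight tools and three possible calls per tool, each variant becomes a 24-element binary vector that a random forest or any other classifier could consume.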
11
A machine learning approach based on ACMG/AMP guidelines for genomic variant classification and prioritization. Sci Rep 2022; 12:2517. [PMID: 35169226 PMCID: PMC8847497 DOI: 10.1038/s41598-022-06547-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 03/03/2021] [Accepted: 01/07/2022] [Indexed: 01/19/2023] Open
Abstract
Genomic variant interpretation is a critical step of the diagnostic procedure, often supported by the application of tools that may predict the damaging impact of each variant or provide a guidelines-based classification. We propose the application of Machine Learning methodologies, in particular Penalized Logistic Regression, to support variant classification and prioritization. Our approach combines ACMG/AMP guidelines for germline variant interpretation as well as variant annotation features and provides a probabilistic score of pathogenicity, thus supporting the prioritization and classification of variants that would be interpreted as uncertain by the ACMG/AMP guidelines. We compared different approaches in terms of variant prioritization and classification on different datasets, showing that our data-driven approach is able to resolve more variants of uncertain significance (VUS) than guidelines-based approaches and in silico prediction tools.
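To make the idea of a penalized logistic regression producing a probabilistic pathogenicity score concrete, here is a minimal plain-Python sketch. It assumes an L2 (ridge) penalty fitted by batch gradient descent, which is an assumption: the abstract does not state which penalty or solver the authors used. The three binary features stand in for triggered guideline criteria and are hypothetical, as are the toy labels and the function names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of the positive (pathogenic) class for one feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_penalized_logistic(X, y, l2=0.1, lr=0.5, epochs=2000):
    """Batch gradient descent on mean log-loss plus (l2 / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        # The penalty shrinks the weights; the intercept is left unpenalized.
        w = [wj - lr * (gj + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical binary features per variant:
# [pathogenic_criterion_1, pathogenic_criterion_2, benign_criterion]
X = [[1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 0, 0]]
y = [1, 1, 1, 0, 0, 0]  # 1 = pathogenic, 0 = benign (toy labels)

w, b = fit_penalized_logistic(X, y)
score_pathogenic_like = predict_proba(w, b, [1, 1, 0])
score_benign_like = predict_proba(w, b, [0, 0, 1])
print(f"pathogenic-like: {score_pathogenic_like:.3f}")
print(f"benign-like: {score_benign_like:.3f}")
```

The continuous score is what allows ranking: variants that a rule-based scheme would all label as uncertain can still be ordered by their predicted probability.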
12
Tarozzi M, Bartoletti-Stella A, Dall'Olio D, Matteuzzi T, Baiardi S, Parchi P, Castellani G, Capellari S. Identification of recurrent genetic patterns from targeted sequencing panels with advanced data science: a case-study on sporadic and genetic neurodegenerative diseases. BMC Med Genomics 2022; 15:26. [PMID: 35144616 PMCID: PMC8830183 DOI: 10.1186/s12920-022-01173-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/23/2021] [Accepted: 02/02/2022] [Indexed: 11/10/2022] Open
Abstract
Background Targeted Next Generation Sequencing is a common and powerful approach used in both clinical and research settings. However, at present, a large fraction of the acquired genetic information is not used since pathogenicity cannot be assessed for most variants. Further complicating this scenario is the increasingly frequent description of a poly/oligogenic pattern of inheritance showing the contribution of multiple variants in increasing disease risk. We present an approach in which the entire genetic information provided by target sequencing is transformed into binary data on which we performed statistical, machine learning, and network analyses to extract all valuable information from the entire genetic profile. To test this approach and unbiasedly explore the presence of recurrent genetic patterns, we studied a cohort of 112 patients affected either by genetic Creutzfeldt–Jakob disease (CJD) caused by two mutations in the PRNP gene (p.E200K and p.V210I) with different penetrance or by sporadic Alzheimer disease (sAD). Results Unsupervised methods can identify functionally relevant sources of variation in the data, like haplogroups and polymorphisms that do not follow Hardy–Weinberg equilibrium, such as the NOTCH3 rs11670823 (c.3837+21T>A). Supervised classifiers can recognize clinical phenotypes with high accuracy based on the mutational profile of patients. In addition, we found a similar alteration of allele frequencies compared with the European population in sporadic patients and in V210I-CJD, a poorly penetrant PRNP mutation, and sAD, suggesting shared oligogenic patterns in different types of dementia. Pathway enrichment and protein–protein interaction network analyses revealed different altered pathways between the two PRNP mutations.
Conclusions We propose this workflow as a possible approach to gain deeper insights into the genetic information derived from target sequencing, to identify recurrent genetic patterns and improve the understanding of complex diseases. This work could also represent a possible starting point of a predictive tool for personalized medicine and advanced diagnostic applications. Supplementary Information The online version contains supplementary material available at 10.1186/s12920-022-01173-4.
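The abstract mentions polymorphisms that depart from Hardy–Weinberg equilibrium. A common way to check this for a biallelic site is a chi-square comparison of observed and expected genotype counts, and the sketch below shows that generic test. Whether the authors used this exact test is not stated; the function name `hwe_chi_square` and the genotype counts are hypothetical.

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic (1 degree of freedom) for departure from
    Hardy-Weinberg equilibrium at a biallelic site."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele a
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    # Skip genotype classes with zero expected count to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical genotype counts for 100 individuals.
in_equilibrium = hwe_chi_square(25, 50, 25)    # matches p^2, 2pq, q^2 exactly
no_heterozygotes = hwe_chi_square(50, 0, 50)   # strong heterozygote deficit
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to p < 0.05, so the second example would be flagged as out of equilibrium while the first would not.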
Affiliation(s)
- M Tarozzi: Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
- A Bartoletti-Stella: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy; IRCCS Institute of Neurological Sciences of Bologna, Bologna, Italy
- D Dall'Olio: Department of Physics and Astronomy, University of Bologna, Bologna, Italy
- T Matteuzzi: Department of Physics and Astronomy, University of Bologna, Bologna, Italy
- S Baiardi: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy; IRCCS Institute of Neurological Sciences of Bologna, Bologna, Italy
- P Parchi: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy; IRCCS Institute of Neurological Sciences of Bologna, Bologna, Italy
- G Castellani: Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
- S Capellari: IRCCS Institute of Neurological Sciences of Bologna, Bologna, Italy; Department of Biomedical and Neuromotor Sciences, University of Bologna, Bologna, Italy
|
13
|
Nicora G, Rios M, Abu-Hanna A, Bellazzi R. Evaluating Pointwise Reliability of Machine Learning prediction. J Biomed Inform 2022; 127:103996. [PMID: 35041981 DOI: 10.1016/j.jbi.2022.103996] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 10/31/2021] [Revised: 01/07/2022] [Accepted: 01/11/2022] [Indexed: 10/19/2022]
Abstract
Interest in Machine Learning applications to tackle clinical and biological problems is increasing. This is driven by promising results reported in many research papers, the increasing number of AI-based software products, and the general interest in Artificial Intelligence to solve complex problems. It is therefore important to improve the quality of machine learning output and add safeguards to support its adoption. In addition to regulatory and logistical strategies, a crucial aspect is to detect when a Machine Learning model is not able to generalize to new unseen instances, which may originate from a population distant from the training population or from an under-represented subpopulation. As a result, the prediction of the machine learning model for these instances may often be wrong, given that the model is applied outside its "reliable" space of work, leading to decreasing trust among end users, such as clinicians. For this reason, when a model is deployed in practice, it would be important to advise users when the model's predictions may be unreliable, especially in high-stakes applications, including those in healthcare. Yet, reliability assessment of each machine learning prediction is still poorly addressed. Here, we review approaches that can support the identification of unreliable predictions, we harmonize the notation and terminology of relevant concepts, and we highlight and extend possible interrelationships and overlap among concepts. We then demonstrate, on simulated and real data for ICU in-hospital death prediction, a possible integrative framework for the identification of reliable and unreliable predictions. To do so, our proposed approach implements two complementary principles, namely the density principle and the local fit principle. The density principle verifies that the instance we want to evaluate is similar to the training set. The local fit principle verifies that the trained model performs well on training subsets that are more similar to the instance under evaluation. Our work can contribute to consolidating work in machine learning especially in medicine.
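The two principles named in this abstract can be sketched with a simple nearest-neighbour check. This is a minimal illustration under stated assumptions, not the authors' framework: the Euclidean distance, the neighbourhood size `k`, the thresholds `max_mean_dist` and `min_acc`, the toy data, and all function names are hypothetical choices made for the example.

```python
import math


def k_nearest(x, train_X, k):
    """Return (distance, index) pairs for the k training points closest to x."""
    dists = sorted((math.dist(x, t), i) for i, t in enumerate(train_X))
    return dists[:k]


def density_ok(x, train_X, k, max_mean_dist):
    """Density principle: x lies close enough to the training data."""
    neigh = k_nearest(x, train_X, k)
    mean_dist = sum(d for d, _ in neigh) / len(neigh)
    return mean_dist <= max_mean_dist


def local_fit_ok(x, train_X, train_y, predict, k, min_acc):
    """Local fit principle: the model is accurate on training points near x."""
    neigh = k_nearest(x, train_X, k)
    hits = sum(predict(train_X[i]) == train_y[i] for _, i in neigh)
    return hits / len(neigh) >= min_acc


def is_reliable(x, train_X, train_y, predict, k=3, max_mean_dist=1.5, min_acc=0.9):
    """Flag a prediction as reliable only if both principles hold."""
    return (density_ok(x, train_X, k, max_mean_dist)
            and local_fit_ok(x, train_X, train_y, predict, k, min_acc))


# Hypothetical training set; the model misclassifies the last point, (6, 6).
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = [0, 0, 0, 0, 1, 1, 1, 0]


def predict(x):
    return int(x[0] + x[1] > 5)


near_well_fit = is_reliable((0.5, 0.5), train_X, train_y, predict)
far_from_training = is_reliable((20, 20), train_X, train_y, predict)
near_poor_fit = is_reliable((5.8, 5.8), train_X, train_y, predict)
```

In this toy setting the first instance passes both checks, the second fails the density check because it lies far from every training point, and the third fails the local fit check because one of its three nearest training neighbours is misclassified.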
Affiliation(s)
- Giovanna Nicora: Department of Electrical, Computer and Biomedical Engineering, University of Pavia (Italy)
- Miguel Rios: Department of Medical Informatics, Amsterdam UMC, University of Amsterdam (The Netherlands)
- Ameen Abu-Hanna: Department of Medical Informatics, Amsterdam UMC, University of Amsterdam (The Netherlands)
- Riccardo Bellazzi: Department of Electrical, Computer and Biomedical Engineering, University of Pavia (Italy)
|
14
|
Ruscheinski A, Reimler AL, Ewald R, Uhrmacher AM. VPMBench: a test bench for variant prioritization methods. BMC Bioinformatics 2021; 22:543. [PMID: 34749640 PMCID: PMC8576923 DOI: 10.1186/s12859-021-04458-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2021] [Accepted: 10/23/2021] [Indexed: 11/18/2022] Open
Abstract
Background Clinical diagnostics of whole-exome and whole-genome sequencing data requires geneticists to consider thousands of genetic variants for each patient. Various variant prioritization methods have been developed in recent years to aid clinicians in identifying variants that are likely disease-causing. Each time a new method is developed, its effectiveness must be evaluated and compared to other approaches based on the most recently available evaluation data. Doing so in an unbiased, systematic, and replicable manner requires significant effort. Results The open-source test bench “VPMBench” automates the evaluation of variant prioritization methods. VPMBench introduces a standardized interface for prioritization methods and provides a plugin system that makes it easy to evaluate new methods. It supports different input data formats and custom output data preparation. VPMBench exploits declaratively specified information about the methods, e.g., the variants supported by the methods. Plugins may also be provided in a technology-agnostic manner via containerization. Conclusions VPMBench significantly simplifies the evaluation of both custom and published variant prioritization methods. As we expect variant prioritization methods to become ever more critical with the advent of whole-genome sequencing in clinical diagnostics, such tool support is crucial to facilitate methodological research.
Affiliation(s)
- Andreas Ruscheinski: Modeling and Simulation Group, Institute for Visual and Analytic Computing, University of Rostock, Albert-Einstein-Straße 22, 18051, Rostock, Germany
- Anna Lena Reimler: Modeling and Simulation Group, Institute for Visual and Analytic Computing, University of Rostock, Albert-Einstein-Straße 22, 18051, Rostock, Germany
- Roland Ewald: Limbus Medical Technologies GmbH, Lindenstraße 2, 18055, Rostock, Germany
- Adelinde M Uhrmacher: Modeling and Simulation Group, Institute for Visual and Analytic Computing, University of Rostock, Albert-Einstein-Straße 22, 18051, Rostock, Germany
|
15
|
Wang L, Guo K, He K, Zhu H. Bone morphological feature extraction for customized bone plate design. Sci Rep 2021; 11:15617. [PMID: 34341376 PMCID: PMC8329034 DOI: 10.1038/s41598-021-94924-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/20/2020] [Accepted: 07/19/2021] [Indexed: 01/01/2023] Open
Abstract
Fractures are difficult to treat because of individual differences in bone morphology and fracture types. Compared to serialized bone plates, the use of customized plates significantly improves the fracture healing process. However, designing custom plates often requires the extraction of skeletal morphology, which is a complex and time-consuming procedure. This study proposes a method for extracting bone morphological features to facilitate customized plate designs. The customized plate design involves three major steps: extracting the morphological features of the bone, representing the undersurface features of the plate, and constructing the customized plate. Among these steps, constructing the undersurface feature involves integrating a group of bone features with different anatomical morphologies into a semantic feature parameter set of the plate feature. The undersurface feature encapsulates the plate and bone features into a highly cohesive generic feature and then establishes an internal correlation between the plate and bone features. Using the femoral plate as an example, we further examined the validity and feasibility of the proposed method. The experimental results demonstrate that the proposed method improves the convenience of redesign through the intuitive editing of semantic parameters. In addition, the proposed method significantly improves the design efficiency and reduces the required design time.
Affiliation(s)
- Lin Wang: School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou, 221004, People's Republic of China
- Kaijin Guo: Department of Orthopedics, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221006, People's Republic of China
- Kunjin He: College of Internet of Things Engineering, Hohai University, Changzhou, 213022, People's Republic of China
- Hong Zhu: School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou, 221004, People's Republic of China
|
16
|
Dantas VM, Valle CT, de Oliveira RP, Bezerra MTAL, do Amaral CT, Brandão RAS, Cerqueira Maia JM, Petta TB. Germline Compound Heterozygous Variants Identified in the STXBP2 Gene Leading to a Familial Hemophagocytic Lymphohistiocytosis Type 5: A Case Report. Front Pediatr 2021; 9:633996. [PMID: 34249802 PMCID: PMC8264126 DOI: 10.3389/fped.2021.633996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2020] [Accepted: 05/11/2021] [Indexed: 11/24/2022] Open
Abstract
Familial hemophagocytic lymphohistiocytosis (FHL) is a rare, potentially fatal autosomal-recessive immunodeficiency, and STXBP2 mutations have been associated with FHL type 5 (FHL-5). Here, we report a case of a 2-year-old boy who presented with recurrent fever, hepatosplenomegaly, pancytopenia, hyperferritinemia, and hypofibrinogenemia since 4 months of age. His genetic analysis revealed a compound heterozygosity of the STXBP2 gene with a described pathogenic mutation, c.1247-1G>C (splicing acceptor site), harbored by his father and a likely pathogenic variant of uncertain significance (VUS), c.704G>A (p.Arg235Gln), harbored by his mother. He was diagnosed as compound heterozygous for FHL-5 and was treated with the HLH-2004 protocol. Since treatment, this patient has been in remission, and he is being evaluated for a hematopoietic stem cell transplantation (HSCT).
Affiliation(s)
- Vera Maria Dantas: Department of Pediatrics, Pediatric Immunology Division of Onofre Lopes University Hospital, Federal University of Rio Grande do Norte, Natal, Brazil
- Cassandra Teixeira Valle: Pediatric Hematology Division of Onofre Lopes University Hospital, Federal University of Rio Grande do Norte, Natal, Brazil
- Roberta Piccin de Oliveira: Pediatric Allergy-Immunology Division, Onofre Lopes University Hospital, Federal University of Rio Grande do Norte, Natal, Brazil
- Mylena Taíse Azevedo L Bezerra: Pediatric Infectiology Division, Onofre Lopes University Hospital, Federal University of Rio Grande do Norte, Natal, Brazil
- Cleia Teixeira do Amaral: Pediatric Pneumology Division, Onofre Lopes University Hospital, Federal University of Rio Grande do Norte, Natal, Brazil
- Raissa Anielle S Brandão: Pediatric Pneumology Division, Onofre Lopes University Hospital, Federal University of Rio Grande do Norte, Natal, Brazil
- Jussara M Cerqueira Maia: Department of Pediatrics, Pediatric Gastroenterology Division of Onofre Lopes University Hospital, Federal University of Rio Grande do Norte, Natal, Brazil
- Tirzah Braz Petta: Department of Cellular Biology and Genetics, Federal University of Rio Grande do Norte, Natal, Brazil
|