1
Yaqoob N, Khan MA, Masood S, Albarakati HM, Hamza A, Alhayan F, Jamel L, Masood A. Prediction of Alzheimer's disease stages based on ResNet-Self-attention architecture with Bayesian optimization and best features selection. Front Comput Neurosci 2024; 18:1393849. [PMID: 38725868; PMCID: PMC11081001; DOI: 10.3389/fncom.2024.1393849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 03/28/2024] [Indexed: 05/12/2024] Open
Abstract
Alzheimer's disease (AD) is a neurodegenerative illness that impairs cognition, function, and behavior by causing irreversible damage to multiple brain areas, including the hippocampus. Early diagnosis of AD lessens the suffering of patients and their family members. Automatic diagnosis techniques are widely needed because medical experts are in short supply, and they ease the burden on medical staff. An automated, artificial intelligence (AI)-based method can help experts achieve better diagnostic accuracy and precision. This study proposes a new automated framework for AD stage prediction based on the ResNet-Self architecture and a Fuzzy Entropy-controlled Path-Finding Algorithm (FEcPFA). Data augmentation is first used to resolve the dataset imbalance. Next, a new deep-learning model based on a self-attention module is proposed: a ResNet-50 architecture is modified and connected to a self-attention block to extract important information. The hyperparameters are optimized using Bayesian optimization (BO) and then used to train the model, which is subsequently employed for feature extraction. The features extracted from the self-attention layer are optimized using the proposed FEcPFA, and the best features selected by FEcPFA are passed to machine learning classifiers for the final classification. The experiments used a publicly available MRI dataset and achieved an improved accuracy of 99.9%. The results were compared with state-of-the-art (SOTA) techniques, demonstrating the improvement of the proposed framework in terms of accuracy and time efficiency.
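For readers unfamiliar with the self-attention block mentioned in this abstract, the following is a minimal sketch of generic scaled dot-product self-attention. It is not the authors' architecture: the identity projections (queries, keys, and values all equal to the input), the function names, and the toy feature vectors are assumptions chosen to keep the example dependency-free.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections.

    Assumption: Q = K = V = tokens (a real block would learn projections).
    tokens: list of n feature vectors, each of length d.
    Returns (outputs, weights); weights[i][j] is how much token i attends to token j.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[c] for w, value in zip(w_row, tokens)) for c in range(d)]
        for w_row in weights
    ]
    return outputs, weights


# Three hypothetical "patch" feature vectors.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attended, attn = self_attention(patches)
```

Each output vector is a weighted average of all input vectors, so similar patches reinforce one another; in this toy input the first patch attends more to the second patch than to the third.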
Affiliation(s)
- Nabeela Yaqoob
  - Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
- Muhammad Attique Khan
  - Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
- Saleha Masood
  - IRC for Finance and Digital Economy, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
- Hussain Mobarak Albarakati
  - Department of Computer and Network Engineering, College of Computer and Information Systems, Umm Al-Qura University, Makkah, Saudi Arabia
- Ameer Hamza
  - Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
- Fatimah Alhayan
  - Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Leila Jamel
  - Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Anum Masood
  - Department of Physics, Norwegian University of Science and Technology, Trondheim, Norway
2
Armoundas AA, Narayan SM, Arnett DK, Spector-Bagdady K, Bennett DA, Celi LA, Friedman PA, Gollob MH, Hall JL, Kwitek AE, Lett E, Menon BK, Sheehan KA, Al-Zaiti SS. Use of Artificial Intelligence in Improving Outcomes in Heart Disease: A Scientific Statement From the American Heart Association. Circulation 2024; 149:e1028-e1050. [PMID: 38415358; PMCID: PMC11042786; DOI: 10.1161/cir.0000000000001201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Abstract
A major focus of academia, industry, and global governmental agencies is to develop and apply artificial intelligence and other advanced analytical tools to transform health care delivery. The American Heart Association supports the creation of tools and services that would further the science and practice of precision medicine by enabling more precise approaches to cardiovascular and stroke research, prevention, and care of individuals and populations. Nevertheless, several challenges exist, and few artificial intelligence tools have been shown to improve cardiovascular and stroke care sufficiently to be widely adopted. This scientific statement outlines the current state of the art on the use of artificial intelligence algorithms and data science in the diagnosis, classification, and treatment of cardiovascular disease. It also sets out to advance this mission, focusing on how digital tools and, in particular, artificial intelligence may provide clinical and mechanistic insights, address bias in clinical studies, and facilitate education and implementation science to improve cardiovascular and stroke outcomes. Last, a key objective of this scientific statement is to further the field by identifying best practices, gaps, and challenges for interested stakeholders.
3
Xu X, Li J, Zhu Z, Zhao L, Wang H, Song C, Chen Y, Zhao Q, Yang J, Pei Y. A Comprehensive Review on Synergy of Multi-Modal Data and AI Technologies in Medical Diagnosis. Bioengineering (Basel) 2024; 11:219. [PMID: 38534493; DOI: 10.3390/bioengineering11030219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 02/15/2024] [Accepted: 02/21/2024] [Indexed: 03/28/2024] Open
Abstract
Disease diagnosis represents a critical and arduous endeavor within the medical field. Artificial intelligence (AI) techniques, spanning from machine learning and deep learning to large model paradigms, stand poised to significantly augment physicians in rendering more evidence-based decisions, thus presenting a pioneering solution for clinical practice. Traditionally, the amalgamation of diverse medical data modalities (e.g., image, text, speech, genetic data, physiological signals) is imperative to facilitate a comprehensive disease analysis, a topic of burgeoning interest among both researchers and clinicians in recent times. Hence, there exists a pressing need to synthesize the latest strides in multi-modal data and AI technologies in the realm of medical diagnosis. In this paper, we narrow our focus to five specific disorders (Alzheimer's disease, breast cancer, depression, heart disease, epilepsy), elucidating advanced endeavors in their diagnosis and treatment through the lens of artificial intelligence. Our survey not only delineates detailed diagnostic methodologies across varying modalities but also underscores commonly utilized public datasets, the intricacies of feature engineering, prevalent classification models, and envisaged challenges for future endeavors. In essence, our research endeavors to contribute to the advancement of diagnostic methodologies, furnishing invaluable insights for clinical decision making.
Affiliation(s)
- Xi Xu
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Jianqiang Li
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Zhichao Zhu
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Linna Zhao
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Huina Wang
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Changwei Song
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Yining Chen
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Qing Zhao
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Jijiang Yang
  - Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China
- Yan Pei
  - School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Japan
4
Choi Y, Cha J, Choi S. Evaluation of penalized and machine learning methods for asthma disease prediction in the Korean Genome and Epidemiology Study (KoGES). BMC Bioinformatics 2024; 25:56. [PMID: 38308205; PMCID: PMC10837879; DOI: 10.1186/s12859-024-05677-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 01/26/2024] [Indexed: 02/04/2024] Open
Abstract
BACKGROUND: Genome-wide association studies have successfully identified genetic variants associated with human disease. Various statistical approaches based on penalized and machine learning methods have recently been proposed for disease prediction. In this study, we evaluated the performance of several such methods for predicting asthma using the Korean Chip (KORV1.1) from the Korean Genome and Epidemiology Study (KoGES). RESULTS: First, single-nucleotide polymorphisms were selected via single-variant tests using logistic regression with adjustment for several epidemiological factors. Next, we evaluated the following methods for disease prediction: ridge, least absolute shrinkage and selection operator, elastic net, smoothly clipped absolute deviation, support vector machine, random forest, boosting, bagging, naïve Bayes, and k-nearest neighbor. Finally, we compared their predictive performance based on the area under the receiver operating characteristic curve, precision, recall, F1-score, Cohen's Kappa, balanced accuracy, error rate, Matthews correlation coefficient, and area under the precision-recall curve. Additionally, three oversampling algorithms were used to deal with class imbalance. CONCLUSIONS: Our results show that penalized methods exhibit better predictive performance for asthma than machine learning methods. On the other hand, in the oversampling study, random forest and boosting methods overall showed better prediction performance than penalized methods.
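Several of the threshold-based metrics listed in this abstract can be computed directly from a binary confusion matrix. The sketch below uses their standard definitions; the function name and the example counts are hypothetical and are not taken from the study. The two area-under-curve metrics need ranked prediction scores and are therefore left out.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard threshold-based metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # sensitivity
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n           # observed agreement (accuracy)
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "error_rate": (fp + fn) / n,
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "cohen_kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts for an imbalanced case/control split.
m = binary_metrics(tp=40, fp=10, fn=20, tn=130)
```

On imbalanced data such as this toy example, balanced accuracy, the Matthews correlation coefficient, and Cohen's Kappa are less flattered by the majority class than plain accuracy, which is one reason studies report them together.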
Affiliation(s)
- Yongjun Choi
  - Department of Applied Artificial Intelligence, College of Computing, Hanyang University, 55 Hanyang-daehak-ro, Sangnok-gu, Ansan, 15588, South Korea
- Junho Cha
  - Department of Applied Artificial Intelligence, College of Computing, Hanyang University, 55 Hanyang-daehak-ro, Sangnok-gu, Ansan, 15588, South Korea
- Sungkyoung Choi
  - Department of Applied Artificial Intelligence, College of Computing, Hanyang University, 55 Hanyang-daehak-ro, Sangnok-gu, Ansan, 15588, South Korea
  - Department of Mathematical Data Science, College of Science and Convergence Technology, Hanyang University, 55 Hanyang-daehak-ro, Sangnok-gu, Ansan, 15588, South Korea
5
Alatrany AS, Khan W, Hussain A, Kolivand H, Al-Jumeily D. An explainable machine learning approach for Alzheimer's disease classification. Sci Rep 2024; 14:2637. [PMID: 38302557; PMCID: PMC10834965; DOI: 10.1038/s41598-024-51985-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 01/11/2024] [Indexed: 02/03/2024] Open
Abstract
The early diagnosis of Alzheimer's disease (AD) presents a significant challenge because subtle biomarker changes are often overlooked. Machine learning (ML) models offer a promising tool for identifying individuals at risk of AD. However, current research tends to prioritize ML accuracy while neglecting the crucial aspect of model explainability. The diverse nature of AD data and the limited dataset size introduce additional challenges, primarily related to high dimensionality. In this study, we leveraged a dataset obtained from the National Alzheimer's Coordinating Center, comprising 169,408 records and 1024 features. After various steps were applied to reduce the feature space, support vector machine (SVM) models trained on the selected features exhibited high performance when tested on an external dataset. SVM achieved a high F1 score of 98.9% for binary classification (distinguishing between NC and AD) and 90.7% for multiclass classification. Furthermore, SVM was able to predict AD progression over a 4-year period, with F1 scores reaching 88% for the binary task and 72.8% for the multiclass task. To enhance model explainability, we employed two rule-extraction approaches: class rule mining and stable and interpretable rule set for classification model. These approaches generated human-understandable rules to assist domain experts in comprehending the key factors involved in AD development. We further validated these rules using SHAP and LIME models, underscoring the significance of factors such as MEMORY, JUDGMENT, COMMUN, and ORIENT in determining AD risk. Our experimental outcomes also shed light on the crucial role of the Clinical Dementia Rating tool in predicting AD.
Affiliation(s)
- Abbas Saad Alatrany
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, UK
  - University of Information Technology and Communications, Baghdad, Iraq
  - Imam Ja'afar Al-Sadiq University, Baghdad, Iraq
  - NIHR Leicester Biomedical Research Centre, University of Leicester, Leicester, UK
- Wasiq Khan
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, UK
- Abir Hussain
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, UK
  - Department of Electrical Engineering, University of Sharjah, Sharjah, United Arab Emirates
- Hoshang Kolivand
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, UK
- Dhiya Al-Jumeily
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, UK
6
Farzan R. Artificial intelligence in Immuno-genetics. Bioinformation 2024; 20:29-35. [PMID: 38352901; PMCID: PMC10859949; DOI: 10.6026/973206300200029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 01/31/2024] [Accepted: 01/31/2024] [Indexed: 02/16/2024] Open
Abstract
Rapid advancements in the field of artificial intelligence (AI) have opened up unprecedented opportunities to revolutionize various scientific domains, including immunology and genetics. Therefore, it is of interest to explore the emerging applications of AI in immunology and genetics, with the objective of enhancing our understanding of the dynamic intricacies of the immune system, disease etiology, and genetic variations. Hence, this article reviews the use of AI methodologies on immunological and genetic datasets, which facilitates the development of innovative approaches to diagnosis, treatment, and personalized medicine.
Affiliation(s)
- Raed Farzan
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud University, Riyadh 11433, Saudi Arabia
  - Center of Excellence in Biotechnology Research, King Saud University, Riyadh 11433, Saudi Arabia
  - Medical and Molecular Genetics Research, King Saud University, Riyadh 11433, Saudi Arabia
7
Barnett EJ, Onete DG, Salekin A, Faraone SV. Genomic Machine Learning Meta-regression: Insights on Associations of Study Features With Reported Model Performance. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:169-177. [PMID: 38109236; DOI: 10.1109/tcbb.2023.3343808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
Many studies have been conducted with the goal of correctly predicting diagnostic status of a disorder using the combination of genomic data and machine learning. It is often hard to judge which components of a study led to better results and whether better reported results represent a true improvement or an uncorrected bias inflating performance. We extracted information about the methods used and other differentiating features in genomic machine learning models. We used these features in linear regressions predicting model performance. We tested for univariate and multivariate associations as well as interactions between features. Of the models reviewed, 46% used feature selection methods that can lead to data leakage. Across our models, the number of hyperparameter optimizations reported, data leakage due to feature selection, model type, and modeling an autoimmune disorder were significantly associated with an increase in reported model performance. We found a significant, negative interaction between data leakage and training size. Our results suggest that methods susceptible to data leakage are prevalent among genomic machine learning research, resulting in inflated reported performance. Best practice guidelines that promote the avoidance and recognition of data leakage may help the field avoid biased results.
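The data leakage described in this abstract arises when features are selected using all samples, including those later used for testing. The sketch below illustrates the effect on pure-noise data, where no feature carries real signal. The simulated data, the class-mean-gap ranking, and the nearest-centroid classifier are assumptions made for illustration and are not methods drawn from the reviewed studies.

```python
import random


def top_k_features(rows, labels, k):
    """Indices of the k features with the largest absolute gap between class means."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]

    def gap(j):
        return abs(sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg))

    return sorted(range(len(rows[0])), key=gap, reverse=True)[:k]


def nearest_centroid_predict(train_rows, train_labels, test_row, features):
    """Predict the class whose centroid (on the chosen features) is closest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        members = [r for r, y in zip(train_rows, train_labels) if y == label]
        dist = 0.0
        for j in features:
            centroid = sum(r[j] for r in members) / len(members)
            dist += (test_row[j] - centroid) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(rows, labels, k, leaky):
    """Leave-one-out accuracy; `leaky` selects features once on ALL samples."""
    fixed = top_k_features(rows, labels, k) if leaky else None
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        # The honest variant repeats feature selection inside every fold.
        features = fixed if leaky else top_k_features(train_rows, train_labels, k)
        prediction = nearest_centroid_predict(train_rows, train_labels, rows[i], features)
        correct += prediction == labels[i]
    return correct / len(rows)


rng = random.Random(0)
labels = [0, 1] * 20                                    # 40 samples, balanced
rows = [[rng.gauss(0, 1) for _ in range(500)] for _ in labels]  # 500 noise features
leaky_acc = loo_accuracy(rows, labels, k=10, leaky=True)
honest_acc = loo_accuracy(rows, labels, k=10, leaky=False)
```

Because the labels are unrelated to the features, an unbiased estimate should sit near chance. The leaky estimate tends to be much higher, since the held-out sample already influenced which features were kept.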
8
Hermes S, Cady J, Armentrout S, O’Connor J, Holdaway SC, Cruchaga C, Wingo T, Greytak EM. Epistatic Features and Machine Learning Improve Alzheimer's Disease Risk Prediction Over Polygenic Risk Scores. J Alzheimers Dis 2024; 99:1425-1440. [PMID: 38788065; PMCID: PMC11284654; DOI: 10.3233/jad-230236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
Background: Polygenic risk scores (PRS) are linear combinations of genetic markers weighted by effect size that are commonly used to predict disease risk. For complex heritable diseases such as late-onset Alzheimer's disease (LOAD), PRS models fail to capture much of the heritability. Additionally, PRS models are highly dependent on the population structure of the data on which effect sizes are assessed and have poor generalizability to new data. Objective: The goal of this study is to construct a paragenic risk score that, in addition to single genetic marker data used in PRS, incorporates epistatic interaction features and machine learning methods to predict risk for LOAD. Methods: We construct a new state-of-the-art genetic model for risk of Alzheimer's disease. Our approach innovates over PRS models in two ways: First, by directly incorporating epistatic interactions between SNP loci using an evolutionary algorithm guided by shared pathway information; and second, by estimating risk via an ensemble of non-linear machine learning models rather than a single linear model. We compare the paragenic model to several PRS models from the literature trained on the same dataset. Results: The paragenic model is significantly more accurate than the PRS models under 10-fold cross-validation, obtaining an AUC of 83% and near-clinically significant matched sensitivity/specificity of 75%. It remains significantly more accurate when evaluated on an independent holdout dataset and maintains accuracy within APOE genotype strata. Conclusions: Paragenic models show potential for improving disease risk prediction for complex heritable diseases such as LOAD over PRS models.
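The definition of a PRS given in this abstract, a linear combination of markers weighted by effect size, can be sketched in a few lines. The marker names, effect sizes, and dosages below are hypothetical, and treating a missing marker as dosage 0 is an assumption of this sketch. The last line shows one simple form an epistatic (interaction) feature could take, a product of two dosages; the paper's own feature construction is not reproduced here.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages (0, 1, or 2) over the scored markers.

    Assumption: a marker missing from `dosages` contributes a dosage of 0.
    """
    return sum(beta * dosages.get(marker, 0) for marker, beta in effect_sizes.items())


# Hypothetical per-marker effect sizes (e.g. log odds ratios) and one genotype.
effect_sizes = {"snp_a": 0.5, "snp_b": -0.2, "snp_c": 0.1}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(person, effect_sizes)

# A pairwise interaction term; a purely additive score has no such feature.
interaction_ab = person["snp_a"] * person["snp_b"]
```

Because the score is additive, the contribution of each marker is the same regardless of the genotype at any other marker, which is the limitation that interaction features are meant to address.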
Affiliation(s)
- Carlos Cruchaga
  - Department of Psychiatry, Washington University, St. Louis, MO, USA
  - Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University, St. Louis, MO, USA
- Thomas Wingo
  - Goizueta Alzheimer’s Disease Center, Emory University School of Medicine, Atlanta, GA, USA
  - Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
9
Zhang YH, Zhao P, Gao HL, Zhong ML, Li JY. Screening Targets and Therapeutic Drugs for Alzheimer's Disease Based on Deep Learning Model and Molecular Docking. J Alzheimers Dis 2024; 100:863-878. [PMID: 38995776; DOI: 10.3233/jad-231389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/14/2024]
Abstract
Background: Alzheimer's disease (AD) is a neurodegenerative disorder caused by a complex interplay of various factors. However, a satisfactory cure for AD remains elusive. Pharmacological interventions based on drug targets are considered the most cost-effective therapeutic strategy. Therefore, it is paramount to search for potential drug targets and drugs for AD. Objective: We aimed to provide novel targets and drugs for the treatment of AD from a new perspective, using transcriptomic data from AD and normal control brain tissues. Methods: Our study combined a multi-layer perceptron (MLP) with differential expression analysis, variance assessment, and molecular docking to screen targets and drugs for AD. Results: We identified the seven differentially expressed genes (DEGs) with the most significant variation (ANKRD39, CPLX1, FABP3, GABBR2, GNG3, PPM1E, and WDR49) in transcriptomic data from AD brain. A newly built MLP was used to confirm the association between the seven DEGs and AD, establishing these DEGs as potential drug targets. Drug databases and molecular docking results indicated that arbaclofen, baclofen, clozapine, arbaclofen placarbil, BML-259, BRD-K72883421, and YC-1 had high affinity for GABBR2, and that FABP3 bound oleic, palmitic, and stearic acids. Arbaclofen and YC-1 activated the GABAB receptor through the PI3K/AKT and PKA/CREB pathways, respectively, thereby promoting a neuronal anti-apoptotic effect and inhibiting p-tau and Aβ formation. Conclusions: This study provided a new strategy for identifying targets and drugs for the treatment of AD using deep learning. Seven therapeutic targets and ten drugs were selected using this method, providing new insight for AD treatment.
Affiliation(s)
- Ya-Hong Zhang
  - College of Life and Health Sciences, Northeastern University, Shenyang, China
- Pu Zhao
  - College of Life and Health Sciences, Northeastern University, Shenyang, China
- Hui-Ling Gao
  - College of Life and Health Sciences, Northeastern University, Shenyang, China
- Man-Li Zhong
  - College of Life and Health Sciences, Northeastern University, Shenyang, China
- Jia-Yi Li
  - Health Sciences Institute, China Medical University, Shenyang, China
  - Department of Experimental Medical Science, Neuronal Plasticity and Repair Unit, Wallenberg Neuroscience Center, Lund University, Lund, Sweden
10
Ahuja SK, Shrimankar DD, Durge AR. A Study and Analysis of Disease Identification using Genomic Sequence Processing Models: An Empirical Review. Curr Genomics 2023; 24:207-235. [PMID: 38169652; PMCID: PMC10758128; DOI: 10.2174/0113892029269523231101051455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 10/05/2023] [Accepted: 10/05/2023] [Indexed: 01/05/2024] Open
Abstract
Human gene sequences are considered a primary source of comprehensive information about different body conditions. A wide variety of diseases, including cancer, heart issues, brain issues, and genetic issues, can be pre-empted via efficient analysis of genomic sequences. Researchers have proposed different configurations of machine learning models for processing genomic sequences, and these models vary in their performance and applicability characteristics. Models that use bioinspired optimizations are generally slower but have superior incremental performance, while models that use one-shot learning achieve higher instantaneous accuracy but cannot be scaled to larger disease sets. Due to such variations, it is difficult for genomic system designers to identify optimum models for their application-specific and performance-specific use cases. To overcome this issue, this text presents a detailed survey of different genomic processing models in terms of their functional nuances, application-specific advantages, deployment-specific limitations, and contextual future scopes. Based on this discussion, researchers will be able to identify optimal models for their functional use cases. This text also compares the reviewed models in terms of their quantitative parameter sets, which include the accuracy of classification, the delay needed to classify large-length sequences, precision levels, scalability levels, and deployment cost; this comparison will assist readers in selecting deployment-specific models for their contextual clinical scenarios. This text also evaluates a novel Genome Processing Efficiency Rank (GPER) for each of these models, which will allow readers to identify models with higher performance and low overheads under real-time scenarios.
Affiliation(s)
- Sony K. Ahuja
  - Visvesvaraya National Institute of Technology, Computer Science and Engineering, India
- Deepti D. Shrimankar
  - Visvesvaraya National Institute of Technology, Computer Science and Engineering, India
- Aditi R. Durge
  - Visvesvaraya National Institute of Technology, Computer Science and Engineering, India
11
Bettencourt C, Skene N, Bandres-Ciga S, Anderson E, Winchester LM, Foote IF, Schwartzentruber J, Botia JA, Nalls M, Singleton A, Schilder BM, Humphrey J, Marzi SJ, Toomey CE, Kleifat AA, Harshfield EL, Garfield V, Sandor C, Keat S, Tamburin S, Frigerio CS, Lourida I, Ranson JM, Llewellyn DJ. Artificial intelligence for dementia genetics and omics. Alzheimers Dement 2023; 19:5905-5921. [PMID: 37606627; PMCID: PMC10841325; DOI: 10.1002/alz.13427] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/05/2023] [Revised: 07/14/2023] [Accepted: 07/18/2023] [Indexed: 08/23/2023]
Abstract
Genetics and omics studies of Alzheimer's disease and other dementia subtypes enhance our understanding of underlying mechanisms and pathways that can be targeted. We identified key remaining challenges: First, can we enhance genetic studies to address missing heritability? Can we identify reproducible omics signatures that differentiate between dementia subtypes? Can high-dimensional omics data identify improved biomarkers? How can genetics inform our understanding of causal status of dementia risk factors? And which biological processes are altered by dementia-related genetic variation? Artificial intelligence (AI) and machine learning approaches give us powerful new tools in helping us to tackle these challenges, and we review possible solutions and examples of best practice. However, their limitations also need to be considered, as well as the need for coordinated multidisciplinary research and diverse deeply phenotyped cohorts. Ultimately AI approaches improve our ability to interrogate genetics and omics data for precision dementia medicine. HIGHLIGHTS: We have identified five key challenges in dementia genetics and omics studies. AI can enable detection of undiscovered patterns in dementia genetics and omics data. Enhanced and more diverse genetics and omics datasets are still needed. Multidisciplinary collaborative efforts using AI can boost dementia research.
Affiliation(s)
- Conceicao Bettencourt
  - Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London, UK
  - Queen Square Brain Bank for Neurological Disorders, UCL Queen Square Institute of Neurology, London, UK
- Nathan Skene
  - UK Dementia Research Institute, Imperial College London, London, UK
  - Department of Brain Sciences, Imperial College London, London, UK
- Sara Bandres-Ciga
  - Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland, USA
- Emma Anderson
  - Department of Mental Health of Older People, Division of Psychiatry, University College London, London, UK
- Isabelle F Foote
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, USA
- Jeremy Schwartzentruber
  - Open Targets, Cambridge, UK
  - Wellcome Sanger Institute, Cambridge, UK
  - Illumina Artificial Intelligence Laboratory, Illumina Inc, Foster City, California, USA
- Juan A Botia
  - Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain
- Mike Nalls
  - Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland, USA
  - Data Tecnica International LLC, Washington, DC, USA
- Andrew Singleton
  - Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland, USA
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, Maryland, USA
- Brian M Schilder
  - UK Dementia Research Institute, Imperial College London, London, UK
  - Department of Brain Sciences, Imperial College London, London, UK
- Jack Humphrey
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, USA
- Sarah J Marzi
  - UK Dementia Research Institute, Imperial College London, London, UK
  - Department of Brain Sciences, Imperial College London, London, UK
- Christina E Toomey
  - Queen Square Brain Bank for Neurological Disorders, UCL Queen Square Institute of Neurology, London, UK
  - Department of Clinical and Movement Neuroscience, UCL Queen Square Institute of Neurology, London, UK
  - The Francis Crick Institute, London, UK
- Ahmad Al Kleifat
  - Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Eric L Harshfield
  - Stroke Research Group, Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Victoria Garfield
  - MRC Unit for Lifelong Health and Ageing, Institute of Cardiovascular Science, University College London, London, UK
- Cynthia Sandor
  - UK Dementia Research Institute. School of Medicine, Cardiff University, Cardiff, UK
- Samuel Keat
  - UK Dementia Research Institute. School of Medicine, Cardiff University, Cardiff, UK
- Stefano Tamburin
  - Department of Neurosciences, Biomedicine and Movement Sciences, Neurology Section, University of Verona, Verona, Italy
- Carlo Sala Frigerio
  - UK Dementia Research Institute, Queen Square Institute of Neurology, University College London, London, UK
- David J Llewellyn
  - University of Exeter Medical School, Exeter, UK
  - The Alan Turing Institute, London, UK
12
Bae J, Logan PE, Acri DJ, Bharthur A, Nho K, Saykin AJ, Risacher SL, Nudelman K, Polsinelli AJ, Pentchev V, Kim J, Hammers DB, Apostolova LG. A simulative deep learning model of SNP interactions on chromosome 19 for predicting Alzheimer's disease risk and rates of disease progression. Alzheimers Dement 2023; 19:5690-5699. [PMID: 37409680; PMCID: PMC10770299; DOI: 10.1002/alz.13319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 04/25/2023] [Accepted: 05/12/2023] [Indexed: 07/07/2023]
Abstract
BACKGROUND Identifying genetic patterns that contribute to Alzheimer's disease (AD) is important not only for pre-symptomatic risk assessment but also for building personalized therapeutic strategies. METHODS We applied a novel simulative deep learning model to chromosome 19 genetic data from the Alzheimer's Disease Neuroimaging Initiative and the Imaging and Genetic Biomarkers of Alzheimer's Disease datasets. Using the occlusion method, the model quantified the contribution of each single nucleotide polymorphism (SNP) and their epistatic impact on the likelihood of AD. The top 35 AD-risk SNPs in chromosome 19 were identified, and their ability to predict the rate of AD progression was analyzed. RESULTS rs561311966 (APOC1) and rs2229918 (ERCC1/CD3EAP) were recognized as the most powerful factors influencing AD risk. The top 35 chromosome 19 AD-risk SNPs were significant predictors of AD progression. DISCUSSION The model successfully estimated the contribution of AD-risk SNPs that account for AD progression at the individual level. This can help in building preventive precision medicine.
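The occlusion method named in this abstract can be illustrated with a minimal sketch. This is not the authors' model: `toy_model`, its coefficients, and the baseline value of 0 are hypothetical choices for illustration. The idea shown is only the general one, namely that a feature's contribution is estimated as the drop in model output when that feature is masked.

```python
from typing import Callable, List, Sequence


def occlusion_importance(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: float = 0.0,
) -> List[float]:
    """Score each input feature by how much the output drops when it is masked."""
    full_output = model(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline  # mask one feature at a time
        scores.append(full_output - model(occluded))
    return scores


def toy_model(x: Sequence[float]) -> float:
    # Hypothetical risk function (an assumption, not from the paper):
    # two additive terms plus one interaction term between features 0 and 1.
    return 1.0 * x[0] + 0.5 * x[2] + 2.0 * x[0] * x[1]


# Hypothetical genotype-like input for three features.
scores = occlusion_importance(toy_model, [1, 1, 2])
print(scores)
```

Feature 1 has no additive term in `toy_model`, yet it receives a non-zero score because masking it removes the interaction term; this is how an occlusion analysis can surface interaction (epistatic) effects as well as marginal ones.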
Affiliation(s)
- Jinhyeong Bae
- Department of Neurology, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Paige E. Logan
- Department of Neurology, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Dominic J. Acri
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Apoorva Bharthur
- Department of Neurology, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Kwangsik Nho
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Andrew J. Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Shannon L. Risacher
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Kelly Nudelman
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Angelina J. Polsinelli
- Department of Neurology, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Valentin Pentchev
- Department of Information Technology, Indiana University Network Science Institute, Bloomington, IN, 47408, United States
| | - Jungsu Kim
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Dustin B. Hammers
- Department of Neurology, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | - Liana G. Apostolova
- Department of Neurology, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, United States
| | | |
|
13
|
Vivek S, Faul J, Thyagarajan B, Guan W. Explainable variational autoencoder (E-VAE) model using genome-wide SNPs to predict dementia. J Biomed Inform 2023; 148:104536. [PMID: 37926392 PMCID: PMC11106718 DOI: 10.1016/j.jbi.2023.104536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 10/30/2023] [Accepted: 11/02/2023] [Indexed: 11/07/2023]
Abstract
OBJECTIVE Alzheimer's disease (AD) and AD-related dementias (ADRD) are complex multifactorial neurodegenerative diseases. Genetic variants identified in genome-wide association studies (GWAS) are the most widely available and best-documented variants associated with ADRD. Applying deep learning methods to large-scale GWAS data may be a more powerful approach to elucidating the biological mechanisms of ADRD than penalized regression models, which may lead to over-fitting. METHODS We developed a deep learning framework, an explainable variational autoencoder (E-VAE) classifier model, using genotype data (GWAS SNPs = 5474) from 2714 study participants in the Health and Retirement Study (HRS) to classify ADRD. We validated the generalizability of this model among 234 participants in the Religious Orders Study and Memory and Aging Project (ROSMAP). Using a linear decoder approach, we extracted the weights associated with latent features for biological interpretation. RESULTS We obtained a predictive accuracy of 0.71 (95% CI [0.59, 0.84]) with an AUC of 0.69 in the HRS test dataset and an accuracy of 0.62 (95% CI [0.56, 0.68]) with an AUC of 0.63 in the ROSMAP dataset. CONCLUSION This is the first study showing the generalizability of a deep learning prediction model for dementia using genetic variants in an independent cohort. The latent features identified using E-VAE can help us understand the biology of AD/ADRD and better characterize disease status.
Affiliation(s)
- Sithara Vivek
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, United States
| | - Jessica Faul
- Institute for Social Research, Survey Research Center, University of Michigan, Ann Arbor, MI, United States
| | - Bharat Thyagarajan
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, United States.
| | - Weihua Guan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis MN, United States.
| |
|
14
|
Qiang YR, Zhang SW, Li JN, Li Y, Zhou QY. Diagnosis of Alzheimer's disease by joining dual attention CNN and MLP based on structural MRIs, clinical and genetic data. Artif Intell Med 2023; 145:102678. [PMID: 37925204 DOI: 10.1016/j.artmed.2023.102678] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/16/2022] [Revised: 07/12/2023] [Accepted: 10/03/2023] [Indexed: 11/06/2023]
Abstract
Alzheimer's disease (AD) is an irreversible degenerative disease of the central nervous system, while mild cognitive impairment (MCI) is a precursor state of AD. Accurate early diagnosis of AD is conducive to the prevention and early intervention treatment of AD. Although some computational methods have been developed for AD diagnosis, most employ only neuroimaging, ignoring other data (e.g., genetic, clinical) that may carry disease information. In addition, the results of some methods lack interpretability. In this work, we propose a novel method (called DANMLP) that joins a dual attention convolutional neural network (CNN) and a multilayer perceptron (MLP) for computer-aided AD diagnosis by integrating multi-modality data: structural magnetic resonance imaging (sMRI), clinical data (i.e., demographics, neuropsychology), and APOE genetic data. DANMLP consists of four primary components: (1) the Patch-CNN for extracting the image characteristics from each local patch, (2) the position self-attention block for capturing the dependencies between features within a patch, (3) the channel self-attention block for capturing dependencies of inter-patch features, (4) two MLP networks for extracting the clinical features and outputting the AD classification results, respectively. Compared with other state-of-the-art methods in the 5CV test, DANMLP achieves 93% and 82.4% classification accuracy for the AD vs. MCI and MCI vs. NC tasks on the ADNI database, which are 0.2%–15.2% and 3.4%–26.8% higher than those of five other methods, respectively. The individualized visualization of focal areas can also help clinicians in the early diagnosis of AD. These results indicate that DANMLP can be effectively used for diagnosing AD and MCI patients.
Affiliation(s)
- Yan-Rui Qiang
- Key Laboratory of Information Fusion Technology, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
| | - Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China.
| | - Jia-Ni Li
- Key Laboratory of Information Fusion Technology, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
| | - Yan Li
- Key Laboratory of Information Fusion Technology, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
| | - Qin-Yi Zhou
- Key Laboratory of Information Fusion Technology, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
| |
|
15
|
Jo T, Kim J, Bice P, Huynh K, Wang T, Arnold M, Meikle PJ, Giles C, Kaddurah-Daouk R, Saykin AJ, Nho K. Circular-SWAT for deep learning based diagnostic classification of Alzheimer's disease: application to metabolome data. EBioMedicine 2023; 97:104820. [PMID: 37806288 PMCID: PMC10579282 DOI: 10.1016/j.ebiom.2023.104820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 09/19/2023] [Accepted: 09/20/2023] [Indexed: 10/10/2023] Open
Abstract
BACKGROUND Deep learning has shown potential in various scientific domains but faces challenges when applied to complex, high-dimensional multi-omics data. Alzheimer's Disease (AD) is a neurodegenerative disorder that lacks targeted therapeutic options. This study introduces the Circular-Sliding Window Association Test (c-SWAT) to improve the classification accuracy in predicting AD using serum-based metabolomics data, specifically lipidomics. METHODS The c-SWAT methodology builds upon the existing Sliding Window Association Test (SWAT) and utilizes a three-step approach: feature correlation analysis, feature selection, and classification. Data from 997 participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) served as the basis for model training and validation. Feature correlations were analyzed using Weighted Gene Co-expression Network Analysis (WGCNA), and Convolutional Neural Networks (CNN) were employed for feature selection. Random Forest was used for the final classification. FINDINGS The application of c-SWAT resulted in a classification accuracy of up to 80.8% and an AUC of 0.808 for distinguishing AD from cognitively normal older adults. This marks a 9.4% improvement in accuracy and a 0.169 increase in AUC compared to methods without c-SWAT. These results were statistically significant, with a p-value of 1.04 × 10⁻⁴. The approach also identified key lipids associated with AD, such as Cer(d16:1/22:0) and PI(37:6). INTERPRETATION Our results indicate that c-SWAT is effective in improving classification accuracy and in identifying potential lipid biomarkers for AD. These identified lipids offer new avenues for understanding AD and warrant further investigation. FUNDING The specific funding of this article is provided in the acknowledgements section.
Affiliation(s)
- Taeho Jo
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA; Indiana Alzheimer Disease Research Center, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Junpyo Kim
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA; Medical Research Institute, Sungkyunkwan University, School of Medicine, Seoul, South Korea
| | - Paula Bice
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA; Indiana Alzheimer Disease Research Center, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Kevin Huynh
- Baker Heart and Diabetes Institute, Melbourne, 3004, Victoria, Australia; Baker Department of Cardiometabolic Health, University of Melbourne, Parkville, 3010, Victoria, Australia
| | - Tingting Wang
- Baker Heart and Diabetes Institute, Melbourne, 3004, Victoria, Australia; Baker Department of Cardiometabolic Health, University of Melbourne, Parkville, 3010, Victoria, Australia
| | - Matthias Arnold
- Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, 27710, USA; Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, 85764, Germany
| | - Peter J Meikle
- Baker Heart and Diabetes Institute, Melbourne, 3004, Victoria, Australia; Baker Department of Cardiometabolic Health, University of Melbourne, Parkville, 3010, Victoria, Australia; Monash University, Melbourne, VIC 3800, Australia
| | - Corey Giles
- Baker Heart and Diabetes Institute, Melbourne, 3004, Victoria, Australia; Baker Department of Cardiometabolic Health, University of Melbourne, Parkville, 3010, Victoria, Australia
| | - Rima Kaddurah-Daouk
- Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, 27710, USA; Duke Institute of Brain Sciences, Duke University, Durham, NC, 27710, USA; Department of Medicine, Duke University, Durham, NC, 27710, USA
| | - Andrew J Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA; Indiana Alzheimer Disease Research Center, Indiana University School of Medicine, Indianapolis, IN, 46202, USA; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA.
| | - Kwangsik Nho
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA; Indiana Alzheimer Disease Research Center, Indiana University School of Medicine, Indianapolis, IN, 46202, USA; Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA.
| |
|
16
|
Chaitanuwong P, Singhanetr P, Chainakul M, Arjkongharn N, Ruamviboonsuk P, Grzybowski A. Potential Ocular Biomarkers for Early Detection of Alzheimer's Disease and Their Roles in Artificial Intelligence Studies. Neurol Ther 2023; 12:1517-1532. [PMID: 37468682 PMCID: PMC10444735 DOI: 10.1007/s40120-023-00526-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/06/2023] [Accepted: 07/03/2023] [Indexed: 07/21/2023] Open
Abstract
Alzheimer's disease (AD) is the leading cause of dementia worldwide. Early detection is believed to be essential to disease management because it enables physicians to initiate treatment in patients with early-stage AD (early AD), with the possibility of stopping the disease or slowing disease progression, preserving function and ultimately reducing disease burden. The purpose of this study was to review prior research on the use of eye biomarkers and artificial intelligence (AI) for detecting AD and early AD. The PubMed database was searched to identify studies for review. Ocular biomarkers in AD research and AI research on AD were reviewed and summarized. According to numerous studies, there is a high likelihood that ocular biomarkers can be used to detect early AD: tears, corneal nerves, retina, visual function and, in particular, eye movement tracking have been identified as ocular biomarkers with the potential to detect early AD. However, there is currently no ocular biomarker that can be used to definitely detect early AD. A few studies that used AI with ocular biomarkers to detect AD reported promising results, demonstrating that using AI with ocular biomarkers through multimodal imaging could improve the accuracy of identifying AD patients. This strategy may become a screening tool for detecting early AD in older patients prior to the onset of AD symptoms.
Affiliation(s)
- Pareena Chaitanuwong
- Ophthalmology Department, Rajavithi Hospital, Ministry of Public Health, Bangkok, Thailand
- Department of Ophthalmology, Faculty of Medicine, Rangsit University, Bangkok, Thailand
| | - Panisa Singhanetr
- Mettapracharak Eye Institute, Mettapracharak (Wat Rai Khing) Hospital, Nakhon Pathom, Thailand
| | - Methaphon Chainakul
- Ophthalmology Department, Rajavithi Hospital, Ministry of Public Health, Bangkok, Thailand
- Department of Ophthalmology, Faculty of Medicine, Rangsit University, Bangkok, Thailand
| | - Niracha Arjkongharn
- Ophthalmology Department, Rajavithi Hospital, Ministry of Public Health, Bangkok, Thailand
- Department of Ophthalmology, Faculty of Medicine, Rangsit University, Bangkok, Thailand
| | - Paisan Ruamviboonsuk
- Ophthalmology Department, Rajavithi Hospital, Ministry of Public Health, Bangkok, Thailand
- Department of Ophthalmology, Faculty of Medicine, Rangsit University, Bangkok, Thailand
| | - Andrzej Grzybowski
- Institute of Research in Ophthalmology, Foundation for Ophthalmology Development, Mickiewicza 24/3B, 60-836, Poznan, Poland.
| |
|
17
|
Gyawali PK, Le Guen Y, Liu X, Belloy ME, Tang H, Zou J, He Z. Improving genetic risk prediction across diverse population by disentangling ancestry representations. Commun Biol 2023; 6:964. [PMID: 37736834 PMCID: PMC10517023 DOI: 10.1038/s42003-023-05352-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/20/2022] [Accepted: 09/12/2023] [Indexed: 09/23/2023] Open
Abstract
Risk prediction models using genetic data have seen increasing traction in genomics. However, most polygenic risk models were developed using data from participants with similar (mostly European) ancestry. This can bias the risk predictors, resulting in poor generalization when they are applied to minority populations and admixed individuals such as African Americans. To address this issue, which arises largely because the prediction models are biased by the underlying population structure, we propose a deep-learning framework that leverages data from diverse populations and disentangles ancestry from the phenotype-relevant information in its representation. The ancestry-disentangled representation can be used to build risk predictors that perform better across minority populations. We applied the proposed method to the analysis of Alzheimer's disease genetics. Compared with standard linear and nonlinear risk prediction methods, the proposed method substantially improves risk prediction in minority populations, including admixed individuals, without needing self-reported ancestry information.
Affiliation(s)
- Prashnna K Gyawali
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, USA.
| | - Yann Le Guen
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, USA
- Institut du Cerveau-Paris Brain Institute-ICM, Paris, France
| | - Xiaoxia Liu
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, USA
| | - Michael E Belloy
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, USA
| | - Hua Tang
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - James Zou
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
| | - Zihuai He
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, USA.
- Quantitative Sciences Unit, Department of Medicine (Biomedical Informatics Research), Stanford University, Stanford, CA, USA.
| |
|
18
|
Alatrany AS, Khan W, Hussain AJ, Mustafina J, Al-Jumeily D. Transfer Learning for Classification of Alzheimer's Disease Based on Genome Wide Data. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:2700-2711. [PMID: 37018274 DOI: 10.1109/tcbb.2022.3233869] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
Alzheimer's disease (AD) is a brain disorder regarded as degenerative because its symptoms worsen over time. Single nucleotide polymorphisms (SNPs) have been identified as relevant biomarkers for this condition. This study aims to identify SNP biomarkers associated with AD in order to perform a reliable classification of AD. In contrast to existing related works, we utilize deep transfer learning with varying experimental analysis for reliable classification of AD. For this purpose, convolutional neural networks (CNN) are first trained on the genome-wide association studies (GWAS) dataset requested from the Alzheimer's Disease Neuroimaging Initiative. We then employ deep transfer learning to further train our CNN (as the base model) on a different AD GWAS dataset and extract the final set of features. The extracted features are then fed into a Support Vector Machine for classification of AD. Detailed experiments are performed using multiple datasets and varying experimental configurations. The statistical outcomes indicate an accuracy of 89%, which is a significant improvement when benchmarked against existing related works.
|
19
|
Bhat JA, Feng X, Mir ZA, Raina A, Siddique KHM. Recent advances in artificial intelligence, mechanistic models, and speed breeding offer exciting opportunities for precise and accelerated genomics-assisted breeding. Physiol Plant 2023; 175:e13969. [PMID: 37401892 DOI: 10.1111/ppl.13969] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/10/2023] [Revised: 06/11/2023] [Accepted: 06/27/2023] [Indexed: 07/05/2023]
Abstract
Given the challenges of population growth and climate change, there is an urgent need to expedite the development of high-yielding stress-tolerant crop cultivars. While traditional breeding methods have been instrumental in ensuring global food security, their efficiency, precision, and labour intensiveness have become increasingly inadequate to address present and future challenges. Fortunately, recent advances in high-throughput phenomics and genomics-assisted breeding (GAB) provide a promising platform for enhancing crop cultivars with greater efficiency. However, several obstacles must be overcome to optimize the use of these techniques in crop improvement, such as the complexity of phenotypic analysis of big image data. In addition, the prevalent use of linear models in genome-wide association studies (GWAS) and genomic selection (GS) fails to capture the nonlinear interactions of complex traits, limiting their applicability for GAB and impeding crop improvement. Recent advances in artificial intelligence (AI) techniques have opened doors to nonlinear modelling approaches in crop breeding, enabling the capture of nonlinear and epistatic interactions in GWAS and GS and thus making this variation available for GAB. While statistical and software challenges persist in AI-based models, they are expected to be resolved soon. Furthermore, recent advances in speed breeding have significantly reduced the time (3-5-fold) required for conventional breeding. Thus, integrating speed breeding with AI and GAB could improve crop cultivar development within a considerably shorter timeframe while ensuring greater accuracy and efficiency. In conclusion, this integrated approach could revolutionize crop breeding paradigms and safeguard food production in the face of population growth and climate change.
Affiliation(s)
| | - Xianzhong Feng
- Zhejiang Lab, Hangzhou, China
- Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
| | - Zahoor A Mir
- ICAR-National Bureau of Plant Genetic Resources, New Delhi, India
| | - Aamir Raina
- Department of Botany, Faculty of Life Sciences, Aligarh Muslim University, Aligarh, India
| | - Kadambot H M Siddique
- The UWA Institute of Agriculture and School of Agriculture & Environment, The University of Western Australia, Perth, Western Australia, Australia
| |
|
20
|
Hermes S, Cady J, Armentrout S, O’Connor J, Carlson S, Cruchaga C, Wingo T, Greytak EM. Epistatic Features and Machine Learning Improve Alzheimer's Risk Prediction Over Polygenic Risk Scores. medRxiv 2023:2023.02.10.23285766. [PMID: 36798198 PMCID: PMC9934790 DOI: 10.1101/2023.02.10.23285766] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/18/2023]
Abstract
Background Polygenic risk scores (PRS) are linear combinations of genetic markers weighted by effect size that are commonly used to predict disease risk. For complex heritable diseases such as late onset Alzheimer's disease (LOAD), PRS models fail to capture much of the heritability. Additionally, PRS models are highly dependent on the population structure of data on which effect sizes are assessed, and have poor generalizability to new data. Objective The goal of this study is to construct a paragenic risk score that, in addition to single genetic marker data used in PRS, incorporates epistatic interaction features and machine learning methods to predict lifetime risk for LOAD. Methods We construct a new state-of-the-art genetic model for lifetime risk of Alzheimer's disease. Our approach innovates over PRS models in two ways: First, by directly incorporating epistatic interactions between SNP loci using an evolutionary algorithm guided by shared pathway information; and second, by estimating risk via an ensemble of machine learning models (gradient boosting machines and deep learning) instead of simple logistic regression. We compare the paragenic model to a PRS model from the literature trained on the same dataset. Results The paragenic model is significantly more accurate than the PRS model under 10-fold cross-validation, obtaining an AUC of 83% and near-clinically significant matched sensitivity/specificity of 75%, and remains significantly more accurate when evaluated on an independent holdout dataset. Additionally, the paragenic model maintains accuracy within APOE genotypes. Conclusion Paragenic models show potential for improving lifetime disease risk prediction for complex heritable diseases such as LOAD over PRS models.
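The definition of a PRS given in this abstract (a linear combination of genetic markers weighted by effect size) can be written out as a short sketch. The SNP names, effect sizes, and the product encoding of an interaction feature below are hypothetical; the abstract does not describe how the paragenic model encodes epistatic features, so `interaction_features` is only one simple assumption.

```python
from typing import Dict, Iterable, Mapping, Tuple


def polygenic_risk_score(
    dosages: Mapping[str, float], effect_sizes: Mapping[str, float]
) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2) by per-SNP effect size."""
    # Only SNPs present in both the genotype and the weight table contribute.
    return sum(
        effect_sizes[snp] * dosages[snp] for snp in effect_sizes if snp in dosages
    )


def interaction_features(
    dosages: Mapping[str, float], pairs: Iterable[Tuple[str, str]]
) -> Dict[Tuple[str, str], float]:
    """Assumed encoding: one feature per SNP pair, the product of the two dosages."""
    return {(a, b): dosages[a] * dosages[b] for a, b in pairs}


# Hypothetical SNP identifiers and effect sizes, for illustration only.
weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 0.125}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(person, weights)
pair_features = interaction_features(person, [("snp_a", "snp_b"), ("snp_a", "snp_c")])
print(score, pair_features)
```

A purely additive score cannot represent a case where two variants matter only in combination. Features such as `pair_features` give a downstream model (the abstract mentions gradient boosting machines and deep learning) a way to use such combinations.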
Affiliation(s)
| | - Janet Cady
- Parabon NanoLabs, Inc., Reston, Virginia, USA
| | | | | | | | - Carlos Cruchaga
- Department of Psychiatry, Washington University, St. Louis, MO, USA
- Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University St. Louis, MO, USA
| | - Thomas Wingo
- Goizueta Alzheimer’s Disease Center, Emory University School of Medicine, Atlanta, GA, USA
- Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
| | | | | |
|
21
|
Katzenberger RJ, Ganetzky B, Wassarman DA. Lissencephaly-1 mutations enhance traumatic brain injury outcomes in Drosophila. Genetics 2023; 223:iyad008. [PMID: 36683334 PMCID: PMC9991514 DOI: 10.1093/genetics/iyad008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 11/14/2022] [Accepted: 01/16/2023] [Indexed: 01/24/2023] Open
Abstract
Traumatic brain injury (TBI) outcomes vary greatly among individuals, but most of the variation remains unexplained. Using a Drosophila melanogaster TBI model and 178 genetically diverse lines from the Drosophila Genetic Reference Panel (DGRP), we investigated the role that genetic variation plays in determining TBI outcomes. Following injury at 20-27 days old, DGRP lines varied considerably in mortality within 24 h ("early mortality"). Additionally, the disparity in early mortality resulting from injury at 20-27 vs 0-7 days old differed among DGRP lines. These data support a polygenic basis for differences in TBI outcomes, where some gene variants elicit their effects by acting on aging-related processes. Our genome-wide association study of DGRP lines identified associations between single nucleotide polymorphisms in Lissencephaly-1 (Lis-1) and Patronin and early mortality following injury at 20-27 days old. Lis-1 regulates dynein, a microtubule motor required for retrograde transport of many cargoes, and Patronin protects microtubule minus ends against depolymerization. While Patronin mutants did not affect early mortality, Lis-1 compound heterozygotes (Lis-1x/Lis-1y) had increased early mortality following injury at 20-27 or 0-7 days old compared with Lis-1 heterozygotes (Lis-1x/+), and flies that survived 24 h after injury had increased neurodegeneration but an unaltered lifespan, indicating that Lis-1 affects TBI outcomes independently of effects on aging. These data suggest that Lis-1 activity is required in the brain to ameliorate TBI outcomes through effects on axonal transport, microtubule stability, and other microtubule proteins, such as tau, implicated in chronic traumatic encephalopathy, a TBI-associated neurodegenerative disease in humans.
Affiliation(s)
- Rebeccah J Katzenberger
- Department of Medical Genetics, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Barry Ganetzky
- Department of Genetics, College of Agricultural and Life Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - David A Wassarman
- Department of Medical Genetics, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53706, USA
| |
|
22
|
Singha M, Pu L, Stanfield BA, Uche IK, Rider PJF, Kousoulas KG, Ramanujam J, Brylinski M. Artificial intelligence to guide precision anticancer therapy with multitargeted kinase inhibitors. BMC Cancer 2022; 22:1211. [PMID: 36434556 PMCID: PMC9694576 DOI: 10.1186/s12885-022-10293-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/20/2022] [Accepted: 11/07/2022] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND Vast amounts of rapidly accumulating biological data related to cancer and remarkable progress in the field of artificial intelligence (AI) have paved the way for precision oncology. Our recent contribution to this area of research is CancerOmicsNet, an AI-based system to predict the therapeutic effects of multitargeted kinase inhibitors across various cancers. This approach was previously demonstrated to outperform other deep learning methods, graph kernel models, molecular docking, and drug binding pocket matching. METHODS CancerOmicsNet integrates multiple heterogeneous data by utilizing a deep graph learning model with sophisticated attention propagation mechanisms to extract highly predictive features from cancer-specific networks. The AI-based system was devised to provide more accurate and robust predictions than data-driven therapeutic discovery using gene signature reversion. RESULTS Selected CancerOmicsNet predictions obtained for "unseen" data are positively validated against the biomedical literature and by live-cell time course inhibition assays performed against breast, pancreatic, and prostate cancer cell lines. Encouragingly, six molecules exhibited dose-dependent antiproliferative activities, with pan-CDK inhibitor JNJ-7706621 and Src inhibitor PP1 being the most potent against the pancreatic cancer cell line Panc 04.03. CONCLUSIONS CancerOmicsNet is a promising AI-based platform to help guide the development of new approaches in precision oncology involving a variety of tumor types and therapeutics.
Affiliation(s)
- Manali Singha
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803 USA
- Limeng Pu
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803 USA
- Brent A. Stanfield
  - Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803 USA
- Ifeanyi K. Uche
  - Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803 USA
  - Division of Biotechnology and Molecular Medicine, Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803 USA
  - School of Medicine, Louisiana State University Health Sciences Center, New Orleans, LA 70112 USA
- Paul J. F. Rider
  - Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803 USA
  - Division of Biotechnology and Molecular Medicine, Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803 USA
- Konstantin G. Kousoulas
  - Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803 USA
  - Division of Biotechnology and Molecular Medicine, Department of Pathobiological Sciences, School of Veterinary Medicine, Louisiana State University, Baton Rouge, LA 70803 USA
- J. Ramanujam
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803 USA
  - Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803 USA
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803 USA
  - Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803 USA
|
23
|
Lan AY, Corces MR. Deep learning approaches for noncoding variant prioritization in neurodegenerative diseases. Front Aging Neurosci 2022; 14:1027224. [PMID: 36466610 PMCID: PMC9716280 DOI: 10.3389/fnagi.2022.1027224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 10/24/2022] [Indexed: 11/19/2022] Open
Abstract
Determining how noncoding genetic variants contribute to neurodegenerative dementias is fundamental to understanding disease pathogenesis, improving patient prognostication, and developing new clinical treatments. Next generation sequencing technologies have produced vast amounts of genomic data on cell type-specific transcription factor binding, gene expression, and three-dimensional chromatin interactions, with the promise of providing key insights into the biological mechanisms underlying disease. However, this data is highly complex, making it challenging for researchers to interpret, assimilate, and dissect. To this end, deep learning has emerged as a powerful tool for genome analysis that can capture the intricate patterns and dependencies within these large datasets. In this review, we organize and discuss the many unique model architectures, development philosophies, and interpretation methods that have emerged in the last few years with a focus on using deep learning to predict the impact of genetic variants on disease pathogenesis. We highlight both broadly-applicable genomic deep learning methods that can be fine-tuned to disease-specific contexts as well as existing neurodegenerative disease research, with an emphasis on Alzheimer's-specific literature. We conclude with an overview of the future of the field at the intersection of neurodegeneration, genomics, and deep learning.
Affiliation(s)
- Alexander Y. Lan
  - Gladstone Institute of Neurological Disease, San Francisco, CA, United States
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, United States
  - Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- M. Ryan Corces
  - Gladstone Institute of Neurological Disease, San Francisco, CA, United States
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, United States
  - Department of Neurology, University of California San Francisco, San Francisco, CA, United States
|
24
|
Kalyakulina A, Yusipov I, Bacalini MG, Franceschi C, Vedunova M, Ivanchenko M. Disease classification for whole-blood DNA methylation: Meta-analysis, missing values imputation, and XAI. Gigascience 2022; 11:giac097. [PMID: 36259657 PMCID: PMC9718659 DOI: 10.1093/gigascience/giac097] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/23/2022] [Revised: 08/01/2022] [Accepted: 09/15/2022] [Indexed: 07/25/2023] Open
Abstract
BACKGROUND DNA methylation has a significant effect on gene expression and can be associated with various diseases. Meta-analysis of available DNA methylation datasets requires development of a specific workflow for joint data processing. RESULTS We propose a comprehensive approach that combines DNA methylation datasets to classify controls and patients. The solution includes data harmonization, construction of machine learning classification models, dimensionality reduction of models, imputation of missing values, and explanation of model predictions by explainable artificial intelligence (XAI) algorithms. We show that harmonization can improve classification accuracy by up to 20% when preprocessing methods of the training and test datasets are different. The best accuracy results were obtained with tree ensembles, reaching above 95% for Parkinson's disease. Dimensionality reduction can substantially decrease the number of features, without detriment to the classification accuracy. The best imputation methods achieve almost the same classification accuracy for data with missing values as for the original data. XAI approaches have allowed us to explain model predictions from both populational and individual perspectives. CONCLUSIONS We propose a methodologically valid and comprehensive approach to the classification of healthy individuals and patients with various diseases based on whole-blood DNA methylation data using Parkinson's disease and schizophrenia as examples. The proposed algorithm works better for the former pathology, characterized by a complex set of symptoms. It makes it possible to solve data harmonization problems for meta-analysis of many different datasets, impute missing values, and build classification models of small dimensionality.
Affiliation(s)
- Alena Kalyakulina
  - Correspondence author. Alena Kalyakulina, Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, Gagarin avenue 22, Nizhny Novgorod 603022, Russia. E-mail:
- Claudio Franceschi
  - Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
- Maria Vedunova
  - Institute of Biology and Biomedicine, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
- Mikhail Ivanchenko
  - Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
|
25
|
Classification and Interpretability of Mild Cognitive Impairment Based on Resting-State Functional Magnetic Resonance and Ensemble Learning. Computational Intelligence and Neuroscience 2022; 2022:2535954. [PMID: 36035823 PMCID: PMC9417789 DOI: 10.1155/2022/2535954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 06/12/2022] [Accepted: 07/06/2022] [Indexed: 11/22/2022]
Abstract
The combination and integration of multimodal imaging and clinical markers have introduced numerous classifiers to improve diagnostic accuracy in detecting and predicting AD; however, many studies cannot ensure the homogeneity of data sets and consistency of results. In our study, the XGBoost algorithm was used to classify mild cognitive impairment (MCI) and normal control (NC) populations through five rs-fMRI analysis datasets. Shapley Additive exPlanations (SHAP) is used to analyze the interpretability of the model. The highest accuracy for diagnosing MCI was 65.14% (using the mPerAF dataset). The characteristics of the left insula, right middle frontal gyrus, and right cuneus correlated positively with the output value using DC datasets. The characteristics of left cerebellum 6, right inferior frontal gyrus, opercular part, and vermis 6 correlated positively with the output value using fALFF datasets. The characteristics of the right middle temporal gyrus, left middle temporal gyrus, left temporal pole, and middle temporal gyrus correlated positively with the output value using mPerAF datasets. The characteristics of the right middle temporal gyrus, left middle temporal gyrus, and left hippocampus correlated positively with the output value using PerAF datasets. The characteristics of left cerebellum 9, vermis 9, and right precentral gyrus, right amygdala, and left middle occipital gyrus correlated positively with the output value using Wavelet-ALFF datasets. We found that the XGBoost algorithm constructed from rs-fMRI data is effective for the diagnosis and classification of MCI. The accuracy rates obtained by different rs-fMRI data analysis methods are similar, but the important features are different and involve multiple brain regions, which suggests that MCI may have a negative impact on brain function.
|
26
|
A CAD System for Alzheimer's Disease Classification Using Neuroimaging MRI 2D Slices. Computational and Mathematical Methods in Medicine 2022; 2022:8680737. [PMID: 35983528 PMCID: PMC9381208 DOI: 10.1155/2022/8680737] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/17/2022] [Revised: 04/21/2022] [Accepted: 05/27/2022] [Indexed: 11/22/2022]
Abstract
Developments in medical care have attracted wide interest in the current decade, especially for their role in helping individuals live longer and healthier lives. Alzheimer's disease (AD) is a chronic neurodegenerative disorder that causes dementia. The economic cost of treating AD patients is expected to grow. Developing a computer-aided technique for early AD categorization is therefore increasingly essential. Deep learning (DL) models offer numerous benefits over conventional machine learning tools. Several recent experiments that used brain magnetic resonance imaging (MRI) scans and convolutional neural networks (CNN) for AD classification showed promising results. The CNN's receptive field aids in the extraction of the main recognizable features from these MRI scans. To increase classification accuracy, a new adaptive model based on CNN and support vector machines (SVM) is presented in this research, combining the CNN's capabilities in feature extraction with the SVM's in classification. The objective of this research is to build a hybrid CNN-SVM model for classifying AD using the MRI ADNI dataset. Experimental results reveal that the hybrid CNN-SVM model outperforms the CNN model alone, with relative improvements of 3.4%, 1.09%, 0.85%, and 2.82% on the testing dataset for AD vs. cognitively normal (CN), CN vs. mild cognitive impairment (MCI), AD vs. MCI, and CN vs. MCI vs. AD, respectively. Finally, the proposed approach was further evaluated on the OASIS dataset, reaching an accuracy of 86.2%.
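The abstract above reports "relative improvements" without giving the underlying accuracies, so the figures can be read either as relative gains or as percentage-point differences. The following is a minimal sketch of the two readings, not the authors' computation. The function names and the example accuracies (80.0 and 84.0) are hypothetical and chosen only for illustration.

```python
def absolute_gain(baseline_acc: float, hybrid_acc: float) -> float:
    """Percentage-point difference between two accuracies given in percent."""
    return hybrid_acc - baseline_acc


def relative_gain(baseline_acc: float, hybrid_acc: float) -> float:
    """Gain expressed as a percentage of the baseline accuracy."""
    if baseline_acc <= 0:
        raise ValueError("baseline accuracy must be positive")
    return (hybrid_acc - baseline_acc) / baseline_acc * 100.0


if __name__ == "__main__":
    # Hypothetical accuracies in percent; these are NOT values from the paper.
    baseline, hybrid = 80.0, 84.0
    print(f"absolute gain: {absolute_gain(baseline, hybrid):.2f} percentage points")
    print(f"relative gain: {relative_gain(baseline, hybrid):.2f} % of baseline")
```

The two measures coincide only when the baseline accuracy is exactly 100%, so comparing such figures across papers requires knowing which definition each one uses.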
|