1
Garg M, Karpinski M, Matelska D, Middleton L, Burren OS, Hu F, Wheeler E, Smith KR, Fabre MA, Mitchell J, O'Neill A, Ashley EA, Harper AR, Wang Q, Dhindsa RS, Petrovski S, Vitsios D. Disease prediction with multi-omics and biomarkers empowers case-control genetic discoveries in the UK Biobank. Nat Genet 2024; 56:1821-1831. [PMID: 39261665 PMCID: PMC11390475 DOI: 10.1038/s41588-024-01898-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Accepted: 08/07/2024] [Indexed: 09/13/2024]
Abstract
The emergence of biobank-level datasets offers new opportunities to discover novel biomarkers and develop predictive algorithms for human disease. Here, we present an ensemble machine-learning framework (machine learning with phenotype associations, MILTON) utilizing a range of biomarkers to predict 3,213 diseases in the UK Biobank. Leveraging the UK Biobank's longitudinal health record data, MILTON predicts incident disease cases undiagnosed at time of recruitment, largely outperforming available polygenic risk scores. We further demonstrate the utility of MILTON in augmenting genetic association analyses in a phenome-wide association study of 484,230 genome-sequenced samples, along with 46,327 samples with matched plasma proteomics data. This resulted in improved signals for 88 known (P < 1 × 10⁻⁸) gene-disease relationships alongside 182 gene-disease relationships that did not achieve genome-wide significance in the nonaugmented baseline cohorts. We validated these discoveries in the FinnGen biobank alongside two orthogonal machine-learning methods built for gene-disease prioritization. All extracted gene-disease associations and incident disease predictive biomarkers are publicly available (http://milton.public.cgr.astrazeneca.com).
Affiliation(s)
- Manik Garg
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Marcin Karpinski
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Dorota Matelska
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Lawrence Middleton
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Oliver S Burren
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Fengyuan Hu
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Eleanor Wheeler
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Katherine R Smith
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Margarete A Fabre
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
  - Department of Haematology, University of Cambridge, Cambridge, UK
  - Department of Haematology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Jonathan Mitchell
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Amanda O'Neill
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
  - Clindatapark Ltd, Babraham, Cambridge, UK
- Euan A Ashley
  - Division of Cardiology, Department of Medicine, Stanford University, Palo Alto, CA, USA
- Andrew R Harper
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
  - Clinical Development, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Quanli Wang
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Ryan S Dhindsa
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Slavé Petrovski
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
  - Department of Medicine, Austin Health, University of Melbourne, Melbourne, Victoria, Australia
- Dimitrios Vitsios
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
2
Wenteler A, Cabrera CP, Wei W, Neduva V, Barnes MR. AI approaches for the discovery and validation of drug targets. Cambridge Prisms: Precision Medicine 2024; 2:e7. [PMID: 39258224 PMCID: PMC11383977 DOI: 10.1017/pcm.2024.4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 05/04/2024] [Accepted: 05/08/2024] [Indexed: 09/12/2024]
Abstract
Artificial intelligence (AI) holds immense promise for accelerating and improving all aspects of drug discovery, not least target discovery and validation. By integrating a diverse range of biological data modalities, AI enables the accurate prediction of drug target properties, ultimately illuminating biological mechanisms of disease and guiding drug discovery strategies. Despite the indisputable potential of AI in drug target discovery, there are many challenges and obstacles yet to be overcome, including dealing with data biases, model interpretability and generalisability, and the validation of predicted drug targets, to name a few. By exploring recent advancements in AI, this review showcases current applications of AI for drug target discovery and offers perspectives on the future of AI for the discovery and validation of drug targets, paving the way for the generation of novel and safer pharmaceuticals.
Affiliation(s)
- Aaron Wenteler
  - Digital Environment Research Institute, Queen Mary University of London, London, United Kingdom
  - Centre for Translational Bioinformatics, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
  - MSD Discovery Centre, London, United Kingdom
- Claudia P Cabrera
  - Digital Environment Research Institute, Queen Mary University of London, London, United Kingdom
  - Centre for Translational Bioinformatics, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
  - NIHR Barts Cardiovascular Biomedical Research Centre, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, United Kingdom
- Wei Wei
  - MSD Discovery Centre, London, United Kingdom
- Michael R Barnes
  - Digital Environment Research Institute, Queen Mary University of London, London, United Kingdom
  - Centre for Translational Bioinformatics, William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
  - NIHR Barts Cardiovascular Biomedical Research Centre, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, United Kingdom
  - The Alan Turing Institute, London, United Kingdom
3
Middleton L, Melas I, Vasavda C, Raies A, Rozemberczki B, Dhindsa RS, Dhindsa JS, Weido B, Wang Q, Harper AR, Edwards G, Petrovski S, Vitsios D. Phenome-wide identification of therapeutic genetic targets, leveraging knowledge graphs, graph neural networks, and UK Biobank data. Science Advances 2024; 10:eadj1424. [PMID: 38718126 PMCID: PMC11078195 DOI: 10.1126/sciadv.adj1424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 04/04/2024] [Indexed: 05/12/2024]
Abstract
The ongoing expansion of human genomic datasets propels therapeutic target identification; however, extracting gene-disease associations from gene annotations remains challenging. Here, we introduce Mantis-ML 2.0, a framework integrating AstraZeneca's Biological Insights Knowledge Graph and numerous tabular datasets, to assess gene-disease probabilities throughout the phenome. We use graph neural networks, capturing the graph's holistic structure, and train them on hundreds of balanced datasets via a robust semi-supervised learning framework to provide gene-disease probabilities across the human exome. Mantis-ML 2.0 incorporates natural language processing to automate disease-relevant feature selection for thousands of diseases. The enhanced models demonstrate a 6.9% average classification power boost, achieving a median receiver operating characteristic (ROC) area under curve (AUC) score of 0.90 across 5220 diseases from Human Phenotype Ontology, OpenTargets, and Genomics England. Notably, Mantis-ML 2.0 prioritizes associations from an independent UK Biobank phenome-wide association study (PheWAS), providing a stronger form of triaging and mitigating against underpowered PheWAS associations. Results are exposed through an interactive web resource.
Affiliation(s)
- Lawrence Middleton
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Ioannis Melas
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Chirag Vasavda
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA 02451, USA
- Arwa Raies
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Benedek Rozemberczki
  - Biological Insights Knowledge Graph (BIKG), Research D&A, R&D IT, AstraZeneca, Cambridge, UK
- Ryan S. Dhindsa
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA 02451, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston, TX 77030, USA
- Justin S. Dhindsa
  - Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
- Blake Weido
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Quanli Wang
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA 02451, USA
- Andrew R. Harper
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Gavin Edwards
  - Biological Insights Knowledge Graph (BIKG), Research D&A, R&D IT, AstraZeneca, Cambridge, UK
- Slavé Petrovski
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
  - Department of Medicine, University of Melbourne, Austin Health, Melbourne, Victoria, Australia
- Dimitrios Vitsios
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
4
Sigala RE, Lagou V, Shmeliov A, Atito S, Kouchaki S, Awais M, Prokopenko I, Mahdi A, Demirkan A. Machine Learning to Advance Human Genome-Wide Association Studies. Genes (Basel) 2023; 15:34. [PMID: 38254924 PMCID: PMC10815885 DOI: 10.3390/genes15010034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 12/19/2023] [Accepted: 12/22/2023] [Indexed: 01/24/2024]
Abstract
Machine learning, including deep learning, reinforcement learning, and generative artificial intelligence, is revolutionising every area of our lives where data are made available. With the help of these methods, we can decipher information from larger datasets while addressing the complex nature of biological systems more efficiently. Although machine learning methods were introduced to human genetic epidemiological research as early as 2004, they have never been used to their full capacity. In this review, we outline some of the main applications of machine learning to assigning human genetic loci to health outcomes. We summarise widely used methods and discuss their advantages and challenges. We also identify several tools, such as Combi, GenNet, and GMSTool, specifically designed to integrate these methods for hypothesis-free analysis of genetic variation data. We elaborate on the additional value and limitations of these tools from a geneticist's perspective. Finally, we discuss the fast-moving field of foundation models and large multi-modal omics biobank initiatives.
Affiliation(s)
- Rafaella E. Sigala
  - Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK
- Vasiliki Lagou
  - Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK
- Aleksey Shmeliov
  - Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK
- Sara Atito
  - Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK
  - Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, Surrey, UK
- Samaneh Kouchaki
  - Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK
  - Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, Surrey, UK
- Muhammad Awais
  - Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK
  - Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, Surrey, UK
- Inga Prokopenko
  - Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK
  - Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK
- Adam Mahdi
  - Oxford Internet Institute, University of Oxford, Oxford OX1 3JS, Oxfordshire, UK
- Ayse Demirkan
  - Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK
  - Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK
5
Roman-Naranjo P, Parra-Perez AM, Lopez-Escamez JA. A systematic review on machine learning approaches in the diagnosis and prognosis of rare genetic diseases. J Biomed Inform 2023:104429. [PMID: 37352901 DOI: 10.1016/j.jbi.2023.104429] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/28/2023] [Revised: 06/05/2023] [Accepted: 06/17/2023] [Indexed: 06/25/2023]
Abstract
BACKGROUND The diagnosis of rare genetic diseases is often challenging due to the complexity of the genetic underpinnings of these conditions and the limited availability of diagnostic tools. Machine learning (ML) algorithms have the potential to improve the accuracy and speed of diagnosis by analyzing large amounts of genomic data and identifying complex multiallelic patterns that may be associated with specific diseases. In this systematic review, we aimed to identify the methodological trends and the ML application areas in rare genetic diseases. METHODS We performed a systematic review of the literature following the PRISMA guidelines to search studies that used ML approaches to enhance the diagnosis of rare genetic diseases. Studies that used DNA-based sequencing data and a variety of ML algorithms were included, summarized, and analyzed using bibliometric methods, visualization tools, and a feature co-occurrence analysis. FINDINGS Our search identified 22 studies that met the inclusion criteria. We found that exome sequencing was the most frequently used sequencing technology (59%), and rare neoplastic diseases were the most prevalent disease scenario (59%). In rare neoplasms, the most frequent applications of ML models were the differential diagnosis or stratification of patients (38.5%) and the identification of somatic mutations (30.8%). In other rare diseases, the most frequent goals were the prioritization of rare variants or genes (55.5%) and the identification of biallelic or digenic inheritance (33.3%). The most employed method was the random forest algorithm (54.5%). In addition, the features of the datasets needed for training these algorithms were distinctive depending on the goal pursued, including the mutational load in each gene for the differential diagnosis of patients, or the combination of genotype features and sequence-derived features (such as GC-content) for the identification of somatic mutations. 
CONCLUSIONS ML algorithms based on sequencing data are mainly used for the diagnosis of rare neoplastic diseases, with random forest being the most common approach. We identified key features in the datasets used for training these ML models according to the objective pursued. These features can support the development of future ML models in the diagnosis of rare genetic diseases.
Affiliation(s)
- P Roman-Naranjo
  - Division of Otolaryngology, Department of Surgery, Instituto de Investigación Biosanitaria, ibs.GRANADA, Universidad de Granada, Granada, Spain
  - Otology and Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer, University of Granada, Junta de Andalucía, PTS, Granada, Spain
  - Sensorineural Pathology Programme, Centro de Investigación Biomédica en Red en Enfermedades Raras, CIBERER, Madrid, Spain
- A M Parra-Perez
  - Division of Otolaryngology, Department of Surgery, Instituto de Investigación Biosanitaria, ibs.GRANADA, Universidad de Granada, Granada, Spain
  - Otology and Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer, University of Granada, Junta de Andalucía, PTS, Granada, Spain
  - Sensorineural Pathology Programme, Centro de Investigación Biomédica en Red en Enfermedades Raras, CIBERER, Madrid, Spain
- J A Lopez-Escamez
  - Division of Otolaryngology, Department of Surgery, Instituto de Investigación Biosanitaria, ibs.GRANADA, Universidad de Granada, Granada, Spain
  - Otology and Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer, University of Granada, Junta de Andalucía, PTS, Granada, Spain
  - Sensorineural Pathology Programme, Centro de Investigación Biomédica en Red en Enfermedades Raras, CIBERER, Madrid, Spain
  - Meniere's Disease Neuroscience Research Program, Faculty of Medicine & Health, School of Medical Sciences, The Kolling Institute, University of Sydney, Sydney, New South Wales, Australia
6
Optimal gene prioritization and disease prediction using knowledge based ontology structure. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
7
Predict, diagnose, and treat chronic kidney disease with machine learning: a systematic literature review. J Nephrol 2023; 36:1101-1117. [PMID: 36786976 PMCID: PMC10227138 DOI: 10.1007/s40620-023-01573-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2022] [Accepted: 01/01/2023] [Indexed: 02/15/2023]
Abstract
OBJECTIVES In this systematic review we aimed to assess how artificial intelligence (AI), including machine learning (ML) techniques, has been deployed to predict, diagnose, and treat chronic kidney disease (CKD). We systematically reviewed the available evidence on these innovative techniques to improve CKD diagnosis and patient management. METHODS We included English language studies retrieved from PubMed. The review is therefore to be classified as a "rapid review", since it includes one database only, and has language restrictions; the novelty and importance of the issue make missing relevant papers unlikely. We extracted 16 variables, including: main aim, studied population, data source, sample size, problem type (regression, classification), predictors used, and performance metrics. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach; all main steps were done in duplicate. RESULTS From a total of 648 studies initially retrieved, 68 articles met the inclusion criteria. Models, as reported by authors, performed well, but the reported metrics were not homogeneous across articles and therefore direct comparison was not feasible. The most common aim was prediction of prognosis, followed by diagnosis of CKD. Algorithm generalizability and testing on diverse populations were rarely taken into account. Furthermore, the clinical evaluation and validation of the models/algorithms was limited; only a fraction of the included studies, 6 out of 68, were performed in a clinical context. CONCLUSIONS Machine learning is a promising tool for the prediction of risk, diagnosis, and therapy management for CKD patients. Nonetheless, future work is needed to address the interpretability, generalizability, and fairness of the models to ensure the safe application of such technologies in routine clinical practice.
8
Lee YY, Endale M, Wu G, Ruben MD, Francey LJ, Morris AR, Choo NY, Anafi RC, Smith DF, Liu AC, Hogenesch JB. Integration of genome-scale data identifies candidate sleep regulators. Sleep 2023; 46:zsac279. [PMID: 36462188 PMCID: PMC9905783 DOI: 10.1093/sleep/zsac279] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/21/2022] [Revised: 09/02/2022] [Indexed: 12/05/2022]
Abstract
STUDY OBJECTIVES Genetics impacts sleep, yet the molecular mechanisms underlying sleep regulation remain elusive. In this study, we built machine learning models to predict sleep genes based on their similarity to genes that are known to regulate sleep. METHODS We trained a prediction model on thousands of published datasets, representing circadian, immune, sleep deprivation, and many other processes, using a manually curated list of 109 sleep genes. RESULTS Our predictions fit with prior knowledge of sleep regulation and identified key genes and pathways to pursue in follow-up studies. As an example, we focused on the NF-κB pathway and showed that chronic activation of NF-κB in a genetic mouse model impacted sleep-wake patterns. CONCLUSION Our study highlights the power of machine learning in integrating prior knowledge and genome-wide data to study genetic regulation of complex behaviors such as sleep.
Affiliation(s)
- Yin Yeng Lee
  - Divisions of Human Genetics and Immunobiology, Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, 45229, USA
  - Department of Pharmacology and Systems Physiology, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Mehari Endale
  - Department of Physiology and Aging, University of Florida College of Medicine, Gainesville, FL 32610, USA
- Gang Wu
  - Divisions of Human Genetics and Immunobiology, Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, 45229, USA
- Marc D Ruben
  - Divisions of Human Genetics and Immunobiology, Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, 45229, USA
- Lauren J Francey
  - Divisions of Human Genetics and Immunobiology, Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, 45229, USA
- Andrew R Morris
  - Department of Physiology and Aging, University of Florida College of Medicine, Gainesville, FL 32610, USA
- Natalie Y Choo
  - Division of Pediatric Otolaryngology-Head and Neck Surgery, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
- Ron C Anafi
  - Department of Medicine, Chronobiology and Sleep Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- David F Smith
  - Division of Pediatric Otolaryngology-Head and Neck Surgery, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - Division of Pulmonary Medicine and the Sleep Center, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - Center for Circadian Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - Department of Otolaryngology - Head and Neck Surgery, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Andrew C Liu
  - Department of Physiology and Aging, University of Florida College of Medicine, Gainesville, FL 32610, USA
- John B Hogenesch
  - Divisions of Human Genetics and Immunobiology, Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, 45229, USA
  - Center for Circadian Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
9
Raies A, Tulodziecka E, Stainer J, Middleton L, Dhindsa RS, Hill P, Engkvist O, Harper AR, Petrovski S, Vitsios D. DrugnomeAI is an ensemble machine-learning framework for predicting druggability of candidate drug targets. Commun Biol 2022; 5:1291. [PMID: 36434048 PMCID: PMC9700683 DOI: 10.1038/s42003-022-04245-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/26/2022] [Accepted: 11/09/2022] [Indexed: 11/27/2022]
Abstract
The druggability of targets is a crucial consideration in drug target selection. Here, we adopt a stochastic semi-supervised ML framework to develop DrugnomeAI, which estimates the druggability likelihood for every protein-coding gene in the human exome. DrugnomeAI integrates gene-level properties from 15 sources resulting in 324 features. The tool generates exome-wide predictions based on labelled sets of known drug targets (median AUC: 0.97), highlighting features from protein-protein interaction networks as top predictors. DrugnomeAI provides generic as well as specialised models stratified by disease type or drug therapeutic modality. The top-ranking DrugnomeAI genes were significantly enriched for genes previously selected for clinical development programs (p value < 1 × 10⁻³⁰⁸) and for genes achieving genome-wide significance in phenome-wide association studies of 450K UK Biobank exomes for binary (p value = 1.7 × 10⁻⁵) and quantitative traits (p value = 1.6 × 10⁻⁷). We accompany our method with a web application (http://drugnomeai.public.cgr.astrazeneca.com) to visualise the druggability predictions and the key features that define gene druggability, per disease type and modality.
Affiliation(s)
- Arwa Raies
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Ewa Tulodziecka
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- James Stainer
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Lawrence Middleton
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Ryan S Dhindsa
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, USA
- Pamela Hill
  - Emerging Innovations, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Andrew R Harper
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Slavé Petrovski
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
  - Department of Medicine, University of Melbourne, Austin Health, Melbourne, VIC, Australia
- Dimitrios Vitsios
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
10
Das T, Kaur H, Gour P, Prasad K, Lynn AM, Prakash A, Kumar V. Intersection of network medicine and machine learning towards investigating the key biomarkers and pathways underlying amyotrophic lateral sclerosis: a systematic review. Brief Bioinform 2022; 23:6780269. [PMID: 36411673 DOI: 10.1093/bib/bbac442] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/24/2022] [Revised: 08/12/2022] [Accepted: 09/13/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND Network medicine is an emerging area of research that focuses on delving into the molecular complexity of the disease, leading to the discovery of network biomarkers and therapeutic target discovery. Amyotrophic lateral sclerosis (ALS) is a complicated rare disease with unknown pathogenesis and no available treatment. In ALS, network properties appear to be potential biomarkers that can be beneficial in disease-related applications when explored independently or in tandem with machine learning (ML) techniques. OBJECTIVE This systematic literature review explores recent trends in network medicine and implementations of network-based ML algorithms in ALS. We aim to provide an overview of the identified primary studies and gather details on identifying the potential biomarkers and delineated pathways. METHODS The current study consists of searching for and investigating primary studies from PubMed and Dimensions.ai, published between 2018 and 2022 that reported network medicine perspectives and the coupling of ML techniques. Each abstract and full-text study was individually evaluated, and the relevant studies were finally included in the review for discussion once they met the inclusion and exclusion criteria. RESULTS We identified 109 eligible publications from primary studies representing this systematic review. The data coalesced into two themes: application of network science to identify disease modules and promising biomarkers in ALS, along with network-based ML approaches. CONCLUSION This systematic review gives an overview of the network medicine approaches and implementations of network-based ML algorithms in ALS to determine new disease genes, and identify critical pathways and therapeutic target discovery for personalized treatment.
Affiliation(s)
- Trishala Das
  - School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi-110067, India
- Harbinder Kaur
  - School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi-110067, India
- Pratibha Gour
  - Dept. of Plant Molecular Biology, University of Delhi, South Campus, New Delhi-110021, India
- Kartikay Prasad
  - Amity Institute of Neuropsychology & Neurosciences (AINN), Amity University, Noida, UP-201303, India
- Andrew M Lynn
  - School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi-110067, India
- Amresh Prakash
  - Amity Institute of Integrative Sciences and Health, Amity University Haryana, Gurgaon-122413, India
- Vijay Kumar
  - Amity Institute of Neuropsychology & Neurosciences (AINN), Amity University, Noida, UP-201303, India
| |
Collapse
|
11
|
Nag A, Dhindsa RS, Mitchell J, Vasavda C, Harper AR, Vitsios D, Ahnmark A, Bilican B, Madeyski-Bengtson K, Zarrouki B, Zoghbi AW, Wang Q, Smith KR, Alegre-Díaz J, Kuri-Morales P, Berumen J, Tapia-Conyer R, Emberson J, Torres JM, Collins R, Smith DM, Challis B, Paul DS, Bohlooly-Y M, Snowden M, Baker D, Fritsche-Danielson R, Pangalos MN, Petrovski S. Human genetics uncovers MAP3K15 as an obesity-independent therapeutic target for diabetes. SCIENCE ADVANCES 2022; 8:eadd5430. [PMID: 36383675 PMCID: PMC9668288 DOI: 10.1126/sciadv.add5430] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 06/22/2022] [Accepted: 09/27/2022] [Indexed: 05/30/2023]
Abstract
We performed collapsing analyses on 454,796 UK Biobank (UKB) exomes to detect gene-level associations with diabetes. Recessive carriers of nonsynonymous variants in MAP3K15 were 30% less likely to develop diabetes (P = 5.7 × 10⁻¹⁰) and had lower glycosylated hemoglobin (β = -0.14 SD units, P = 1.1 × 10⁻²⁴). These associations were independent of body mass index, suggesting protection against insulin resistance even in the setting of obesity. We replicated these findings in 96,811 Admixed Americans in the Mexico City Prospective Study (P < 0.05). Moreover, the protective effect of MAP3K15 variants was stronger in individuals who did not carry the Latino-enriched SLC16A11 risk haplotype (P = 6.0 × 10⁻⁴). Separately, we identified a Finnish-enriched MAP3K15 protein-truncating variant associated with decreased odds of both type 1 and type 2 diabetes (P < 0.05) in FinnGen. No adverse phenotypes were associated with protein-truncating MAP3K15 variants in the UKB, supporting this gene as a therapeutic target for diabetes.
Affiliation(s)
- Abhishek Nag
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Ryan S. Dhindsa
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Jonathan Mitchell
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Chirag Vasavda
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Andrew R. Harper
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Dimitrios Vitsios
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Andrea Ahnmark
  - Bioscience Metabolism, Early CVRM, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Bilada Bilican
  - Discovery Biology, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Katja Madeyski-Bengtson
  - Discovery Biology, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Bader Zarrouki
  - Bioscience Metabolism, Early CVRM, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Anthony W. Zoghbi
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Quanli Wang
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Katherine R. Smith
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Jesus Alegre-Díaz
  - Faculty of Medicine, National Autonomous University of Mexico, Copilco Universidad, Coyoacán, 4360 Ciudad de México, Mexico
- Pablo Kuri-Morales
  - Faculty of Medicine, National Autonomous University of Mexico, Copilco Universidad, Coyoacán, 4360 Ciudad de México, Mexico
- Jaime Berumen
  - Faculty of Medicine, National Autonomous University of Mexico, Copilco Universidad, Coyoacán, 4360 Ciudad de México, Mexico
- Roberto Tapia-Conyer
  - Faculty of Medicine, National Autonomous University of Mexico, Copilco Universidad, Coyoacán, 4360 Ciudad de México, Mexico
- Jonathan Emberson
  - Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, England, UK
- Jason M. Torres
  - Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, England, UK
- Rory Collins
  - Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, England, UK
- David M. Smith
  - Emerging Innovations, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Benjamin Challis
  - Translational Science and Experimental Medicine, Early CVRM, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Dirk S. Paul
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Mohammad Bohlooly-Y
  - Discovery Biology, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Mike Snowden
  - Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- David Baker
  - Bioscience Metabolism, Early CVRM, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Slavé Petrovski
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
  - Department of Medicine, University of Melbourne, Austin Health, Melbourne, Victoria, Australia
|
12
|
Buvall L, Menzies RI, Williams J, Woollard KJ, Kumar C, Granqvist AB, Fritsch M, Feliers D, Reznichenko A, Gianni D, Petrovski S, Bendtsen C, Bohlooly-Y M, Haefliger C, Danielson RF, Hansen PBL. Selecting the right therapeutic target for kidney disease. Front Pharmacol 2022; 13:971065. [DOI: 10.3389/fphar.2022.971065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open
Abstract
Kidney disease is a complex disease with several different etiologies and underlying associated pathophysiology. This is reflected by the lack of effective therapies in chronic kidney disease (CKD) that stop disease progression. However, novel strategies, recent scientific breakthroughs, and technological advances have revealed new possibilities for finding novel disease drivers in CKD. This review describes some of the latest advances in the field and brings them together in a more holistic framework as applied to identification and validation of disease drivers in CKD. It uses high-resolution ‘patient-centric’ omics data sets, advanced in silico tools (systems biology, connectivity mapping, and machine learning) and ‘state-of-the-art’ experimental systems (complex 3D systems in vitro, CRISPR gene editing, and various model biological systems in vivo). Application of such a framework is expected to increase the likelihood of successful identification of novel drug candidates based on strong human target validation and a better scientific understanding of underlying mechanisms.
|
13
|
Ye C, Swiers R, Bonner S, Barrett I. A Knowledge Graph-Enhanced Tensor Factorisation Model for Discovering Drug Targets. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3070-3080. [PMID: 35939454 DOI: 10.1109/tcbb.2022.3197320] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
The drug discovery and development process is long and expensive, costing over 1 billion USD on average per drug and taking 10-15 years. To reduce the high levels of attrition throughout the process, there has been growing interest over the past decade in applying machine learning methodologies to various stages of drug discovery and development, especially the earliest stage: identification of druggable disease genes. In this paper, we have developed a new tensor factorisation model to predict potential drug targets (genes or proteins) for treating diseases. We created a three-dimensional data tensor consisting of 1,048 gene targets, 860 diseases and 230,011 evidence attributes and clinical outcomes connecting them, using data extracted from the Open Targets and PharmaProjects databases. We enriched the data with gene target representations learned from a drug discovery-oriented knowledge graph and applied our proposed method to predict the clinical outcomes for unseen gene target and disease pairs. We designed three evaluation strategies to measure the prediction performance and benchmarked several commonly used machine learning classifiers together with Bayesian matrix and tensor factorisation methods. The results show that incorporating knowledge graph embeddings significantly improves the prediction accuracy and that training tensor factorisation alongside a dense neural network outperforms all other baselines. In summary, our framework combines two actively studied machine learning approaches to disease target identification, namely tensor factorisation and knowledge graph representation learning, which could be a promising avenue for further exploration in data-driven drug discovery.
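To make the idea of factorising a gene × disease × evidence tensor concrete, the following is a minimal sketch, not the authors' model. It swaps in a plain rank-2 CP (CANDECOMP/PARAFAC) factorisation fitted by stochastic gradient descent, with no knowledge graph embeddings and no neural network. The toy tensor, the function names (`init_factors`, `train_cp`, `predict`, `mse`) and all hyperparameters are assumptions made for illustration.

```python
import random

def predict(factors, g, d, e):
    """Score one (gene, disease, evidence) cell as a sum over latent components."""
    G, D, E = factors
    return sum(G[g][r] * D[d][r] * E[e][r] for r in range(len(G[g])))

def mse(factors, observed):
    """Mean squared error over the observed cells only."""
    return sum((predict(factors, *idx) - y) ** 2
               for idx, y in observed.items()) / len(observed)

def init_factors(shape, rank, seed=0):
    """Small positive random factor matrices, one per tensor mode."""
    rng = random.Random(seed)
    return [[[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n)]
            for n in shape]

def train_cp(observed, shape, rank=2, lr=0.05, epochs=500, seed=0):
    """Fit a rank-`rank` CP model to the observed cells by SGD on squared error."""
    factors = init_factors(shape, rank, seed)
    rng = random.Random(seed + 1)
    cells = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(cells)
        for (g, d, e), y in cells:
            err = predict(factors, g, d, e) - y
            G, D, E = factors[0][g], factors[1][d], factors[2][e]
            for r in range(rank):
                gr, dr, er = G[r], D[r], E[r]  # snapshot before updating
                G[r] -= lr * err * dr * er
                D[r] -= lr * err * gr * er
                E[r] -= lr * err * gr * dr
    return factors

# Hypothetical toy tensor: 3 genes x 2 diseases x 2 evidence types.
toy_observed = {
    (0, 0, 0): 1.0, (0, 1, 0): 0.0,
    (1, 0, 1): 1.0, (1, 1, 1): 0.0,
    (2, 0, 0): 0.0, (2, 1, 1): 1.0,
}
toy_shape = (3, 2, 2)

fitted = train_cp(toy_observed, toy_shape)
print("training MSE:", mse(fitted, toy_observed))
print("score for unseen cell (0, 0, 1):", predict(fitted, 0, 0, 1))
```

Scoring an unobserved cell such as `(0, 0, 1)` is the analogue of predicting an outcome for an unseen gene target and disease pair; in the paper this step is additionally informed by knowledge graph representations, which this sketch omits.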
|
14
|
Tang Z, Zhang F, Wang Y, Zhang C, Li X, Yin M, Shu J, Yu H, Liu X, Guo Y, Li Z. Diagnosis of hepatocellular carcinoma based on salivary protein glycopatterns and machine learning algorithms. Clin Chem Lab Med 2022; 60:1963-1973. [DOI: 10.1515/cclm-2022-0715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Accepted: 09/08/2022] [Indexed: 11/15/2022]
Abstract
Objectives
Hepatocellular carcinoma (HCC) is difficult to diagnose early and progresses rapidly, making it one of the most deadly malignancies worldwide. This study aimed to evaluate whether salivary glycopattern changes combined with machine learning algorithms could help in the accurate diagnosis of HCC.
Methods
First, we used lectin microarrays to detect alterations in salivary glycopatterns in 118 saliva samples. Subsequently, we constructed diagnostic models for hepatic cirrhosis (HC) and HCC using three machine learning algorithms: Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine (SVM), and Random Forest (RF). Finally, the performance of the diagnostic models was assessed in an independent validation cohort of 85 saliva samples by a series of evaluation metrics, including area under the receiver operating characteristic curve (AUC), accuracy, specificity, and sensitivity.
Results
We identified alterations in the expression levels of salivary glycopatterns in patients with HC and HCC. The results revealed that the glycopatterns recognized by 22 lectins showed significant differences in the saliva of HC and HCC patients and healthy volunteers. In addition, after Boruta feature selection, the best predictive performance was obtained with the RF algorithm for the construction of models for HC and HCC. The AUCs of the RF-HC model and RF-HCC model in the validation cohort were 0.857 (95% confidence interval [CI]: 0.780–0.935) and 0.886 (95% CI: 0.814–0.957), respectively.
Conclusions
Detecting alterations in salivary protein glycopatterns with lectin microarrays combined with machine learning algorithms could be an effective strategy for diagnosing HCC in the future.
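The evaluation metrics named in the Methods (AUC, accuracy, specificity, sensitivity) can be illustrated with a minimal sketch. The labels and scores below are hypothetical and are not the study's data; the function names and the 0.5 decision threshold are assumptions. AUC is computed here through its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity and accuracy at a fixed score threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical validation set: 1 = disease, 0 = healthy volunteer.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

print("AUC:", roc_auc(toy_labels, toy_scores))
print(confusion_metrics(toy_labels, toy_scores))
```

AUC is threshold-free, whereas sensitivity, specificity and accuracy depend on the chosen cut-off, which is one reason studies such as this one usually report them together.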
Affiliation(s)
- Zhen Tang
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
- Fan Zhang
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
- Yuan Wang
  - Department of Infectious Diseases, Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, P.R. China
- Chen Zhang
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
- Xia Li
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
- Mengqi Yin
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
- Jian Shu
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
- Hanjie Yu
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
- Xiawei Liu
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
- Yonghong Guo
  - The infectious disease department, Gongli Hospital, Pudong New Area, Shanghai, P.R. China
- Zheng Li
  - Laboratory for Functional Glycomics, College of Life Sciences, Northwest University, Xi’an, P.R. China
|
15
|
Dhindsa RS, Mattsson J, Nag A, Wang Q, Wain LV, Allen R, Wigmore EM, Ibanez K, Vitsios D, Deevi SVV, Wasilewski S, Karlsson M, Lassi G, Olsson H, Muthas D, Monkley S, Mackay A, Murray L, Young S, Haefliger C, Maher TM, Belvisi MG, Jenkins G, Molyneaux PL, Platt A, Petrovski S. Identification of a missense variant in SPDL1 associated with idiopathic pulmonary fibrosis. Commun Biol 2021; 4:392. [PMID: 33758299 PMCID: PMC7988141 DOI: 10.1038/s42003-021-01910-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 09/10/2020] [Accepted: 02/24/2021] [Indexed: 12/15/2022] Open
Abstract
Idiopathic pulmonary fibrosis (IPF) is a fatal disorder characterised by progressive, destructive lung scarring. Despite substantial progress, the genetic determinants of this disease remain incompletely defined. Using whole genome and whole exome sequencing data from 752 individuals with sporadic IPF and 119,055 UK Biobank controls, we performed a variant-level exome-wide association study (ExWAS) and gene-level collapsing analyses. Our variant-level analysis revealed a novel association between a rare missense variant in SPDL1 and IPF (NM_017785.5:g.169588475 G > A p.Arg20Gln; p = 2.4 × 10⁻⁷, odds ratio = 2.87, 95% confidence interval: 2.03-4.07). This signal was independently replicated in the FinnGen cohort, which contains 1028 cases and 196,986 controls (combined p = 2.2 × 10⁻²⁰), firmly establishing this variant as an IPF risk allele. SPDL1 encodes Spindly, a protein involved in mitotic checkpoint signalling during cell division that has not been previously described in fibrosis. To the best of our knowledge, these results highlight a novel mechanism underlying IPF, providing the potential for new therapeutic discoveries in a disease of great unmet need.
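For readers unfamiliar with how an odds ratio and its 95% confidence interval arise from case-control carrier counts, here is a minimal sketch. The counts are hypothetical and are not taken from the study, and the abstract does not state how its interval was computed; the Woolf (log-scale Wald) interval used below is an assumption chosen for simplicity.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio from a 2x2 table with a Woolf (log-scale Wald) interval.

    z=1.96 gives an approximate 95% interval. All four cells must be
    positive, because the standard error uses their reciprocals.
    """
    a, b, c, d = (case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 10 of 100 cases and 5 of 100 controls carry the variant.
or_, lo, hi = odds_ratio_ci(10, 90, 5, 95)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

Because the interval is symmetric on the log scale, the odds ratio is the geometric mean of its two bounds. For rare variants with very small cell counts, exact methods are generally preferred over this approximation.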
Affiliation(s)
- Ryan S Dhindsa
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Johan Mattsson
  - Translational Science & Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Abhishek Nag
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Quanli Wang
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Louise V Wain
  - Genetic Epidemiology Group, Department of Health Sciences George Davies Centre, University of Leicester, Leicester, UK
  - National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Richard Allen
  - Genetic Epidemiology Group, Department of Health Sciences George Davies Centre, University of Leicester, Leicester, UK
- Eleanor M Wigmore
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Kristina Ibanez
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Dimitrios Vitsios
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Sri V V Deevi
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Sebastian Wasilewski
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Maria Karlsson
  - Lung Regeneration, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Glenda Lassi
  - Translational Science & Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Henric Olsson
  - Translational Science & Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Daniel Muthas
  - Translational Science & Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Susan Monkley
  - Translational Science & Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Alex Mackay
  - Translational Science & Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Lynne Murray
  - Lung Regeneration, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Simon Young
  - Precision Medicine and Biosamples, Oncology R&D, AstraZeneca, Cambridge, UK
- Carolina Haefliger
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Toby M Maher
  - Royal Brompton Hospital, London, UK
  - Hastings Centre for Pulmonary Research and Division of Pulmonary, Critical Care and Sleep Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Maria G Belvisi
  - National Heart and Lung Institute, Imperial College, London, UK
  - Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
  - Respiratory Pharmacology Group, London, UK
- Gisli Jenkins
  - Respiratory Research Unit, Division of Respiratory Medicine, University of Nottingham, Nottingham, UK
  - National Institute for Health Research, Nottingham Biomedical Research Centre, Nottingham University Hospitals NHS Trust, Nottingham, UK
- Philip L Molyneaux
  - Royal Brompton Hospital, London, UK
  - National Heart and Lung Institute, Imperial College, London, UK
- Adam Platt
  - Translational Science & Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Slavé Petrovski
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
|
16
|
Vasilopoulou C, Morris AP, Giannakopoulos G, Duguez S, Duddy W. What Can Machine Learning Approaches in Genomics Tell Us about the Molecular Basis of Amyotrophic Lateral Sclerosis? J Pers Med 2020; 10:E247. [PMID: 33256133 PMCID: PMC7712791 DOI: 10.3390/jpm10040247] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/30/2020] [Revised: 11/21/2020] [Accepted: 11/23/2020] [Indexed: 02/07/2023] Open
Abstract
Amyotrophic Lateral Sclerosis (ALS) is the most common late-onset motor neuron disorder, but the molecular mechanisms and pathways underlying this disease remain elusive. This review (1) systematically identifies machine learning studies aimed at the understanding of the genetic architecture of ALS, (2) outlines the main challenges faced and compares the different approaches that have been used to confront them, and (3) compares the experimental designs and results produced by those approaches and describes their reproducibility in terms of biological results and the performances of the machine learning models. The majority of the collected studies incorporated prior knowledge of ALS into their feature selection approaches, and trained their machine learning models using genomic data combined with other types of mined knowledge including functional associations, protein-protein interactions, disease/tissue-specific information, epigenetic data, and known ALS phenotype-genotype associations. The importance of incorporating gene-gene interactions and cis-regulatory elements into the experimental design of future ALS machine learning studies is highlighted. Lastly, it is suggested that future advances in the genomic and machine learning fields will bring about a better understanding of ALS genetic architecture, and enable improved personalized approaches to this and other devastating and complex diseases.
Affiliation(s)
- Christina Vasilopoulou
  - Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry BT47 6SB, UK (C.V., S.D.)
- Andrew P. Morris
  - Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, University of Manchester, Manchester M13 9PT, UK
- George Giannakopoulos
  - Institute of Informatics and Telecommunications, NCSR Demokritos, 153 10 Aghia Paraskevi, Greece
  - Science For You (SciFY) PNPC, TEPA Lefkippos-NCSR Demokritos, 27, Neapoleos, 153 41 Ag. Paraskevi, Greece
- Stephanie Duguez
  - Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry BT47 6SB, UK (C.V., S.D.)
- William Duddy
  - Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry BT47 6SB, UK (C.V., S.D.)
|
17
|
Carss KJ, Baranowska AA, Armisen J, Webb TR, Hamby SE, Premawardhana D, Al-Hussaini A, Wood A, Wang Q, Deevi SVV, Vitsios D, Lewis SH, Kotecha D, Bouatia-Naji N, Hesselson S, Iismaa SE, Tarr I, McGrath-Cadell L, Muller DW, Dunwoodie SL, Fatkin D, Graham RM, Giannoulatou E, Samani NJ, Petrovski S, Haefliger C, Adlam D. Spontaneous Coronary Artery Dissection: Insights on Rare Genetic Variation From Genome Sequencing. CIRCULATION-GENOMIC AND PRECISION MEDICINE 2020; 13:e003030. [PMID: 33125268 PMCID: PMC7748045 DOI: 10.1161/circgen.120.003030] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Indexed: 02/06/2023]
Abstract
Spontaneous coronary artery dissection (SCAD) occurs when an epicardial coronary artery is narrowed or occluded by an intramural hematoma. SCAD mainly affects women and is associated with pregnancy and systemic arteriopathies, particularly fibromuscular dysplasia. Variants in several genes, such as those causing connective tissue disorders, have been implicated; however, the genetic architecture is poorly understood. Here, we aim to better understand the diagnostic yield of rare variant genetic testing among a cohort of SCAD survivors and to identify genes or gene sets that have a significant enrichment of rare variants.
Affiliation(s)
- Keren J Carss
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca (K.J.C., J.A., Q.W., S.V.V.D., D.V., S.H.L., S.P., C.H.)
- Anna A Baranowska
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
- Javier Armisen
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca (K.J.C., J.A., Q.W., S.V.V.D., D.V., S.H.L., S.P., C.H.)
- Tom R Webb
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
- Stephen E Hamby
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
- Diluka Premawardhana
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
- Abtehale Al-Hussaini
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
- Alice Wood
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
- Quanli Wang
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca (K.J.C., J.A., Q.W., S.V.V.D., D.V., S.H.L., S.P., C.H.)
- Sri V V Deevi
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca (K.J.C., J.A., Q.W., S.V.V.D., D.V., S.H.L., S.P., C.H.)
- Dimitrios Vitsios
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca (K.J.C., J.A., Q.W., S.V.V.D., D.V., S.H.L., S.P., C.H.)
- Samuel H Lewis
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca (K.J.C., J.A., Q.W., S.V.V.D., D.V., S.H.L., S.P., C.H.)
- Deevia Kotecha
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
- Nabila Bouatia-Naji
  - Université de Paris, Inserm UMR 970 - Paris, Centre de Recherche Cardiovasculaire, France (N.B.-N)
- Stephanie Hesselson
  - Victor Chang Cardiac Research Institute, Darlinghurst (S.H., S.E.I., I.T., D.W.M., S.L.D., D.F., R.M.G., E.G.)
- Siiri E Iismaa
  - Victor Chang Cardiac Research Institute, Darlinghurst (S.H., S.E.I., I.T., D.W.M., S.L.D., D.F., R.M.G., E.G.)
  - St Vincent's Clinical School, University of NSW Sydney, Kensington (S.E.I., L.M.-C., D.W.M., S.L.D., D.F., R.M.G., E.G.)
- Ingrid Tarr
  - Victor Chang Cardiac Research Institute, Darlinghurst (S.H., S.E.I., I.T., D.W.M., S.L.D., D.F., R.M.G., E.G.)
- Lucy McGrath-Cadell
  - St Vincent's Clinical School, University of NSW Sydney, Kensington (S.E.I., L.M.-C., D.W.M., S.L.D., D.F., R.M.G., E.G.)
- David W Muller
  - Victor Chang Cardiac Research Institute, Darlinghurst (S.H., S.E.I., I.T., D.W.M., S.L.D., D.F., R.M.G., E.G.)
  - St Vincent's Clinical School, University of NSW Sydney, Kensington (S.E.I., L.M.-C., D.W.M., S.L.D., D.F., R.M.G., E.G.)
- Sally L Dunwoodie
  - Victor Chang Cardiac Research Institute, Darlinghurst (S.H., S.E.I., I.T., D.W.M., S.L.D., D.F., R.M.G., E.G.)
  - St Vincent's Clinical School, University of NSW Sydney, Kensington (S.E.I., L.M.-C., D.W.M., S.L.D., D.F., R.M.G., E.G.)
- Diane Fatkin
  - Victor Chang Cardiac Research Institute, Darlinghurst (S.H., S.E.I., I.T., D.W.M., S.L.D., D.F., R.M.G., E.G.)
  - St Vincent's Clinical School, University of NSW Sydney, Kensington (S.E.I., L.M.-C., D.W.M., S.L.D., D.F., R.M.G., E.G.)
  - Cardiology Department, St Vincent's Hospital, Darlinghurst, NSW, Australia (D.F.)
- Robert M Graham
  - Victor Chang Cardiac Research Institute, Darlinghurst (S.H., S.E.I., I.T., D.W.M., S.L.D., D.F., R.M.G., E.G.)
  - St Vincent's Clinical School, University of NSW Sydney, Kensington (S.E.I., L.M.-C., D.W.M., S.L.D., D.F., R.M.G., E.G.)
- Eleni Giannoulatou
  - Victor Chang Cardiac Research Institute, Darlinghurst (S.H., S.E.I., I.T., D.W.M., S.L.D., D.F., R.M.G., E.G.)
  - St Vincent's Clinical School, University of NSW Sydney, Kensington (S.E.I., L.M.-C., D.W.M., S.L.D., D.F., R.M.G., E.G.)
- Nilesh J Samani
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
- Slavé Petrovski
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca (K.J.C., J.A., Q.W., S.V.V.D., D.V., S.H.L., S.P., C.H.)
- Carolina Haefliger
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca (K.J.C., J.A., Q.W., S.V.V.D., D.V., S.H.L., S.P., C.H.)
- David Adlam
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (A.A.B., T.R.W., S.E.H., D.P., A.A.-H., A.W., D.K., N.J.S., D.A.)
|