1
Nóbrega IDS, Teles e Silva AL, Yokota-Moreno BY, Sertié AL. The Importance of Large-Scale Genomic Studies to Unravel Genetic Risk Factors for Autism. Int J Mol Sci 2024; 25:5816. [PMID: 38892002 PMCID: PMC11172008 DOI: 10.3390/ijms25115816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 05/17/2024] [Accepted: 05/21/2024] [Indexed: 06/21/2024]
Abstract
Autism spectrum disorder (ASD) is a common and highly heritable neurodevelopmental disorder. During the last 15 years, advances in genomic technologies and the availability of increasingly large patient cohorts have greatly expanded our knowledge of the genetic architecture of ASD and its neurobiological mechanisms. Over two hundred risk regions and genes carrying rare de novo and transmitted high-impact variants have been identified. Additionally, common variants with small individual effect size are also important, and a number of loci are now being uncovered. At the same time, these new insights have highlighted ongoing challenges. In this perspective article, we summarize developments in ASD genetic research and address the enormous impact of large-scale genomic initiatives on ASD gene discovery.
Affiliation(s)
- Andréa Laurato Sertié
  - Faculdade Israelita de Ciências da Saúde Albert Einstein, Hospital Israelita Albert Einstein, Rua Comendador Elias Jafet, 755. Morumbi, São Paulo 05653-000, Brazil; (I.d.S.N.); (A.L.T.e.S.); (B.Y.Y.-M.)
2
Zadok N, Ast G, Sharan R. A network-based method for associating genes with autism spectrum disorder. Frontiers in Bioinformatics 2024; 4:1295600. [PMID: 38525240 PMCID: PMC10960359 DOI: 10.3389/fbinf.2024.1295600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Accepted: 02/26/2024] [Indexed: 03/26/2024]
Abstract
Autism spectrum disorder (ASD) is a highly heritable complex disease that affects 1% of the population, yet its underlying molecular mechanisms are largely unknown. Here we study the problem of predicting causal genes for ASD by combining genome-scale data with a network propagation approach. We construct a predictor that integrates multiple omic data sets that assess genomic, transcriptomic, proteomic, and phosphoproteomic associations with ASD. In cross-validation, our predictor yields a mean area under the ROC curve of 0.87 and an area under the precision-recall curve of 0.89. We further show that it outperforms previous gene-level predictors of autism association. Finally, we show that the model can be used to predict genes associated with schizophrenia, which is known to share genetic components with ASD.
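The abstract above does not say which propagation scheme is used, so the following is only a minimal sketch of one common variant, random walk with restart, on a toy network. The gene names, the `restart` value, and the helper `propagate` are all hypothetical and are not taken from the paper.

```python
def propagate(adjacency, seeds, restart=0.5, iterations=200, tol=1e-10):
    """Random walk with restart on an undirected graph {node: set(neighbors)}.

    Assumes every node has at least one neighbor.
    """
    nodes = sorted(adjacency)
    # The walker restarts uniformly at the seed (known-association) genes.
    prior = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(prior)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor passes on its score, split evenly across its edges.
            inflow = sum(scores[m] / len(adjacency[m]) for m in adjacency[n])
            updated[n] = restart * prior[n] + (1.0 - restart) * inflow
        delta = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tol:
            break
    return scores


# Toy network with placeholder gene names (not real data).
toy_network = {
    "GENE_A": {"GENE_B", "GENE_C"},
    "GENE_B": {"GENE_A", "GENE_C"},
    "GENE_C": {"GENE_A", "GENE_B", "GENE_D"},
    "GENE_D": {"GENE_C", "GENE_E"},
    "GENE_E": {"GENE_D"},
}
scores = propagate(toy_network, seeds={"GENE_A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Genes closer to the seed in the network end up with higher scores, which is the intuition behind using propagation to rank candidate genes.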
Affiliation(s)
- Neta Zadok
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Gil Ast
  - Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Roded Sharan
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
3
Zhong G, Choi YA, Shen Y. VBASS enables integration of single cell gene expression data in Bayesian association analysis of rare variants. Commun Biol 2023; 6:774. [PMID: 37491581 PMCID: PMC10368729 DOI: 10.1038/s42003-023-05155-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/28/2022] [Accepted: 07/18/2023] [Indexed: 07/27/2023]
Abstract
Rare or de novo variants contribute substantially to human diseases, but the statistical power to identify risk genes from rare variants is generally low owing to the sparsity of genotype data. Previous studies have shown that risk genes usually have high expression in relevant cell types, although for many conditions the identity of these cell types is largely unknown. Recent single-cell atlas efforts in human and model organisms have produced large amounts of gene expression data. Here we present VBASS, a Bayesian method that integrates single-cell expression and de novo variant (DNV) data to improve the power of disease risk gene discovery. VBASS models the disease risk prior as a function of expression profiles, approximated by deep neural networks. It learns the weights of the neural networks and the parameters of Gamma-Poisson likelihood models of DNV counts jointly from expression and genetics data. On simulated data, VBASS shows proper error rate control and better power than state-of-the-art methods. We applied VBASS to published datasets and identified more candidate risk genes with support from the literature or from data in independent cohorts. VBASS can be generalized to integrate other types of functional genomics data in statistical genetics analysis.
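To make the phrase "Gamma-Poisson likelihood" concrete, here is a minimal sketch of the textbook result that a Poisson count whose rate follows a Gamma distribution has a negative binomial marginal. It is not the VBASS model itself: the shape and rate values, and the two-component comparison in `log_bayes_factor`, are illustrative assumptions only.

```python
import math


def gamma_poisson_logpmf(k, shape, rate):
    """log P(k) when k ~ Poisson(lam) and lam ~ Gamma(shape, rate)."""
    return (
        math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
        + shape * math.log(rate / (rate + 1.0))
        - k * math.log(rate + 1.0)
    )


def log_bayes_factor(k, risk=(2.0, 20.0), background=(2.0, 200.0)):
    """Log evidence ratio for a hypothetical 'risk' vs 'background' gene.

    Both (shape, rate) pairs are made-up values; the Gamma mean is shape / rate,
    so the 'risk' component expects ten times more variants per gene.
    """
    return gamma_poisson_logpmf(k, *risk) - gamma_poisson_logpmf(k, *background)


total_probability = sum(math.exp(gamma_poisson_logpmf(k, 2.0, 20.0)) for k in range(200))
evidence_three_variants = log_bayes_factor(3)
evidence_no_variants = log_bayes_factor(0)
```

Under these toy parameters, observing several de novo variants in a gene favors the risk component, while observing none slightly favors the background component.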
Affiliation(s)
- Guojie Zhong
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University Irving Medical Center, New York, NY, USA
- Yoolim A Choi
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Yufeng Shen
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
  - JP Sulzberger Columbia Genome Center, Columbia University Irving Medical Center, New York, NY, USA
4
Beyreli I, Karakahya O, Cicek AE. DeepND: Deep multitask learning of gene risk for comorbid neurodevelopmental disorders. Patterns 2022; 3:100524. [PMID: 35845835 PMCID: PMC9278518 DOI: 10.1016/j.patter.2022.100524] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/11/2022] [Revised: 03/11/2022] [Accepted: 05/09/2022] [Indexed: 01/24/2023]
Abstract
Autism spectrum disorder and intellectual disability are comorbid neurodevelopmental disorders with complex genetic architectures. Despite large-scale sequencing studies, only a fraction of the risk genes was identified for both. We present a network-based gene risk prioritization algorithm, DeepND, that performs cross-disorder analysis to improve prediction by exploiting the comorbidity of autism spectrum disorder (ASD) and intellectual disability (ID) via multitask learning. Our model leverages information from human brain gene co-expression networks using graph convolutional networks, learning which spatiotemporal neurodevelopmental windows are important for disorder etiologies and improving the state-of-the-art prediction in single- and cross-disorder settings. DeepND identifies the prefrontal and motor-somatosensory cortex (PFC-MFC) brain region and periods from early- to mid-fetal and from early childhood to young adulthood as the highest neurodevelopmental risk windows for ASD and ID. We investigate ASD- and ID-associated copy-number variation (CNV) regions and report our findings for several susceptibility gene candidates. DeepND can be generalized to analyze any combinations of comorbid disorders.
- DeepND can co-analyze comorbid neurodevelopmental disorders to discover risk genes
- The approach employs multitask learning to learn shared and disorder-specific weights
- DeepND uses graph convolution to process gene interactions in multiple networks
- The model includes a mixture-of-experts model to detect informative networks
While risk-gene-discovery algorithms have complemented exome/genome-sequencing studies of neurodevelopmental disorders, they are not capable of co-analyzing multiple comorbid conditions like autism and intellectual disability. A common approach is analyzing disorders one by one and comparing the outcomes. With this approach, the method does not utilize cross-disorder interactions and is bound by limited evidence per disorder. We address this gap with a technique, Deep Neurodevelopmental Disorders (DeepND), that uses multitask learning to co-analyze data from multiple disorders to learn shared and disorder-specific patterns. DeepND includes graph convolutional neural networks that process gene-interaction information from multiple networks. DeepND also learns which networks are important for disorder etiologies. Based on this, we propose an interpretable risk-gene-discovery algorithm for neuropsychiatric disorders.
Affiliation(s)
- Ilayda Beyreli
  - Department of Computer Engineering, Bilkent University, Ankara 06810, Turkey
- Oguzhan Karakahya
  - Department of Computer Engineering, Bilkent University, Ankara 06810, Turkey
- A. Ercument Cicek
  - Department of Computer Engineering, Bilkent University, Ankara 06810, Turkey
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, 15213 PA, USA
  - Corresponding author
5
Jiang Y, Urresti J, Pagel KA, Pramod AB, Iakoucheva LM, Radivojac P. Prioritizing de novo autism risk variants with calibrated gene- and variant-scoring models. Hum Genet 2021; 141:1595-1613. [PMID: 34549350 DOI: 10.1007/s00439-021-02356-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2021] [Accepted: 08/26/2021] [Indexed: 12/17/2022]
Abstract
Whole-exome and whole-genome sequencing studies in autism spectrum disorder (ASD) have identified hundreds of thousands of exonic variants. Only a handful of them, primarily loss-of-function variants, have been shown to increase the risk for ASD, while the contributory roles of other variants, including most missense variants, remain unknown. New approaches that combine tissue-specific molecular profiles with patients' genetic data can thus play an important role in elucidating the functional impact of exonic variation and improve understanding of ASD pathogenesis. Here, we integrate spatio-temporal gene co-expression networks from the developing human brain and protein-protein interaction networks to first reach accurate prioritization of ASD risk genes based on their connectivity patterns with previously known high-confidence ASD risk genes. We subsequently integrate these gene scores with variant pathogenicity predictions to further prioritize individual exonic variants based on the positive-unlabeled learning framework with gene- and variant-score calibration. We demonstrate that this approach discriminates among variants between cases and controls at the high end of the prediction range. Finally, we experimentally validate our top-scoring de novo mutation NP_001243143.1:p.Phe309Ser in the sodium/potassium-transporting ATPase ATP1A3 to disrupt protein binding with different partners.
Affiliation(s)
- Yuxiang Jiang
  - Department of Computer Science, Indiana University, Bloomington, IN, USA
- Jorge Urresti
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Kymberleigh A Pagel
  - Department of Computer Science, Indiana University, Bloomington, IN, USA
  - Institute for Computational Medicine, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Akula Bala Pramod
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Lilia M Iakoucheva
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
6
Secreted Reporter Assay Enables Quantitative and Longitudinal Monitoring of Neuronal Activity. eNeuro 2021; 8:ENEURO.0518-20.2021. [PMID: 34531280 PMCID: PMC8489021 DOI: 10.1523/eneuro.0518-20.2021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/30/2020] [Revised: 08/23/2021] [Accepted: 09/08/2021] [Indexed: 11/21/2022]
Abstract
The ability to measure changes in neuronal activity in a quantifiable and precise manner is of fundamental importance to understand neuron development and function. Repeated monitoring of neuronal activity of the same population of neurons over several days is challenging and, typically, low-throughput. Here, we describe a new biochemical reporter assay that allows for repeated measurements of neuronal activity in a cell type-specific manner. We coupled activity-dependent elements from the Arc/Arg3.1 gene with a secreted reporter, Gaussia luciferase (Gluc), to quantify neuronal activity without sacrificing the neurons. The reporter predominantly senses calcium and NMDA receptor (NMDAR)-dependent activity. By repeatedly measuring the accumulation of the reporter in cell media, we can profile the developmental dynamics of neuronal activity in cultured neurons from male and female mice. The assay also allows for longitudinal analysis of pharmacological treatments, thus distinguishing acute from delayed responses. Moreover, conditional expression of the reporter allows for monitoring cell type-specific changes. This simple, quantitative, cost-effective, automatable, and cell type-specific activity reporter is a valuable tool to study the development of neuronal activity in normal and disease-model conditions, and to identify small molecules or protein factors that selectively modulate the activity of a specific population of neurons.
7
Gunning M, Pavlidis P. "Guilt by association" is not competitive with genetic association for identifying autism risk genes. Sci Rep 2021; 11:15950. [PMID: 34354131 PMCID: PMC8342445 DOI: 10.1038/s41598-021-95321-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2021] [Accepted: 07/16/2021] [Indexed: 12/25/2022]
Abstract
Discovering genes involved in complex human genetic disorders is a major challenge. Many have suggested that machine learning (ML) algorithms using gene networks can be used to supplement traditional genetic association-based approaches to predict or prioritize disease genes. However, questions have been raised about the utility of ML methods for this type of task due to biases within the data, and poor real-world performance. Using autism spectrum disorder (ASD) as a test case, we sought to investigate the question: can machine learning aid in the discovery of disease genes? We collected 13 published ASD gene prioritization studies and evaluated their performance using known and novel high-confidence ASD genes. We also investigated their biases towards generic gene annotations, like number of association publications. We found that ML methods which do not incorporate genetics information have limited utility for prioritization of ASD risk genes. These studies perform at a comparable level to generic measures of likelihood for the involvement of genes in any condition, and do not out-perform genetic association studies. Future efforts to discover disease genes should be focused on developing and validating statistical models for genetic association, specifically for association between rare variants and disease, rather than developing complex machine learning methods using complex heterogeneous biological data with unknown reliability.
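The kind of comparison described above, a gene-prioritization score against a generic baseline on a set of known genes, can be sketched with a rank-based AUROC. This is an illustrative sketch only: the gene labels, scores, and the "publication count" baseline values are invented, and the study's actual evaluation protocol is not reproduced here.

```python
def auroc(scores, positives):
    """Probability that a random positive gene outscores a random negative one.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical inputs: a predictor's scores and a generic per-gene annotation.
known_genes = {"G1", "G3"}
predictor_scores = {"G1": 0.9, "G2": 0.8, "G3": 0.4, "G4": 0.3, "G5": 0.1}
publication_counts = {"G1": 50, "G2": 40, "G3": 5, "G4": 30, "G5": 1}

predictor_auc = auroc(predictor_scores, known_genes)
baseline_auc = auroc(publication_counts, known_genes)
```

A predictor whose AUROC is close to that of a generic baseline such as publication count adds little disease-specific information, which is the concern the abstract raises.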
Affiliation(s)
- Margot Gunning
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Paul Pavlidis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
  - Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
8
Zhu N, Swietlik EM, Welch CL, Pauciulo MW, Hagen JJ, Zhou X, Guo Y, Karten J, Pandya D, Tilly T, Lutz KA, Martin JM, Treacy CM, Rosenzweig EB, Krishnan U, Coleman AW, Gonzaga-Jauregui C, Lawrie A, Trembath RC, Wilkins MR, Morrell NW, Shen Y, Gräf S, Nichols WC, Chung WK. Rare variant analysis of 4241 pulmonary arterial hypertension cases from an international consortium implicates FBLN2, PDGFD, and rare de novo variants in PAH. Genome Med 2021; 13:80. [PMID: 33971972 PMCID: PMC8112021 DOI: 10.1186/s13073-021-00891-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 06/18/2020] [Accepted: 04/19/2021] [Indexed: 12/30/2022]
Abstract
BACKGROUND Pulmonary arterial hypertension (PAH) is a lethal vasculopathy characterized by pathogenic remodeling of pulmonary arterioles leading to increased pulmonary pressures, right ventricular hypertrophy, and heart failure. PAH can be associated with other diseases (APAH: connective tissue diseases, congenital heart disease, and others) but often the etiology is idiopathic (IPAH). Mutations in bone morphogenetic protein receptor 2 (BMPR2) are the cause of most heritable cases but the vast majority of other cases are genetically undefined. METHODS To identify new risk genes, we utilized an international consortium of 4241 PAH cases with exome or genome sequencing data from the National Biological Sample and Data Repository for PAH, Columbia University Irving Medical Center, and the UK NIHR BioResource - Rare Diseases Study. The strength of this combined cohort is a doubling of the number of IPAH cases compared to either national cohort alone. We identified protein-coding variants and performed rare variant association analyses in unrelated participants of European ancestry, including 1647 IPAH cases and 18,819 controls. We also analyzed de novo variants in 124 pediatric trios enriched for IPAH and APAH-CHD. RESULTS Seven genes with rare deleterious variants were associated with IPAH with false discovery rate smaller than 0.1: three known genes (BMPR2, GDF2, and TBX4), two recently identified candidate genes (SOX17, KDR), and two new candidate genes (fibulin 2, FBLN2; platelet-derived growth factor D, PDGFD). The new genes were identified based solely on rare deleterious missense variants, a variant type that could not be adequately assessed in either cohort alone. The candidate genes exhibit expression patterns in lung and heart similar to that of known PAH risk genes, and most variants occur in conserved protein domains. 
For pediatric PAH, predicted deleterious de novo variants exhibited a significant burden compared to the background mutation rate (2.45×, p = 2.5e-5). At least eight novel pediatric candidate genes carrying de novo variants have plausible roles in lung/heart development. CONCLUSIONS Rare variant analysis of a large international consortium identified two new candidate genes, FBLN2 and PDGFD. The new genes have known functions in vasculogenesis and remodeling. Trio analysis predicted that ~15% of pediatric IPAH may be explained by de novo variants.
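A burden comparison against a background mutation rate is often framed as a one-sided Poisson test: how likely is it to see at least the observed number of de novo variants if the expected number is set by the background rate? The sketch below shows that calculation with invented counts; the abstract does not give the underlying counts or name the exact test, so neither the numbers nor the choice of test should be read as the study's own.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected), with expected > 0."""
    below = sum(
        math.exp(-expected + k * math.log(expected) - math.lgamma(k + 1))
        for k in range(observed)
    )
    return max(0.0, 1.0 - below)


# Hypothetical counts, chosen only for illustration.
observed_variants = 25
expected_variants = 10.0  # from an assumed background mutation model

enrichment = observed_variants / expected_variants
p_value = poisson_upper_tail(observed_variants, expected_variants)
```

Subtracting the lower tail from 1 is adequate for a toy example like this one; for very small p-values a dedicated survival-function routine would be numerically safer.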
Affiliation(s)
- Na Zhu
  - Department of Pediatrics, Columbia University Irving Medical Center, 1150 St. Nicholas Avenue, Room 620, New York, NY, 10032, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Emilia M Swietlik
  - Department of Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Carrie L Welch
  - Department of Pediatrics, Columbia University Irving Medical Center, 1150 St. Nicholas Avenue, Room 620, New York, NY, 10032, USA
- Michael W Pauciulo
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Jacob J Hagen
  - Department of Pediatrics, Columbia University Irving Medical Center, 1150 St. Nicholas Avenue, Room 620, New York, NY, 10032, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Xueya Zhou
  - Department of Pediatrics, Columbia University Irving Medical Center, 1150 St. Nicholas Avenue, Room 620, New York, NY, 10032, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Yicheng Guo
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Divya Pandya
  - Department of Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Tobias Tilly
  - Department of Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Katie A Lutz
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Jennifer M Martin
  - Department of Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NIHR BioResource for Translational Research, Cambridge Biomedical Campus, Cambridge, UK
- Carmen M Treacy
  - Department of Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Erika B Rosenzweig
  - Department of Pediatrics, Columbia University Irving Medical Center, 1150 St. Nicholas Avenue, Room 620, New York, NY, 10032, USA
- Usha Krishnan
  - Department of Pediatrics, Columbia University Irving Medical Center, 1150 St. Nicholas Avenue, Room 620, New York, NY, 10032, USA
- Anna W Coleman
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Allan Lawrie
  - Department of Infection, Immunity and Cardiovascular Disease, University of Sheffield, Sheffield, UK
- Richard C Trembath
  - Department of Medical and Molecular Genetics, King's College London, London, UK
- Martin R Wilkins
  - National Heart & Lung Institute, Imperial College London, London, UK
- Nicholas W Morrell
  - Department of Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NIHR BioResource for Translational Research, Cambridge Biomedical Campus, Cambridge, UK
  - Addenbrooke's Hospital NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, UK
  - Royal Papworth Hospital NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, UK
- Yufeng Shen
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Stefan Gräf
  - Department of Medicine, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NIHR BioResource for Translational Research, Cambridge Biomedical Campus, Cambridge, UK
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- William C Nichols
  - Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Wendy K Chung
  - Department of Pediatrics, Columbia University Irving Medical Center, 1150 St. Nicholas Avenue, Room 620, New York, NY, 10032, USA
  - Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
9
Reilly J, Gallagher L, Leader G, Shen S. Coupling of autism genes to tissue-wide expression and dysfunction of synapse, calcium signalling and transcriptional regulation. PLoS One 2020; 15:e0242773. [PMID: 33338084 PMCID: PMC7748153 DOI: 10.1371/journal.pone.0242773] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/27/2020] [Accepted: 11/09/2020] [Indexed: 12/11/2022]
Abstract
Autism Spectrum Disorder (ASD) is a heterogeneous disorder that is often accompanied by many co-morbidities. Recent genetic studies have identified various pathways from hundreds of candidate risk genes with varying levels of association to ASD. However, it is unknown which pathways are specific to the core symptoms and which are shared by the co-morbidities. We hypothesised that critical ASD candidates should appear widely across different scoring systems, and that comorbidity pathways should be constituted by genes expressed in the relevant tissues. We analysed the Simons Foundation Autism Research Initiative (SFARI) database and four independently published scoring systems and identified 292 overlapping genes. We examined their mRNA expression using the Genotype-Tissue Expression (GTEx) database and validated protein expression levels using the Human Protein Atlas (HPA) dataset. This led to clustering of the overlapping ASD genes into two groups: one with 91 genes primarily expressed in the central nervous system (CNS geneset) and another with 201 genes expressed in both CNS and peripheral tissues (CNS+PT geneset). Bioinformatic analyses showed a high enrichment of CNS development and synaptic transmission in the CNS geneset, and an enrichment of synapse, chromatin remodelling, gene regulation and endocrine signalling in the CNS+PT geneset. Calcium signalling and the glutamatergic synapse were found to be highly interconnected among pathways in the combined geneset. Our analyses demonstrate that two-thirds of ASD genes are expressed beyond the brain, which may impact peripheral function and be involved in ASD co-morbidities; the relevant pathways may be explored for the treatment of ASD co-morbidities.
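The two steps described above, finding genes shared across scoring systems and then grouping them by where they are expressed, can be sketched with plain set operations. Everything in this sketch is hypothetical: the gene labels, the expression values, the `threshold` cutoff, and the rule that a gene must appear in every list are assumptions, since the abstract does not state the exact overlap criterion or expression cutoffs.

```python
from collections import Counter


def genes_in_at_least(gene_lists, minimum):
    """Return genes that appear in at least `minimum` of the given lists."""
    counts = Counter(g for genes in gene_lists.values() for g in set(genes))
    return {g for g, c in counts.items() if c >= minimum}


def split_by_expression(genes, expression, cns_tissue="brain", threshold=1.0):
    """Partition genes into CNS-only and CNS-plus-peripheral sets."""
    cns_only, cns_plus_peripheral = set(), set()
    for gene in genes:
        levels = expression[gene]
        if levels.get(cns_tissue, 0.0) < threshold:
            continue  # not expressed in the CNS in this toy table
        peripheral = any(v >= threshold for t, v in levels.items() if t != cns_tissue)
        (cns_plus_peripheral if peripheral else cns_only).add(gene)
    return cns_only, cns_plus_peripheral


# Placeholder inputs, not real data.
gene_lists = {
    "database": {"G1", "G2", "G3", "G4"},
    "score_1": {"G1", "G2", "G5"},
    "score_2": {"G1", "G2", "G3"},
}
expression = {
    "G1": {"brain": 30.0, "liver": 0.5},
    "G2": {"brain": 12.0, "liver": 9.0},
}

overlapping = genes_in_at_least(gene_lists, minimum=len(gene_lists))
cns_only, cns_plus_peripheral = split_by_expression(overlapping, expression)
```

Lowering `minimum` relaxes the overlap criterion, which trades a larger candidate set for weaker agreement between scoring systems.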
Affiliation(s)
- Jamie Reilly
  - Regenerative Medicine Institute, School of Medicine, Biomedical Science Building, National University of Ireland (NUI) Galway, Galway, Ireland
  - * E-mail: (JR); (SS)
- Louise Gallagher
  - Discipline of Psychiatry, School of Medicine, Trinity College Dublin, Dublin, Ireland
  - Trinity Translational Medicine Institute, Trinity Centre for Health Sciences, Trinity College Dublin, St. James’s Hospital, Dublin, Ireland
- Geraldine Leader
  - Irish Centre for Autism and Neurodevelopmental Research (ICAN), Department of Psychology, National University of Ireland (NUI) Galway, Galway, Ireland
- Sanbing Shen
  - Regenerative Medicine Institute, School of Medicine, Biomedical Science Building, National University of Ireland (NUI) Galway, Galway, Ireland
  - FutureNeuro Research Centre, Royal College of Surgeons in Ireland (RCSI), Dublin, Ireland
  - * E-mail: (JR); (SS)
10
Lin Y, Afshar S, Rajadhyaksha AM, Potash JB, Han S. A Machine Learning Approach to Predicting Autism Risk Genes: Validation of Known Genes and Discovery of New Candidates. Front Genet 2020; 11:500064. [PMID: 33133139 PMCID: PMC7513695 DOI: 10.3389/fgene.2020.500064] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 09/23/2019] [Accepted: 08/13/2020] [Indexed: 11/17/2022]
Abstract
Autism spectrum disorder (ASD) is a complex neurodevelopmental condition with a strong genetic basis. The role of de novo mutations in ASD has been well established, but the set of genes implicated to date is still far from complete. The current study employs a machine learning-based approach to predict ASD risk genes using features from spatiotemporal gene expression patterns in human brain, gene-level constraint metrics, and other gene variation features. The genes identified through our prediction model were enriched for independent sets of ASD risk genes, and tended to be down-expressed in ASD brains, especially in frontal and parietal cortex. The highest-ranked genes not only included those with strong prior evidence for involvement in ASD (for example, NBEA, HERC1, and TCF20), but also indicated potentially novel candidates, such as MYCBP2 and CAND1, which are involved in protein ubiquitination. We also showed that our method outperformed state-of-the-art scoring systems for ranking curated ASD candidate genes. Gene ontology enrichment analysis of our predicted risk genes revealed biological processes clearly relevant to ASD, including neuronal signaling, neurogenesis, and chromatin remodeling, but also highlighted other potential mechanisms that might underlie ASD, such as regulation of RNA alternative splicing and the ubiquitination pathway related to protein degradation. Our study demonstrates that human brain spatiotemporal gene expression patterns and gene-level constraint metrics can help predict ASD risk genes. Our gene ranking system provides a useful resource for prioritizing ASD candidate genes.
Affiliation(s)
- Ying Lin
  - Department of Industrial Engineering, University of Houston, Houston, TX, United States
- Shiva Afshar
  - Department of Industrial Engineering, University of Houston, Houston, TX, United States
- Anjali M Rajadhyaksha
  - Division of Pediatric Neurology, Department of Pediatrics, Weill Cornell Medicine, New York, NY, United States
  - Feil Family Brain & Mind Research Institute, Weill Cornell Medicine, New York, NY, United States
  - Weill Cornell Autism Research Program, Weill Cornell Medicine, New York, NY, United States
- James B Potash
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, United States
- Shizhong Han
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, United States
  - Lieber Institute for Brain Development, Baltimore, MD, United States
11
Norman U, Cicek AE. ST-Steiner: a spatio-temporal gene discovery algorithm. Bioinformatics 2020; 35:3433-3440. [PMID: 30759247 DOI: 10.1093/bioinformatics/btz110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/07/2018] [Revised: 01/16/2019] [Accepted: 02/12/2019] [Indexed: 12/12/2022]
Abstract
MOTIVATION Whole exome sequencing (WES) studies for autism spectrum disorder (ASD) could identify only around six dozen risk genes to date because the genetic architecture of the disorder is highly complex. To speed the gene discovery process up, a few network-based ASD gene discovery algorithms were proposed. Although these methods use static gene interaction networks, functional clustering of genes is bound to evolve during neurodevelopment and disruptions are likely to have a cascading effect on the future associations. Thus, approaches that disregard the dynamic nature of neurodevelopment are limited. RESULTS Here, we present a spatio-temporal gene discovery algorithm, which leverages information from evolving gene co-expression networks of neurodevelopment. The algorithm solves a prize-collecting Steiner forest-based problem on co-expression networks, adapted to model neurodevelopment and transfer information from precursor neurodevelopmental windows. The decisions made by the algorithm can be traced back, adding interpretability to the results. We apply the algorithm on ASD WES data of 3871 samples and identify risk clusters using BrainSpan co-expression networks of early- and mid-fetal periods. On an independent dataset, we show that incorporation of the temporal dimension increases the predictive power: predicted clusters are hit more and show higher enrichment in ASD-related functions compared with the state-of-the-art. AVAILABILITY AND IMPLEMENTATION The code is available at http://ciceklab.cs.bilkent.edu.tr/st-steiner. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Utku Norman
  - Computer Engineering Department, Bilkent University, Ankara, Turkey
- A Ercument Cicek
  - Computer Engineering Department, Bilkent University, Ankara, Turkey
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
|
12
|
Brueggeman L, Koomar T, Michaelson JJ. Forecasting risk gene discovery in autism with machine learning and genome-scale data. Sci Rep 2020; 10:4569. [PMID: 32165711 PMCID: PMC7067874 DOI: 10.1038/s41598-020-61288-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/07/2019] [Accepted: 02/10/2020] [Indexed: 01/09/2023] Open
Abstract
Genetics has been one of the most powerful windows into the biology of autism spectrum disorder (ASD). It is estimated that a thousand or more genes may confer risk for ASD when functionally perturbed; however, only around 100 genes currently have sufficient evidence to be considered true "autism risk genes". Massive genetic studies are currently underway, producing data to implicate additional genes. This approach, although necessary, is costly and slow-moving, making identification of putative ASD risk genes with existing data vital. Here, we approach autism risk gene discovery as a machine learning problem, rather than a genetic association problem, by using genome-scale data as predictors to identify new genes with similar properties to established autism risk genes. This ensemble method, forecASD, integrates brain gene expression, heterogeneous network data, and previous gene-level predictors of autism association into an ensemble classifier that yields a single score indexing evidence of each gene's involvement in the etiology of autism. We demonstrate that forecASD has substantially better performance than previous predictors of autism association in three independent trio-based sequencing studies. Studying forecASD-prioritized genes, we show that forecASD is a robust indicator of a gene's involvement in ASD etiology, with diverse applications to gene discovery, differential expression analysis, eQTL prioritization, and pathway enrichment analysis.
Affiliation(s)
- Leo Brueggeman
  - University of Iowa, Department of Psychiatry, Iowa City, IA, USA
  - University of Iowa, Interdisciplinary Genetics Program, Iowa City, IA, USA
  - University of Iowa, Medical Scientist Training Program, Iowa City, IA, USA
- Tanner Koomar
  - University of Iowa, Department of Psychiatry, Iowa City, IA, USA
  - University of Iowa, Interdisciplinary Genetics Program, Iowa City, IA, USA
- Jacob J Michaelson
  - University of Iowa, Department of Psychiatry, Iowa City, IA, USA
  - University of Iowa, Interdisciplinary Genetics Program, Iowa City, IA, USA
|
13
|
Zolotareva O, Kleine M. A Survey of Gene Prioritization Tools for Mendelian and Complex Human Diseases. J Integr Bioinform 2019; 16:/j/jib.ahead-of-print/jib-2018-0069/jib-2018-0069.xml. [PMID: 31494632 PMCID: PMC7074139 DOI: 10.1515/jib-2018-0069] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/06/2018] [Accepted: 07/12/2019] [Indexed: 12/16/2022] Open
Abstract
Modern high-throughput experiments provide us with numerous potential associations between genes and diseases. Experimental validation of all the discovered associations, let alone all the possible interactions between them, is time-consuming and expensive. To facilitate the discovery of causative genes, various approaches for prioritization of genes according to their relevance for a given disease have been developed. In this article, we explain the gene prioritization problem and provide an overview of computational tools for gene prioritization. From about a hundred published gene prioritization tools, we select and briefly describe the 14 most up-to-date and user-friendly. We also discuss the advantages and disadvantages of existing tools, the challenges of validating them, and directions for future research.
Affiliation(s)
- Olga Zolotareva
  - Bielefeld University, Faculty of Technology and Center for Biotechnology, International Research Training Group "Computational Methods for the Analysis of the Diversity and Dynamics of Genomes" and Genome Informatics, Universitätsstraße 25, Bielefeld, Germany
- Maren Kleine
  - Bielefeld University, Faculty of Technology, Bioinformatics/Medical Informatics Department, Universitätsstraße 25, Bielefeld, Germany
|
14
|
Thiffault I, Cadieux-Dion M, Farrow E, Caylor R, Miller N, Soden S, Saunders C. On the verge of diagnosis: Detection, reporting, and investigation of de novo variants in novel genes identified by clinical sequencing. Hum Mutat 2019; 39:1505-1516. [PMID: 30311385 DOI: 10.1002/humu.23646] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/30/2018] [Revised: 08/16/2018] [Accepted: 08/30/2018] [Indexed: 12/11/2022]
Abstract
The variable evidence supporting gene-disease associations contributes to the difficulty of accurate variant reporting in a clinical setting. An evidence-based scoring system for evaluating the clinical validity of gene-disease associations, proposed by ClinGen, considers experimental as well as genetic evidence. De novo variants are heavily weighted, given their overall rarity in the genome and their contribution to human disease; however, they are reported as "genes of unknown significance" in our center when there is insufficient evidence for the gene-disease assertion. We report a collection of 21 de novo variants in genes of unknown clinical significance ascertained via clinical testing, of which eight of 21 (38%) are predicted to cause loss of function. These genes were subjected to ClinGen scoring to assess the strength of gene-disease relationships. Using a cutoff for moderate high or strong, 10 of 21 genes now have sufficient evidence to qualify as likely pathogenic or pathogenic variants. Sharing such cases with phenotypic data is imperative to strengthen available genetic evidence to ultimately upgrade clinical validity classifications and facilitate accurate molecular diagnosis.
Affiliation(s)
- Isabelle Thiffault
  - Center for Pediatric Genomic Medicine, Children's Mercy Hospital, Kansas City, Missouri
  - Department of Pathology and Laboratory Medicine, Children's Mercy Hospitals, Kansas City, Missouri
  - University of Missouri-Kansas City School of Medicine, Kansas City, Missouri
- Maxime Cadieux-Dion
  - Center for Pediatric Genomic Medicine, Children's Mercy Hospital, Kansas City, Missouri
  - Department of Pathology and Laboratory Medicine, Children's Mercy Hospitals, Kansas City, Missouri
- Emily Farrow
  - Center for Pediatric Genomic Medicine, Children's Mercy Hospital, Kansas City, Missouri
  - University of Missouri-Kansas City School of Medicine, Kansas City, Missouri
  - Department of Pediatrics, Children's Mercy Hospitals, Kansas City, Missouri
- Raymond Caylor
  - Center for Pediatric Genomic Medicine, Children's Mercy Hospital, Kansas City, Missouri
  - Department of Pathology and Laboratory Medicine, Children's Mercy Hospitals, Kansas City, Missouri
- Neil Miller
  - Center for Pediatric Genomic Medicine, Children's Mercy Hospital, Kansas City, Missouri
- Sarah Soden
  - Center for Pediatric Genomic Medicine, Children's Mercy Hospital, Kansas City, Missouri
  - University of Missouri-Kansas City School of Medicine, Kansas City, Missouri
  - Department of Pediatrics, Children's Mercy Hospitals, Kansas City, Missouri
- Carol Saunders
  - Center for Pediatric Genomic Medicine, Children's Mercy Hospital, Kansas City, Missouri
  - Department of Pathology and Laboratory Medicine, Children's Mercy Hospitals, Kansas City, Missouri
  - University of Missouri-Kansas City School of Medicine, Kansas City, Missouri
|
15
|
Wang P, Zhao D, Lachman HM, Zheng D. Enriched expression of genes associated with autism spectrum disorders in human inhibitory neurons. Transl Psychiatry 2018; 8:13. [PMID: 29317598 PMCID: PMC5802446 DOI: 10.1038/s41398-017-0058-6] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 05/26/2017] [Revised: 08/13/2017] [Accepted: 10/09/2017] [Indexed: 01/07/2023] Open
Abstract
Autism spectrum disorder (ASD) is highly heritable but genetically heterogeneous. The affected neural circuits and cell types remain unclear and may vary at different developmental stages. By analyzing multiple sets of human single cell transcriptome profiles, we found that ASD candidates showed relatively enriched gene expression in neurons, especially in inhibitory neurons. ASD candidates were also more likely to be the hubs of the co-expression gene module that is highly expressed in inhibitory neurons, a feature not detected for excitatory neurons. In addition, we found that upregulated genes in multiple ASD cortex samples were enriched with genes highly expressed in inhibitory neurons, suggesting a potential increase of inhibitory neurons and an imbalance in the ratio between excitatory and inhibitory neurons in ASD brains. Furthermore, the downstream targets of several ASD candidates, such as CHD8, EHMT1 and SATB2, also displayed enriched expression in inhibitory neurons. Taken together, our analyses of single cell transcriptomic data suggest that inhibitory neurons may be a major neuron subtype affected by the disruption of ASD gene networks, providing single cell functional evidence to support the excitatory/inhibitory (E/I) imbalance hypothesis.
Affiliation(s)
- Ping Wang
  - Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
- Dejian Zhao
  - Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
- Herbert M Lachman
  - Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
  - Department of Psychiatry and Behavioral Sciences, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
  - Department of Neuroscience, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
  - Department of Medicine, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
- Deyou Zheng
  - Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
  - Department of Neuroscience, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
  - Department of Neurology, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY, USA
|
16
|
Dougherty JD, Yang C, Lake AM. Systems biology in the central nervous system: a brief perspective on essential recent advancements. Current Opinion in Systems Biology 2017; 3:67-76. [PMID: 29057378 PMCID: PMC5648337 DOI: 10.1016/j.coisb.2017.04.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/11/2023]
Abstract
As recent advances in human genetics have begun to more rapidly identify the individual genes contributing to risk of psychiatric disease, the spotlight now turns to understanding how disruption of these genes alters the brain, and thus behavior. Compared to other tissues, cellular complexity in the brain provides both a substantial challenge and a significant opportunity for systems biology approaches. Current methods are maturing that will allow for finally defining the 'parts list' for the functioning mouse and human brains, enabling new approaches to defining how the system goes awry in disorders of the CNS. However, the availability of tissue is certainly a challenge for systems biology of neuroscience, compared to systems biology of other tissues, where biopsy is feasible. This challenge is particularly notable for disorders caused by extremely rare genetic variants. Thus computational and systems biology approaches, as well as precise experimental models by way of genome editing, will play key roles in defining mechanisms for disorders, and their individual symptoms, across varied genetic etiologies. Here, we highlight recent progress in neurogenetics, postmortem genomics, cell-type specific profiling, and precision modeling toward defining mechanisms in disease.
Affiliation(s)
- Joseph D. Dougherty
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO 63110, USA
- Chengran Yang
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO 63110, USA
- Allison M. Lake
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO 63110, USA
|