1. Uguen K, Michaud JL, Génin E. Short Tandem Repeats in the era of next-generation sequencing: from historical loci to population databases. Eur J Hum Genet 2024:10.1038/s41431-024-01666-z. [PMID: 38982300 DOI: 10.1038/s41431-024-01666-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 06/20/2024] [Accepted: 06/27/2024] [Indexed: 07/11/2024]
Abstract
In this study, we explore the landscape of short tandem repeats (STRs) within the human genome through the lens of evolving technologies to detect genomic variations. STRs, which encompass approximately 3% of our genomic DNA, are crucial for understanding human genetic diversity, disease mechanisms, and evolutionary biology. The advent of high-throughput sequencing methods has revolutionized our ability to accurately map and analyze STRs, highlighting their significance in genetic disorders, forensic science, and population genetics. We review the current available methodologies for STR analysis, the challenges in interpreting STR variations across different populations, and the implications of STRs in medical genetics. Our findings underscore the urgent need for comprehensive STR databases that reflect the genetic diversity of global populations, facilitating the interpretation of STR data in clinical diagnostics, genetic research, and forensic applications. This work sets the stage for future studies aimed at harnessing STR variations to elucidate complex genetic traits and diseases, reinforcing the importance of integrating STRs into genetic research and clinical practice.
Affiliation(s)
- Kevin Uguen
- Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France.
- Service de Génétique Médicale et Biologie de la Reproduction, CHU de Brest, Brest, France.
- CHU Sainte-Justine Azrieli Research Centre, Montréal, QC, Canada.
- Jacques L Michaud
- CHU Sainte-Justine Azrieli Research Centre, Montréal, QC, Canada
- Department of Pediatrics, Université de Montréal, Montréal, QC, Canada
- Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
2. Rajan-Babu IS, Dolzhenko E, Eberle MA, Friedman JM. Sequence composition changes in short tandem repeats: heterogeneity, detection, mechanisms and clinical implications. Nat Rev Genet 2024; 25:476-499. [PMID: 38467784 DOI: 10.1038/s41576-024-00696-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/19/2024] [Indexed: 03/13/2024]
Abstract
Short tandem repeats (STRs) are a class of repetitive elements, composed of tandem arrays of 1-6 base pair sequence motifs, that comprise a substantial fraction of the human genome. STR expansions can cause a wide range of neurological and neuromuscular conditions, known as repeat expansion disorders, whose age of onset, severity, penetrance and/or clinical phenotype are influenced by the length of the repeats and their sequence composition. The presence of non-canonical motifs, depending on the type, frequency and position within the repeat tract, can alter clinical outcomes by modifying somatic and intergenerational repeat stability, gene expression and mutant transcript-mediated and/or protein-mediated toxicities. Here, we review the diverse structural conformations of repeat expansions, technological advances for the characterization of changes in sequence composition, their clinical correlations and the impact on disease mechanisms.
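The definition above (tandem arrays of 1-6 bp motifs) can be illustrated with a minimal sketch. It finds only perfect, uninterrupted repeats with a regular expression; the function name `find_strs` and the `min_copies` threshold are assumptions made for illustration and are not taken from any of the tools or papers listed here.

```python
import re


def find_strs(sequence, min_motif=1, max_motif=6, min_copies=3):
    """Return (start, motif, copies) for perfect tandem repeats.

    Hypothetical helper: it reports only exact repeats and ignores
    interruptions or non-canonical motifs inside the repeat tract.
    """
    # The lazy quantifier makes the regex engine try the shortest motif first,
    # so "CAGCAGCAG" is reported with motif "CAG" rather than "CAGCAG".
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Example call on a short synthetic sequence (0-based start positions).
print(find_strs("TTCAGCAGCAGCAGTT"))
```

Real genotyping tools work from sequencing reads and tolerate imperfect repeats, so this sketch is only a way to make the motif-length definition concrete.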
Affiliation(s)
- Indhu-Shree Rajan-Babu
- Department of Medical Genetics, The University of British Columbia, and Children's & Women's Hospital, Vancouver, British Columbia, Canada.
- Jan M Friedman
- Department of Medical Genetics, The University of British Columbia, and Children's & Women's Hospital, Vancouver, British Columbia, Canada
- BC Children's Hospital Research Institute, Vancouver, British Columbia, Canada
3. Uppili B, Faruq M. STRIDE-DB: a comprehensive database for exploration of instability and phenotypic relevance of short tandem repeats in the human genome. Database (Oxford) 2024; 2024:baae020. [PMID: 38602506 PMCID: PMC11008502 DOI: 10.1093/database/baae020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 11/10/2023] [Accepted: 03/07/2024] [Indexed: 04/12/2024]
Abstract
Short Tandem Repeats (STRs) are genetic markers made up of repeating DNA sequences. STR variation is widely studied in forensic analysis, population studies and genetic testing for a variety of neuromuscular disorders. Understanding polymorphic STR variation and its cause is crucial for deciphering genetic information and finding links to various disorders. In this paper, we present STRIDE-DB, a novel and unique platform to explore STR Instability and its Phenotypic Relevance, and a comprehensive database of STRs in the human genome. We utilized RepeatMasker to identify all the STRs in the human genome (hg19) and combined them with frequency data from the 1000 Genomes Project. STRIDE-DB, a user-friendly resource, plays a pivotal role in investigating the relationship between STR variation, instability and phenotype. By harnessing data from genome-wide association studies (GWAS), the ClinVar database, Alu loci, haploblocks in the genome and conservation of the STRs, it serves as an important tool for researchers exploring the variability of STRs in the human genome and its direct impact on phenotypes. STRIDE-DB has broad applicability and significance in research domains such as forensic science and repeat expansion disorders. Database URL: https://stridedb.igib.res.in.
Affiliation(s)
- Bharathram Uppili
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi 110007, India
- CSIR-HRDC Campus, Academy for Scientific and Innovative Research, Ghaziabad 201002, India
- Mohammed Faruq
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi 110007, India
4. Oketch JW, Wain LV, Hollox EJ. A comparison of software for analysis of rare and common short tandem repeat (STR) variation using human genome sequences from clinical and population-based samples. PLoS One 2024; 19:e0300545. [PMID: 38558075 PMCID: PMC10984476 DOI: 10.1371/journal.pone.0300545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 02/27/2024] [Indexed: 04/04/2024]
Abstract
Short tandem repeat (STR) variation is an often overlooked source of variation between genomes. STRs comprise about 3% of the human genome and are highly polymorphic. Some cause Mendelian disease, and others affect gene expression. Their contribution to common disease is not well-understood, but recent software tools designed to genotype STRs using short read sequencing data will help address this. Here, we compare software that genotypes common STRs and rarer STR expansions genome-wide, with the aim of applying them to population-scale genomes. By using the Genome-In-A-Bottle (GIAB) consortium and 1000 Genomes Project short-read sequencing data, we compare performance in terms of sequence length, depth, computing resources needed, genotyping accuracy and number of STRs genotyped. To ensure broad applicability of our findings, we also measure genotyping performance against a set of genomes from clinical samples with known STR expansions, and a set of STRs commonly used for forensic identification. We find that HipSTR, ExpansionHunter and GangSTR perform well in genotyping common STRs, including the CODIS 13 core STRs used for forensic analysis. GangSTR and ExpansionHunter outperform HipSTR for genotyping call rate and memory usage. ExpansionHunter denovo (EHdn), STRling and GangSTR outperformed STRetch for detecting expanded STRs, and EHdn and STRling used considerably less processor time compared to GangSTR. Analysis on shared genomic sequence data provided by the GIAB consortium allows future performance comparisons of new software approaches on a common set of data, facilitating comparisons and allowing researchers to choose the best software that fulfils their needs.
Affiliation(s)
- John W. Oketch
- Department of Genetics and Genome Biology, University of Leicester, Leicester, United Kingdom
- Louise V. Wain
- Department of Population Health Sciences, University of Leicester, Leicester, United Kingdom
- National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, United Kingdom
- Edward J. Hollox
- Department of Genetics and Genome Biology, University of Leicester, Leicester, United Kingdom
5. Xiong J, He Z, Wang L, Fan C, Chao J. DNA Origami-Enabled Gene Localization of Repetitive Sequences. J Am Chem Soc 2024; 146:6317-6325. [PMID: 38391280 DOI: 10.1021/jacs.4c00039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2024]
Abstract
Repetitive sequences, which make up over 50% of human DNA, have diverse applications in disease diagnosis, forensic identification, paternity testing, and population genetic analysis due to their crucial functions for gene regulation. However, representative detection technologies such as sequencing and fluorescence imaging suffer from time-consuming protocols, high cost, and inaccuracy of the position and order of repetitive sequences. Here, we develop a precise and cost-effective strategy that combines the high resolution of atomic force microscopy with the shape customizability of DNA origami for repetitive sequence-specific gene localization. "Tri-block" DNA structures were specifically designed to connect repetitive sequences to DNA origami tags, thereby revealing precise genetic information in terms of position and sequence for high-resolution and high-precision visualization of repetitive sequences. More importantly, we achieved the results of simultaneous detection of different DNA repetitive sequences on the gene template with a resolution of ∼6.5 nm (19 nt). This strategy is characterized by high efficiency, high precision, low operational complexity, and low labor/time costs, providing a powerful complement to sequencing technologies for gene localization of repetitive sequences.
Affiliation(s)
- Jinxin Xiong
- State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), National Synergetic Innovation Center for Advanced Materials (SICAM), Nanjing University of Posts and Telecommunications, 9 Wenyuan Road, Nanjing 210023, China
- Zhimei He
- State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), National Synergetic Innovation Center for Advanced Materials (SICAM), Nanjing University of Posts and Telecommunications, 9 Wenyuan Road, Nanjing 210023, China
- Lianhui Wang
- State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), National Synergetic Innovation Center for Advanced Materials (SICAM), Nanjing University of Posts and Telecommunications, 9 Wenyuan Road, Nanjing 210023, China
- Chunhai Fan
- School of Chemistry and Chemical Engineering, New Cornerstone Science Laboratory, Frontiers Science Center for Transformative Molecules, Zhangjiang Institute for Advanced Study and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai 200240, China
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acids Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200127, China
- Jie Chao
- State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), National Synergetic Innovation Center for Advanced Materials (SICAM), Nanjing University of Posts and Telecommunications, 9 Wenyuan Road, Nanjing 210023, China
6. Hannan AJ. Repeating themes of plastic genes and therapeutic schemes targeting the 'tandem repeatome'. Brain Commun 2024; 6:fcae047. [PMID: 38449715 PMCID: PMC10917440 DOI: 10.1093/braincomms/fcae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 01/24/2024] [Accepted: 02/17/2024] [Indexed: 03/08/2024]
Abstract
This scientific commentary refers to 'Modification of Huntington's disease by short tandem repeats' by Hong et al. (https://doi.org/10.1093/braincomms/fcae016) in Brain Communications.
Affiliation(s)
- Anthony J Hannan
- Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, Australia
- Department of Anatomy and Physiology, University of Melbourne, Parkville, Australia
7. Wu K, Bu F, Wu Y, Zhang G, Wang X, He S, Liu MF, Chen R, Yuan H. Exploring noncoding variants in genetic diseases: from detection to functional insights. J Genet Genomics 2024; 51:111-132. [PMID: 38181897 DOI: 10.1016/j.jgg.2024.01.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2023] [Revised: 12/26/2023] [Accepted: 01/01/2024] [Indexed: 01/07/2024]
Abstract
Previous studies on genetic diseases predominantly focused on protein-coding variations, overlooking the vast noncoding regions in the human genome. The development of high-throughput sequencing technologies and functional genomics tools has enabled the systematic identification of functional noncoding variants. These variants can impact gene expression, regulation, and chromatin conformation, thereby contributing to disease pathogenesis. Understanding the mechanisms that underlie the impact of noncoding variants on genetic diseases is indispensable for the development of precisely targeted therapies and the implementation of personalized medicine strategies. The intricacies of noncoding regions introduce a multitude of challenges and research opportunities. In this review, we introduce a spectrum of noncoding variants involved in genetic diseases, along with research strategies and advanced technologies for their precise identification and in-depth understanding of the complexity of the noncoding genome. We will delve into the research challenges and propose potential solutions for unraveling the genetic basis of rare and complex diseases.
Affiliation(s)
- Ke Wu
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, Sichuan 610041, China
- Fengxiao Bu
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, Sichuan 610041, China
- Yang Wu
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, Sichuan 610041, China
- Gen Zhang
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, Sichuan 610041, China
- Xin Wang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, Zhejiang 310024, China
- Shunmin He
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Mo-Fang Liu
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, Zhejiang 310024, China; State Key Laboratory of Molecular Biology, State Key Laboratory of Cell Biology, Shanghai Key Laboratory of Molecular Andrology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China.
- Runsheng Chen
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China.
- Huijun Yuan
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, Sichuan 610041, China.
8. Hong EP, Ramos EM, Aziz NA, Massey TH, McAllister B, Lobanov S, Jones L, Holmans P, Kwak S, Orth M, Ciosi M, Lomeikaite V, Monckton DG, Long JD, Lucente D, Wheeler VC, Gillis T, MacDonald ME, Sequeiros J, Gusella JF, Lee JM. Modification of Huntington's disease by short tandem repeats. Brain Commun 2024; 6:fcae016. [PMID: 38449714 PMCID: PMC10917446 DOI: 10.1093/braincomms/fcae016] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/26/2023] [Revised: 12/20/2023] [Accepted: 01/22/2024] [Indexed: 03/08/2024]
Abstract
Expansions of glutamine-coding CAG trinucleotide repeats cause a number of neurodegenerative diseases, including Huntington's disease and several spinocerebellar ataxias. In general, age-at-onset of the polyglutamine diseases is inversely correlated with the size of the respective inherited expanded CAG repeat. Expanded CAG repeats are also somatically unstable in certain tissues, and age-at-onset of Huntington's disease corrected for individual HTT CAG repeat length (i.e. residual age-at-onset) is modified by repeat instability-related DNA maintenance/repair genes, as demonstrated by recent genome-wide association studies. Modification of one polyglutamine disease (e.g. Huntington's disease) by the repeat length of another (e.g. ATXN3, CAG expansions in which cause spinocerebellar ataxia 3) has also been hypothesized. Consequently, we determined whether age-at-onset in Huntington's disease is modified by the CAG repeats of other polyglutamine disease genes. We found that the measured CAG repeat sizes of other polyglutamine disease genes were polymorphic in Huntington's disease participants but did not influence Huntington's disease age-at-onset. Additional analysis focusing specifically on ATXN3 in a larger sample set (n = 1388) confirmed the lack of association between Huntington's disease residual age-at-onset and ATXN3 CAG repeat length. Additionally, neither our Huntington's disease onset modifier genome-wide association studies single nucleotide polymorphism data nor imputed short tandem repeat data supported the involvement of other polyglutamine disease genes in modifying Huntington's disease. By contrast, our genome-wide association studies based on imputed short tandem repeats revealed significant modification signals for other genomic regions.
Together, our short tandem repeat genome-wide association studies show that modification of Huntington's disease is associated with short tandem repeats that do not involve other polyglutamine disease-causing genes, refining the landscape of Huntington's disease modification and highlighting the importance of rigorous data analysis, especially in genetic studies testing candidate modifiers.
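The notion of residual age-at-onset used in this abstract (onset corrected for HTT CAG repeat length) can be sketched as observed onset minus the onset predicted from CAG length. The abstract does not state the model used, so the log-linear fit below is an assumption, the function names are hypothetical, and the numbers are synthetic values chosen only to make the example run.

```python
import math


def fit_log_linear(cag_lengths, onset_ages):
    """Least-squares fit of ln(age at onset) = intercept + slope * CAG length.

    Assumed model for illustration only; not taken from the cited study.
    """
    n = len(cag_lengths)
    log_ages = [math.log(age) for age in onset_ages]
    mean_x = sum(cag_lengths) / n
    mean_y = sum(log_ages) / n
    sxx = sum((x - mean_x) ** 2 for x in cag_lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cag_lengths, log_ages))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residual_onset(cag, observed_age, intercept, slope):
    """Observed age at onset minus the CAG-predicted age at onset, in years."""
    predicted_age = math.exp(intercept + slope * cag)
    return observed_age - predicted_age


# Synthetic data (not real patient data): longer repeats, earlier onset.
synthetic_cag = [40, 42, 44, 46, 48]
synthetic_onset = [58.0, 49.0, 43.0, 37.0, 31.0]
b0, b1 = fit_log_linear(synthetic_cag, synthetic_onset)
# A positive residual means onset later than expected for that repeat length.
print(residual_onset(44, 50.0, b0, b1))
```

In a modifier analysis, residuals of this kind would serve as the phenotype tested against candidate variants, such as the CAG repeat length of another polyglutamine disease gene.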
Affiliation(s)
- Eun Pyo Hong
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
- Eliana Marisa Ramos
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- N Ahmad Aziz
- Population & Clinical Neuroepidemiology, German Center for Neurodegenerative Diseases, 53127 Bonn, Germany
- Department of Neurology, Faculty of Medicine, University of Bonn, Bonn D-53113, Germany
- Thomas H Massey
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
- Branduff McAllister
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
- Sergey Lobanov
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
- Lesley Jones
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
- Peter Holmans
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
- Seung Kwak
- Molecular System Biology, CHDI Foundation, Princeton, NJ 08540, USA
- Michael Orth
- University Hospital of Old Age Psychiatry and Psychotherapy, Bern University, CH-3000 Bern 60, Switzerland
- Marc Ciosi
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
- Vilija Lomeikaite
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
- Darren G Monckton
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
- Jeffrey D Long
- Department of Psychiatry, Carver College of Medicine and Department of Biostatistics, College of Public Health, University of Iowa, Iowa City, IA 52242, USA
- Diane Lucente
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Vanessa C Wheeler
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Tammy Gillis
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Marcy E MacDonald
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
- Jorge Sequeiros
- UnIGENe, IBMC—Institute for Molecular and Cell Biology, i3S—Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto 420-135, Portugal
- ICBAS School of Medicine and Biomedical Sciences, University of Porto, Porto 420-135, Portugal
- James F Gusella
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
- Jong-Min Lee
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
9. Manigbas CA, Jadhav B, Garg P, Shadrina M, Lee W, Martin-Trujillo A, Sharp AJ. A phenome-wide association study of tandem repeat variation in 168,554 individuals from the UK Biobank. medRxiv 2024:2024.01.22.24301630. [PMID: 38343850 PMCID: PMC10854328 DOI: 10.1101/2024.01.22.24301630] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/15/2024]
Abstract
Most genetic association studies focus on binary variants. To identify the effects of multi-allelic variation of tandem repeats (TRs) on human traits, we performed direct TR genotyping and phenome-wide association studies in 168,554 individuals from the UK Biobank, identifying 47 TRs showing causal associations with 73 traits. We replicated 23 of 31 (74%) of these causal associations in the All of Us cohort. While this set included several known repeat expansion disorders, novel associations we found were attributable to common polymorphic variation in TR length rather than rare expansions and include e.g. a coding polyhistidine motif in HRCT1 influencing risk of hypertension and a poly(CGC) in the 5'UTR of GNB2 influencing heart rate. Causal TRs were strongly enriched for associations with local gene expression and DNA methylation. Our study highlights the contribution of multi-allelic TRs to the "missing heritability" of the human genome.
10. Margoliash J, Fuchs S, Li Y, Zhang X, Massarat A, Goren A, Gymrek M. Polymorphic short tandem repeats make widespread contributions to blood and serum traits. Cell Genomics 2023; 3:100458. [PMID: 38116119 PMCID: PMC10726533 DOI: 10.1016/j.xgen.2023.100458] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/21/2022] [Revised: 09/09/2023] [Accepted: 11/07/2023] [Indexed: 12/21/2023]
Abstract
Short tandem repeats (STRs) are genomic regions consisting of repeated sequences of 1-6 bp in succession. Single-nucleotide polymorphism (SNP)-based genome-wide association studies (GWASs) do not fully capture STR effects. To study these effects, we imputed 445,720 STRs into genotype arrays from 408,153 White British UK Biobank participants and tested for association with 44 blood phenotypes. Using two fine-mapping methods, we identify 119 candidate causal STR-trait associations and estimate that STRs account for 5.2%-7.6% of causal variants identifiable from GWASs for these traits. These are among the strongest associations for multiple phenotypes, including a coding CTG repeat associated with apolipoprotein B levels, a promoter CGG repeat with platelet traits, and an intronic poly(A) repeat with mean platelet volume. Our study suggests that STRs make widespread contributions to complex traits, provides stringently selected candidate causal STRs, and demonstrates the need to consider a more complete view of genetic variation in GWASs.
Affiliation(s)
- Jonathan Margoliash
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Shai Fuchs
- Pediatric Endocrine and Diabetes Unit, Edmond and Lily Safra Children’s Hospital, Sheba Medical Center, Ramat Gan, Israel
- Yang Li
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Xuan Zhang
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Arya Massarat
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA 92093, USA
- Alon Goren
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Melissa Gymrek
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
11. Hannan AJ. Expanding horizons of tandem repeats in biology and medicine: Why 'genomic dark matter' matters. Emerg Top Life Sci 2023; 7:ETLS20230075. [PMID: 38088823 PMCID: PMC10754335 DOI: 10.1042/etls20230075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 12/30/2023]
Abstract
Approximately half of the human genome includes repetitive sequences, and these DNA sequences (as well as their transcribed repetitive RNA and translated amino-acid repeat sequences) are known as the repeatome. Within this repeatome there are a couple of million tandem repeats, dispersed throughout the genome. These tandem repeats have been estimated to constitute ∼8% of the entire human genome. These tandem repeats can be located throughout exons, introns and intergenic regions, thus potentially affecting the structure and function of tandemly repetitive DNA, RNA and protein sequences. Over more than three decades, more than 60 monogenic human disorders have been found to be caused by tandem-repeat mutations. These monogenic tandem-repeat disorders include Huntington's disease, a variety of ataxias, amyotrophic lateral sclerosis and frontotemporal dementia, as well as many other neurodegenerative diseases. Furthermore, tandem-repeat disorders can include fragile X syndrome, related fragile X disorders, as well as other neurological and psychiatric disorders. However, these monogenic tandem-repeat disorders, which were discovered via their dominant or recessive modes of inheritance, may represent the 'tip of the iceberg' with respect to tandem-repeat contributions to human disorders. A previous proposal that tandem repeats may contribute to the 'missing heritability' of various common polygenic human disorders has recently been supported by a variety of new evidence. This includes genome-wide studies that associate tandem-repeat mutations with autism, schizophrenia, Parkinson's disease and various types of cancers. In this article, I will discuss how tandem-repeat mutations and polymorphisms could contribute to a wide range of common disorders, along with some of the many major challenges of tandem-repeat biology and medicine. 
Finally, I will discuss the potential of tandem repeats to be therapeutically targeted, so as to prevent and treat an expanding range of human disorders.
Affiliation(s)
- Anthony J Hannan
- Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, Victoria 3010, Australia
- Department of Anatomy and Physiology, University of Melbourne, Parkville, Victoria 3010, Australia
12. Horton CA, Alexandari AM, Hayes MGB, Marklund E, Schaepe JM, Aditham AK, Shah N, Suzuki PH, Shrikumar A, Afek A, Greenleaf WJ, Gordân R, Zeitlinger J, Kundaje A, Fordyce PM. Short tandem repeats bind transcription factors to tune eukaryotic gene expression. Science 2023; 381:eadd1250. [PMID: 37733848 DOI: 10.1126/science.add1250] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 05/24/2022] [Accepted: 07/26/2023] [Indexed: 09/23/2023]
Abstract
Short tandem repeats (STRs) are enriched in eukaryotic cis-regulatory elements and alter gene expression, yet how they regulate transcription remains unknown. We found that STRs modulate transcription factor (TF)-DNA affinities and apparent on-rates by about 70-fold by directly binding TF DNA-binding domains, with energetic impacts exceeding many consensus motif mutations. STRs maximize the number of weakly preferred microstates near target sites, thereby increasing TF density, with impacts well predicted by statistical mechanics. Confirming that STRs also affect TF binding in cells, neural networks trained only on in vivo occupancies predicted effects identical to those observed in vitro. Approximately 90% of TFs preferentially bound STRs that need not resemble known motifs, providing a cis-regulatory mechanism to target TFs to genomic sites.
Affiliation(s)
- Connor A Horton
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Amr M Alexandari
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
| | - Michael G B Hayes
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Emil Marklund
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Julia M Schaepe
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Arjun K Aditham
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
| | - Nilay Shah
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Peter H Suzuki
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Avanti Shrikumar
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
| | - Ariel Afek
- Center for Genomic and Computational Biology, Duke University School of Medicine, Durham, NC 27710, USA
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710, USA
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
| | | | - Raluca Gordân
- Center for Genomic and Computational Biology, Duke University School of Medicine, Durham, NC 27710, USA
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710, USA
- Department of Computer Science, Duke University, Durham, NC 27708, USA
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC 27710, USA
| | - Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
- The University of Kansas Medical Center, Kansas City, KS 66103, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
| | - Polly M Fordyce
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94110, USA
| |
|
13
|
Shi Y, Niu Y, Zhang P, Luo H, Liu S, Zhang S, Wang J, Li Y, Liu X, Song T, Xu T, He S. Characterization of genome-wide STR variation in 6487 human genomes. Nat Commun 2023; 14:2092. [PMID: 37045857 PMCID: PMC10097659 DOI: 10.1038/s41467-023-37690-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 08/04/2022] [Accepted: 03/27/2023] [Indexed: 04/14/2023] Open
Abstract
Short tandem repeats (STRs) are abundant and highly mutagenic in the human genome. Many STR loci have been associated with a range of human genetic disorders. However, most population-scale studies on STR variation in humans have focused on European ancestry cohorts or are limited by sequencing depth. Here, we depicted a comprehensive map of 366,013 polymorphic STRs (pSTRs) constructed from 6487 deeply sequenced genomes, comprising 3983 Chinese samples (~31.5x, NyuWa) and 2504 samples from the 1000 Genomes Project (~33.3x, 1KGP). We found that STR mutations were affected by motif length, chromosome context and epigenetic features. We identified 3273 and 1117 pSTRs whose repeat numbers were associated with gene expression and 3'UTR alternative polyadenylation, respectively. We also implemented population analysis, investigated population differentiated signatures, and genotyped 60 known disease-causing STRs. Overall, this study further extends the scale of STR variation in humans and propels our understanding of the semantics of STRs.
Affiliation(s)
- Yirong Shi
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yiwei Niu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Peng Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Huaxia Luo
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Shuai Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Sijia Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Jiajia Wang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Yanyan Li
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Xinyue Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Tingrui Song
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, 250117, Shandong, China.
| | - Shunmin He
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.
| |
|
14
|
Martin-Trujillo A, Garg P, Patel N, Jadhav B, Sharp AJ. Genome-wide evaluation of the effect of short tandem repeat variation on local DNA methylation. Genome Res 2023; 33:184-196. [PMID: 36577521 PMCID: PMC10069470 DOI: 10.1101/gr.277057.122] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 06/23/2022] [Accepted: 12/19/2022] [Indexed: 12/30/2022]
Abstract
Short tandem repeats (STRs) contribute significantly to genetic diversity in humans, including disease-causing variation. Although the effect of STR variation on gene expression has been extensively assessed, their impact on epigenetics has been poorly studied and limited to specific genomic regions. Here, we investigated the hypothesis that some STRs act as independent regulators of local DNA methylation in the human genome and modify risk of common human traits. To address these questions, we first analyzed two independent data sets comprising PCR-free whole-genome sequencing (WGS) and genome-wide DNA methylation levels derived from whole-blood samples in 245 (discovery cohort) and 484 individuals (replication cohort). Using genotypes for 131,635 polymorphic STRs derived from WGS using HipSTR, we identified 11,870 STRs that associated with DNA methylation levels (mSTRs) of 11,774 CpGs (Bonferroni P < 0.001) in our discovery cohort, with 90% successfully replicating in our second cohort. Subsequently, through fine-mapping using CAVIAR we defined 585 of these mSTRs as the likely causal variants underlying the observed associations (fm-mSTRs) and linked a fraction of these to previously reported genome-wide association study signals, providing insights into the mechanisms underlying complex human traits. Furthermore, by integrating gene expression data, we observed that 12.5% of the tested fm-mSTRs also modulate expression levels of nearby genes, reinforcing their regulatory potential. Overall, our findings expand the catalog of functional sequence variants that affect genome regulation, highlighting the importance of incorporating STRs in future genetic association analysis and epigenetics data for the interpretation of trait-associated variants.
Affiliation(s)
- Alejandro Martin-Trujillo
- Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, New York, New York 10029, USA
| | - Paras Garg
- Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, New York, New York 10029, USA
| | - Nihir Patel
- Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, New York, New York 10029, USA
| | - Bharati Jadhav
- Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, New York, New York 10029, USA
| | - Andrew J Sharp
- Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, New York, New York 10029, USA
| |
|
15
|
Characterization of the microsatellite landscape provides insights into the evolutionary dynamics of the mammals based on the chromosome-level genomes. Gene X 2023; 851:146965. [DOI: 10.1016/j.gene.2022.146965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Revised: 09/18/2022] [Accepted: 10/11/2022] [Indexed: 11/27/2022] Open
|
16
|
de Boer I, Harder AVE, Ferrari MD, van den Maagdenberg AMJM, Terwindt GM. Genetics of migraine: Delineation of contemporary understanding of the genetic underpinning of migraine. HANDBOOK OF CLINICAL NEUROLOGY 2023; 198:85-103. [PMID: 38043973 DOI: 10.1016/b978-0-12-823356-6.00012-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
Migraine is a disabling episodic brain disorder with an increased familial relative risk, an increased concordance in monozygotic twins, and an estimated heritability of approximately 50%. Various genetic approaches have been applied to identify genetic factors conferring migraine risk. Initially, candidate gene association studies (CGAS) were performed that test DNA variants in genes prioritized based on presumed a priori knowledge of migraine pathophysiology. More recently, genome-wide association studies (GWAS) have been applied that test genetic variants, single-nucleotide polymorphisms (SNPs), in a hypothesis-free manner. To date, GWAS have identified ~40 genetic loci associated with migraine. New GWAS data, which are expected to come out soon, will reveal over 100 loci. Also, large-scale GWAS, which have appeared for many traits over the last decade, have enabled studying the overlap in genetic architecture between migraine and its comorbid disorders. Importantly, other genetic factors that cannot be identified by a GWAS approach also confer risk for migraine. First steps have been taken to determine the contribution of these mechanisms by investigating mitochondrial DNA and epigenetic mechanisms. In addition to typical epigenetic mechanisms, that is, DNA methylation and histone modifications, RNA-based mechanisms regulating gene silencing and activation have also recently received attention. Regardless, until now, most relevant genetic discoveries related to migraine still come from investigating monogenic syndromes with migraine as a prominent part of the phenotype. Experimental studies on these syndromes have expanded our knowledge on the mechanisms underlying migraine pathophysiology. It can be envisaged that when all (epi)genetic and phenotypic data on the common and rare forms of migraine are integrated, this will help to unravel the biological mechanisms for migraine, which will likely guide decision-making in clinical practice in the future.
Affiliation(s)
- Irene de Boer
- Department of Neurology, Leiden University Medical Center, Leiden, The Netherlands
| | - Aster V E Harder
- Department of Neurology, Leiden University Medical Center, Leiden, The Netherlands; Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Michel D Ferrari
- Department of Neurology, Leiden University Medical Center, Leiden, The Netherlands
| | - Arn M J M van den Maagdenberg
- Department of Neurology, Leiden University Medical Center, Leiden, The Netherlands; Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Gisela M Terwindt
- Department of Neurology, Leiden University Medical Center, Leiden, The Netherlands.
| |
|
17
|
Recurrent repeat expansions in human cancer genomes. Nature 2023; 613:96-102. [PMID: 36517591 DOI: 10.1038/s41586-022-05515-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 08/13/2021] [Accepted: 11/02/2022] [Indexed: 12/16/2022]
Abstract
Expansion of a single repetitive DNA sequence, termed a tandem repeat (TR), is known to cause more than 50 diseases [1,2]. However, repeat expansions are often not explored beyond neurological and neurodegenerative disorders. In some cancers, mutations accumulate in short tracts of TRs, a phenomenon termed microsatellite instability; however, larger repeat expansions have not been systematically analysed in cancer [3-8]. Here we identified TR expansions in 2,622 cancer genomes spanning 29 cancer types. In seven cancer types, we found 160 recurrent repeat expansions (rREs), most of which (155/160) were subtype specific. We found that rREs were non-uniformly distributed in the genome with enrichment near candidate cis-regulatory elements, suggesting a potential role in gene regulation. One rRE, a GAAA-repeat expansion, located near a regulatory element in the first intron of UGT2B7 was detected in 34% of renal cell carcinoma samples and was validated by long-read DNA sequencing. Moreover, in preliminary experiments, treating cells that harbour this rRE with a GAAA-targeting molecule led to a dose-dependent decrease in cell proliferation. Overall, our results suggest that rREs may be an important but unexplored source of genetic variation in human cancer, and we provide a comprehensive catalogue for further study.
|
18
|
Younger DS. Neurogenetic motor disorders. HANDBOOK OF CLINICAL NEUROLOGY 2023; 195:183-250. [PMID: 37562870 DOI: 10.1016/b978-0-323-98818-6.00003-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
Advances in the field of neurogenetics have practical applications in rapid diagnosis on blood and body fluids to extract DNA, obviating the need for invasive investigations. The ability to obtain a presymptomatic diagnosis through genetic screening and biomarkers can be a guide to life-saving disease-modifying therapy or enzyme replacement therapy to compensate for the deficient disease-causing enzyme. The benefits of a comprehensive neurogenetic evaluation extend to family members in whom identification of the causal gene defect ensures carrier detection and at-risk counseling for future generations. This chapter explores the many facets of the neurogenetic evaluation in adult and pediatric motor disorders as a primer for later chapters in this volume and a roadmap for the future applications of genetics in neurology.
Affiliation(s)
- David S Younger
- Department of Clinical Medicine and Neuroscience, CUNY School of Medicine, New York, NY, United States; Department of Medicine, Section of Internal Medicine and Neurology, White Plains Hospital, White Plains, NY, United States.
| |
|
19
|
Wendt FR, Pathak GA, Polimanti R. Phenome-wide association study of loci harboring de novo tandem repeat mutations in UK Biobank exomes. Nat Commun 2022; 13:7682. [PMID: 36509785 PMCID: PMC9744822 DOI: 10.1038/s41467-022-35423-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/04/2022] [Accepted: 12/02/2022] [Indexed: 12/15/2022] Open
Abstract
When present in coding regions, tandem repeats (TRs) may have large effects on protein structure and function contributing to health and disease. We use a family-based design to identify de novo TRs and assess their impact at the population level in 148,607 European ancestry participants from the UK Biobank. The 427 loci with de novo TR mutations are enriched for targets of microRNA-184 (21.1-fold, P = 4.30 × 10⁻⁵, FDR = 9.50 × 10⁻³). There are 123 TR-phenotype associations with posterior probabilities > 0.95. These relate to body structure, cognition, and cardiovascular, metabolic, psychiatric, and respiratory outcomes. We report several loci with large likely causal effects on tissue microstructure, including the FAN1-[TG]N and carotid intima-media thickness (mean thickness: beta = 5.22, P = 1.22 × 10⁻⁶, FDR = 0.004; maximum thickness: beta = 6.44, P = 1.12 × 10⁻⁶, FDR = 0.004). Two exonic repeats FNBP4-[GGT]N and BTN2A1-[CCT]N alter protein structure. In this work, we contribute clear and testable hypotheses of dose-dependent TR implications linking genetic variation and protein structure with health and disease outcomes.
Affiliation(s)
- Frank R Wendt
- Department of Anthropology, University of Toronto, Mississauga, ON, Canada.
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
- Forensic Science Program, University of Toronto, Mississauga, ON, Canada.
- Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA.
- VA CT Healthcare System, West Haven, CT, USA.
| | - Gita A Pathak
- Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
- VA CT Healthcare System, West Haven, CT, USA
| | - Renato Polimanti
- Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
- VA CT Healthcare System, West Haven, CT, USA
| |
|
20
|
Pholtaisong J, Chaiyaratana N, Aporntewan C, Mutirangura A. Mononucleotide A-repeats may Play a Regulatory Role in Endothermic Housekeeping Genes. Evol Bioinform Online 2022; 18:11769343221110656. [PMID: 35860694 PMCID: PMC9290108 DOI: 10.1177/11769343221110656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2021] [Accepted: 07/02/2022] [Indexed: 11/24/2022] Open
Abstract
Background: Coding and non-coding short tandem repeats (STRs) facilitate a great diversity of phenotypic traits. The imbalance of mononucleotide A-repeats around transcription start sites (TSSs) was found in 3 mammals: H. sapiens, M. musculus, and R. norvegicus. Principal Findings: We found that the imbalance pattern originated in some vertebrates. A similar pattern was observed in mammals and birds, but not in amphibians and reptiles. We proposed that the enriched A-repeats upstream of TSSs is a novel hallmark of endotherms or warm-blooded animals. Gene ontology analysis indicates that the primary function of upstream A-repeats involves metabolism, cellular transportation, and sensory perception (smell and chemical stimulus) through housekeeping genes. Conclusions: Upstream A-repeats may play a regulatory role in the metabolic process of endothermic animals.
Affiliation(s)
- Jatuphol Pholtaisong
- Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Pathumwan, Bangkok, Thailand
| | - Nachol Chaiyaratana
- Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, Bangkok, Thailand; Division of Medical Genetics Research and Laboratory, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
| | - Chatchawit Aporntewan
- Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Pathumwan, Bangkok, Thailand; Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Pathumwan, Bangkok, Thailand; Omics Sciences and Bioinformatics Center, Chulalongkorn University, Pathumwan, Bangkok, Thailand
| | - Apiwat Mutirangura
- Center of Excellence in Molecular Genetics of Cancer and Human Diseases, Department of Anatomy, Faculty of Medicine, Chulalongkorn University, Pathumwan, Bangkok, Thailand
| |
|
21
|
Mei H, Zhao T, Dong Z, Han J, Xu B, Chen R, Zhang J, Zhang J, Hu Y, Zhang T, Fang L. Population-Scale Polymorphic Short Tandem Repeat Provides an Alternative Strategy for Allele Mining in Cotton. FRONTIERS IN PLANT SCIENCE 2022; 13:916830. [PMID: 35599867 PMCID: PMC9120961 DOI: 10.3389/fpls.2022.916830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2022] [Accepted: 04/20/2022] [Indexed: 06/15/2023]
Abstract
Short tandem repeats (STRs), which vary in size due to featuring variable numbers of repeat units, are present throughout most eukaryotic genomes. To date, few population-scale studies identifying STRs have been reported for crops. Here, we constructed a high-density polymorphic STR map by investigating polymorphic STRs from 911 Gossypium hirsutum accessions. In total, we identified 556,426 polymorphic STRs with an average length of 21.1 bp, of which 69.08% were biallelic. Moreover, 7,718 (1.39%) were identified in the exons of 6,021 genes, which were significantly enriched in transcription, ribosome biogenesis, and signal transduction. Only 5.88% of those exonic STRs altered open reading frames, of which 97.16% were trinucleotide. An alternative strategy, STR-GWAS analysis, revealed that 824 STRs were significantly associated with agronomic traits, including 491 novel alleles that were undetectable by previous SNP-GWAS methods. For instance, a novel polymorphic STR consisting of GAACCA repeats was identified in GH_D06G1697, with its (GAACCA)5 allele increasing fiber length by 1.96-4.83% relative to the (GAACCA)4 allele. The database CottonSTRDB was further developed to facilitate use of STR datasets in breeding programs. Our study provides functional roles for STRs in influencing complex traits, an alternative strategy, STR-GWAS, for allele mining, and a database serving the cotton community as a valuable resource.
Affiliation(s)
- Huan Mei
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
| | - Ting Zhao
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
| | - Zeyu Dong
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
| | - Jin Han
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
| | - Biyu Xu
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
| | - Rui Chen
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
| | - Jun Zhang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
| | - Juncheng Zhang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
| | - Yan Hu
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Hainan Institute of Zhejiang University, Sanya, China
| | - Tianzhen Zhang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Hainan Institute of Zhejiang University, Sanya, China
| | - Lei Fang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Hainan Institute of Zhejiang University, Sanya, China
| |
|
22
|
Liu Z, Zhao G, Xiao Y, Zeng S, Yuan Y, Zhou X, Fang Z, He R, Li B, Zhao Y, Pan H, Wang Y, Yu G, Peng IF, Wang D, Meng Q, Xu Q, Sun Q, Yan X, Shen L, Jiang H, Xia K, Wang J, Guo J, Liang F, Li J, Tang B. Profiling the Genome-Wide Landscape of Short Tandem Repeats by Long-Read Sequencing. Front Genet 2022; 13:810595. [PMID: 35601492 PMCID: PMC9117641 DOI: 10.3389/fgene.2022.810595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Accepted: 03/30/2022] [Indexed: 11/23/2022] Open
Abstract
Background: Short tandem repeats (STRs) are highly variable elements that play a pivotal role in multiple genetic diseases and the regulation of gene expression. Long-read sequencing (LRS) offers a potential solution to genome-wide STR analysis. However, characterizing STRs in human genomes using LRS on a large population scale has not been reported. Methods: We conducted a large LRS-based STR analysis in 193 unrelated samples of the Chinese population and performed genome-wide profiling of STR variation in the human genome. The repeat dynamic index (RDI) was introduced to evaluate the variability of STRs. We sourced expression data from the Genotype-Tissue Expression (GTEx) project to explore the tissue specificity of genes related to highly variable STRs. Enrichment analyses were also conducted to identify potential functional roles of the highly variable STRs. Results: This study reports a large-scale analysis of human STR variation by LRS and offers a reference STR database based on the LRS dataset. We found that the disease-associated STRs (dSTRs) and STRs associated with the expression of nearby genes (eSTRs) were highly variable in the general population. Moreover, tissue-specific expression analysis showed that genes related to those highly variable STRs presented the highest expression level in brain tissues, and pathway enrichment analysis found that those STRs are involved in synaptic function-related pathways. Conclusion: Our study profiled the genome-wide landscape of STRs using LRS and highlighted the highly variable STRs in the human genome, providing a valuable resource for studying the role of STRs in human disease and complex traits.
Affiliation(s)
- Zhenhua Liu
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Guihu Zhao
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | | | - Sheng Zeng
- Department of Geriatrics, The Second Xiangya Hospital, Central South University, Changsha, China
| | - Yanchun Yuan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Xun Zhou
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Zhenghuan Fang
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Runcheng He
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Bin Li
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Yuwen Zhao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Hongxu Pan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Yige Wang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | | | | | | | - Qingtuan Meng
- Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital of University of South China, Hengyang, China
| | - Qian Xu
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Qiying Sun
- Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
| | - Xinxiang Yan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Lu Shen
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Hong Jiang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
| | - Kun Xia
- Centre for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
| | - Junling Wang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Jifeng Guo
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Fan Liang
- GrandOmics Biosciences, Beijing, China
- *Correspondence: Beisha Tang; Jinchen Li; Fan Liang
| | - Jinchen Li
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
- Centre for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- *Correspondence: Beisha Tang; Jinchen Li; Fan Liang
| | - Beisha Tang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital of University of South China, Hengyang, China
- Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
- *Correspondence: Beisha Tang; Jinchen Li; Fan Liang
| |
|
23
|
Barbé L, Finkbeiner S. Genetic and Epigenetic Interplay Define Disease Onset and Severity in Repeat Diseases. Front Aging Neurosci 2022; 14:750629. [PMID: 35592702 PMCID: PMC9110800 DOI: 10.3389/fnagi.2022.750629] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/30/2021] [Accepted: 03/01/2022] [Indexed: 11/13/2022]
Abstract
Repeat diseases, such as fragile X syndrome, myotonic dystrophy, Friedreich ataxia, Huntington disease, spinocerebellar ataxias, and some forms of amyotrophic lateral sclerosis, are caused by repetitive DNA sequences that are expanded in affected individuals. The age at which an individual begins to experience symptoms, and the severity of disease, are partially determined by the size of the repeat. However, the epigenetic state of the area in and around the repeat also plays an important role in determining the age of disease onset and the rate of disease progression. Many repeat diseases share a common epigenetic pattern of increased methylation at CpG islands near the repeat region. CpG islands are CG-rich sequences that are tightly regulated by methylation and are often found at gene enhancer or insulator elements in the genome. Methylation of CpG islands can inhibit binding of the transcriptional regulator CTCF, resulting in a closed chromatin state and gene downregulation. The downregulation of these genes leads to some disease-specific symptoms. Additionally, a genetic and epigenetic interplay is suggested by an effect of methylation on repeat instability, a hallmark of large repeat expansions that leads to increasing disease severity in successive generations. In this review, we will discuss the common epigenetic patterns shared across repeat diseases, how the genetics and epigenetics interact, and how this could be involved in disease manifestation. We also discuss the currently available stem cell and mouse models, which frequently do not recapitulate epigenetic patterns observed in human disease, and propose alternative strategies to study the role of epigenetics in repeat diseases.
Affiliation(s)
- Lise Barbé
- Center for Systems and Therapeutics, Gladstone Institutes, San Francisco, CA, United States
- Department of Neurology, University of California, San Francisco, San Francisco, CA, United States
- Department of Physiology, University of California, San Francisco, San Francisco, CA, United States
| | - Steve Finkbeiner
- Center for Systems and Therapeutics, Gladstone Institutes, San Francisco, CA, United States
- Department of Neurology, University of California, San Francisco, San Francisco, CA, United States
- Department of Physiology, University of California, San Francisco, San Francisco, CA, United States
- *Correspondence: Steve Finkbeiner
| |
|
24
|
Gall-Duncan T, Sato N, Yuen RKC, Pearson CE. Advancing genomic technologies and clinical awareness accelerates discovery of disease-associated tandem repeat sequences. Genome Res 2022; 32:1-27. [PMID: 34965938 PMCID: PMC8744678 DOI: 10.1101/gr.269530.120] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 07/29/2020] [Accepted: 11/29/2021] [Indexed: 11/25/2022]
Abstract
Expansions of gene-specific DNA tandem repeats (TRs), first described in 1991 as a disease-causing mutation in humans, are now known to cause >60 phenotypes, not just disease, and not only in humans. TRs are a common form of genetic variation with biological consequences, observed, so far, in humans, dogs, plants, oysters, and yeast. Repeat diseases show atypical clinical features, genetic anticipation, and multiple and partially penetrant phenotypes among family members. Discovery of disease-causing repeat expansion loci accelerated through technological advances in DNA sequencing and computational analyses. Between 2019 and 2021, 17 new disease-causing TR expansions were reported, totaling 63 TR loci (>69 diseases), with a likelihood of more discoveries, and in more organisms. Recent and historical lessons reveal that properly assessed clinical presentations, coupled with genetic and biological awareness, can guide discovery of disease-causing unstable TRs. We highlight critical but underrecognized aspects of TR mutations. Repeat motifs may not be present in current reference genomes but will be in forthcoming gapless long-read references. Repeat motif size can be a single nucleotide to kilobases/unit. At a given locus, repeat motif sequence purity can vary with consequence. Pathogenic repeats can be "insertions" within nonpathogenic TRs. Expansions, contractions, and somatic length variations of TRs can have clinical/biological consequences. TR instabilities occur in humans and other organisms. TRs can be epigenetically modified and/or chromosomal fragile sites. We discuss the expanding field of disease-associated TR instabilities, highlighting prospects, clinical and genetic clues, tools, and challenges for further discoveries of disease-causing TR instabilities and understanding their biological and pathological impacts, a vista that is about to expand.
Affiliation(s)
- Terence Gall-Duncan
- Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1L7, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
| | - Nozomu Sato
- Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1L7, Canada
| | - Ryan K C Yuen
- Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1L7, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
| | - Christopher E Pearson
- Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1L7, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
| |
|
25
|
Xiao X, Zhang CY, Zhang Z, Hu Z, Li M, Li T. Revisiting tandem repeats in psychiatric disorders from perspectives of genetics, physiology, and brain evolution. Mol Psychiatry 2022; 27:466-475. [PMID: 34650204 DOI: 10.1038/s41380-021-01329-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/22/2021] [Revised: 09/16/2021] [Accepted: 09/28/2021] [Indexed: 01/28/2023]
Abstract
Genome-wide association studies (GWASs) have revealed substantial genetic components comprised of single nucleotide polymorphisms (SNPs) in the heritable risk of psychiatric disorders. However, genetic risk factors not covered by GWAS also play pivotal roles in these illnesses. Tandem repeats, which are likely functional but frequently overlooked by GWAS, may account for an important proportion of the "missing heritability" of psychiatric disorders. Despite difficulties in characterizing and quantifying tandem repeats in the genome, studies have been carried out in an attempt to describe the impact of tandem repeats on gene regulation and human phenotypes. In this review, we introduce recent research progress regarding the genomic distribution and regulatory mechanisms of tandem repeats. We also summarize the current knowledge of the genetic architecture and biological underpinnings of psychiatric disorders brought by studies of tandem repeats. These findings suggest that tandem repeats, in candidate psychiatric risk genes or in different levels of linkage disequilibrium (LD) with psychiatric GWAS SNPs and haplotypes, may modulate biological phenotypes related to psychiatric disorders (e.g., cognitive function and brain physiology) by regulating alternative splicing, promoter activity, enhancer activity, and other processes. In addition, many tandem repeats have undergone tight natural selection in the human lineage, and likely play crucial roles in human brain evolution. Taken together, the putative roles of tandem repeats in the pathogenesis of psychiatric disorders are strongly implicated, and using examples from the previous literature, we wish to call for further attention to tandem repeats in the post-GWAS era of psychiatric disorders.
Affiliation(s)
- Xiao Xiao
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Chu-Yi Zhang
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Zhuohua Zhang
- Institute of Molecular Precision Medicine and Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Zhonghua Hu
- Institute of Molecular Precision Medicine and Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
- Department of Critical Care Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, Hunan, China
- Eye Center of Xiangya Hospital and Hunan Key Laboratory of Ophthalmology, Central South University, Changsha, Hunan, China
- National Clinical Research Center on Mental Disorders, Changsha, Hunan, China
| | - Ming Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
- KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Tao Li
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Guangzhou, China
| |
|
26
|
Genome-wide tandem repeat expansions contribute to schizophrenia risk. Mol Psychiatry 2022; 27:3692-3698. [PMID: 35546631 PMCID: PMC9708556 DOI: 10.1038/s41380-022-01575-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 12/30/2021] [Revised: 04/07/2022] [Accepted: 04/12/2022] [Indexed: 02/08/2023]
Abstract
Tandem repeat expansions (TREs) can cause neurological diseases but their impact in schizophrenia is unclear. Here we analyzed genome sequences of adults with schizophrenia and found that they have a higher burden of TREs that are near exons and rare in the general population, compared with non-psychiatric controls. These TREs are disproportionately found at loci known to be associated with schizophrenia from genome-wide association studies, in individuals with clinically-relevant genetic variants at other schizophrenia loci, and in families where multiple individuals have schizophrenia. We showed that rare TREs in schizophrenia may impact synaptic functions by disrupting the splicing process of their associated genes in a loss-of-function manner. Our findings support the involvement of genome-wide rare TREs in the polygenic nature of schizophrenia.
|
27
|
Schröder C, Horsthemke B, Depienne C. GC-rich repeat expansions: associated disorders and mechanisms. MED GENET-BERLIN 2021; 33:325-335. [PMID: 38835438 PMCID: PMC11006399 DOI: 10.1515/medgen-2021-2099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/29/2021] [Accepted: 11/12/2021] [Indexed: 06/06/2024]
Abstract
Noncoding repeat expansions are a well-known cause of genetic disorders mainly affecting the central nervous system. Missed by most standard technologies used in routine diagnosis, pathogenic noncoding repeat expansions have to be searched for using specific techniques such as repeat-primed PCR or specific bioinformatics tools applied to genome data, such as ExpansionHunter. In this review, we focus on GC-rich repeat expansions, which represent at least one third of all noncoding repeat expansions described so far. GC-rich expansions are mainly located in regulatory regions (promoter, 5' untranslated region, first intron) of genes and can lead to either a toxic gain-of-function mediated by RNA toxicity and/or repeat-associated non-AUG (RAN) translation, or a loss-of-function of the associated gene, depending on their size and their methylation status. We herein review the clinical and molecular characteristics of disorders associated with these difficult-to-detect expansions.
Affiliation(s)
- Christopher Schröder
- Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| | - Bernhard Horsthemke
- Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| | - Christel Depienne
- Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| |
|
28
|
van der Plas E, Schultz JL, Nopoulos PC. The Neurodevelopmental Hypothesis of Huntington's Disease. J Huntingtons Dis 2021; 9:217-229. [PMID: 32925079 PMCID: PMC7683043 DOI: 10.3233/jhd-200394] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/26/2022]
Abstract
The current dogma of HD pathoetiology posits it is a degenerative disease affecting primarily the striatum, caused by a gain of function (toxicity) of mutant huntingtin (mHTT) that kills neurons. However, a growing body of evidence supports an alternative theory in which loss of function may also influence the pathology. This theory is predicated on the notion that HTT is known to be a vital gene for brain development. mHTT is expressed throughout life and could conceivably have deleterious effects on brain development. The end event in the disease is, of course, neurodegeneration; however, the process by which that occurs may be rooted in the pathophysiology of aberrant development. To date, there have been multiple studies evaluating molecular and cellular mechanisms of abnormal development in HD, as well as studies investigating abnormal brain development in HD animal models. However, direct study of how mHTT could affect neurodevelopment in humans has not been approached until recent years. The current review will focus on the most recent findings of a unique study of children at risk for HD, the Kids-HD study. This study evaluates brain structure and function in children ages 6-18 years old who are at risk for HD (have a parent or grandparent with HD).
Affiliation(s)
- Ellen van der Plas
- University of Iowa Carver College of Medicine, Department of Psychiatry, Iowa City, IA, USA
| | - Jordan L Schultz
- University of Iowa Carver College of Medicine, Department of Psychiatry, Iowa City, IA, USA
| | - Peg C Nopoulos
- University of Iowa Carver College of Medicine, Department of Psychiatry, Iowa City, IA, USA
| |
|
29
|
Lu TY, Chaisson MJP. Profiling variable-number tandem repeat variation across populations using repeat-pangenome graphs. Nat Commun 2021; 12:4250. [PMID: 34253730 PMCID: PMC8275641 DOI: 10.1038/s41467-021-24378-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 12/24/2020] [Accepted: 06/10/2021] [Indexed: 12/11/2022]
Abstract
Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein coding sequences and associations with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read sequencing because the traditional approach of mapping to the human reference is less effective for repetitive and divergent sequences. In this work, we solve VNTR mapping for short reads with a repeat-pangenome graph (RPGG), a data structure that encodes both the population diversity and repeat structure of VNTR loci from multiple haplotype-resolved assemblies. We develop software to build an RPGG, and use the RPGG to estimate VNTR composition with short reads. We use this to discover VNTRs with length stratified by continental population, and expression quantitative trait loci, indicating that RPGG analysis of VNTRs will be critical for future studies of diversity and disease.
Affiliation(s)
- Tsung-Yu Lu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA.
| |
|
30
|
Schultz JL, Saft C, Nopoulos PC. Association of CAG Repeat Length in the Huntington Gene With Cognitive Performance in Young Adults. Neurology 2021; 96:e2407-e2413. [PMID: 33692166 PMCID: PMC10508647 DOI: 10.1212/wnl.0000000000011823] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/02/2020] [Accepted: 02/10/2021] [Indexed: 11/15/2022]
Abstract
OBJECTIVE: To investigate the relationships between CAG repeat length in the huntingtin gene and cognitive performance in participants above and below the disease threshold for Huntington disease (HD), we performed a cross-sectional analysis of the Enroll-HD database. METHODS: We analyzed data from young, developing adults (≤30 years of age) without a history of depression, apathy, or cognitive deficits. We included participants with and without the gene expansion (CAG ≥36) for HD. All participants had to have a Total Functional Capacity Score of 13, a diagnostic confidence level of zero, and a total motor score of <10 and had to be >28.6 years from their predicted motor onset. We performed regression analyses to investigate the nonlinear relationship between CAG repeat length and various cognitive measures controlling for age, sex, and education level. RESULTS: There were significant positive relationships between CAG repeat length and the Symbol Digit Modalities, Stroop Color Naming, and Stroop Interference test scores. There were significant negative relationships between CAG repeat length and scores on Parts A and B of the Trails Making Test (p < 0.05), indicating that longer CAG repeat lengths were associated with better performance. DISCUSSION: An increasing number of CAG repeats in the huntingtin gene below disease threshold and low pathologic CAG ranges were associated with some improvements in cognitive performance. These findings outline the relationship between CAG repeats within the huntingtin gene and cognitive development. CLASSIFICATION OF EVIDENCE: This study provides Class IV evidence that CAG repeat length is positively associated with cognitive function across a spectrum of CAG repeat lengths.
Affiliation(s)
- Jordan L Schultz
- From the Departments of Psychiatry (J.L.S., P.C.N.) and Neurology (J.L.S., P.C.N.), Carver College of Medicine at the University of Iowa; Division of Pharmacy Practice and Sciences (J.L.S.), University of Iowa College of Pharmacy, Iowa City; Department of Neurology (C.S.), Huntington Center NRW, Ruhr-University Bochum, St Josef-Hospital, Bochum, Germany; and Stead Family Children's Hospital at the University of Iowa (P.C.N.), Iowa City.
| | - Carsten Saft
- From the Departments of Psychiatry (J.L.S., P.C.N.) and Neurology (J.L.S., P.C.N.), Carver College of Medicine at the University of Iowa; Division of Pharmacy Practice and Sciences (J.L.S.), University of Iowa College of Pharmacy, Iowa City; Department of Neurology (C.S.), Huntington Center NRW, Ruhr-University Bochum, St Josef-Hospital, Bochum, Germany; and Stead Family Children's Hospital at the University of Iowa (P.C.N.), Iowa City
| | - Peggy C Nopoulos
- From the Departments of Psychiatry (J.L.S., P.C.N.) and Neurology (J.L.S., P.C.N.), Carver College of Medicine at the University of Iowa; Division of Pharmacy Practice and Sciences (J.L.S.), University of Iowa College of Pharmacy, Iowa City; Department of Neurology (C.S.), Huntington Center NRW, Ruhr-University Bochum, St Josef-Hospital, Bochum, Germany; and Stead Family Children's Hospital at the University of Iowa (P.C.N.), Iowa City
| |
|
31
|
Bakhtiari M, Park J, Ding YC, Shleizer-Burko S, Neuhausen SL, Halldórsson BV, Stefánsson K, Gymrek M, Bafna V. Variable number tandem repeats mediate the expression of proximal genes. Nat Commun 2021; 12:2075. [PMID: 33824302 PMCID: PMC8024321 DOI: 10.1038/s41467-021-22206-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 05/13/2020] [Accepted: 02/17/2021] [Indexed: 12/12/2022]
Abstract
Variable number tandem repeats (VNTRs) account for significant genetic variation in many organisms. In humans, VNTRs have been implicated in both Mendelian and complex disorders, but are largely ignored by genomic pipelines due to the complexity of genotyping and the computational expense. We describe adVNTR-NN, a method that uses shallow neural networks to genotype a VNTR in 18 seconds on 55X whole genome data, while maintaining high accuracy. We use adVNTR-NN to genotype 10,264 VNTRs in 652 GTEx individuals. Associating VNTR length with gene expression in 46 tissues, we identify 163 "eVNTRs". Of the 22 eVNTRs in blood where independent data is available, 21 (95%) are replicated in terms of significance and direction of association. 49% of the eVNTR loci show a strong and likely causal impact on the expression of genes and 80% have maximum effect size at least 0.3. The impacted genes are involved in diseases including Alzheimer's, obesity and familial cancers, highlighting the importance of VNTRs for understanding the genetic basis of complex diseases.
Affiliation(s)
- Mehrdad Bakhtiari
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA, USA
| | - Jonghun Park
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA, USA
| | - Yuan-Chun Ding
- Department of Population Sciences, Beckman Research Institute of City of Hope, Duarte, CA, USA
| | | | - Susan L Neuhausen
- Department of Population Sciences, Beckman Research Institute of City of Hope, Duarte, CA, USA
| | | | | | - Melissa Gymrek
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA, USA
- Department of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Vineet Bafna
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA, USA.
| |
|
32
|
|
33
|
Newton AH, Pask AJ. Evolution and expansion of the RUNX2 QA repeat corresponds with the emergence of vertebrate complexity. Commun Biol 2020; 3:771. [PMID: 33319865 PMCID: PMC7738678 DOI: 10.1038/s42003-020-01501-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/08/2020] [Accepted: 11/10/2020] [Indexed: 11/08/2022]
Abstract
Runt-related transcription factor 2 (RUNX2) is critical for the development of the vertebrate bony skeleton. Unlike other RUNX family members, RUNX2 possesses a variable poly-glutamine, poly-alanine (QA) repeat domain. Natural variation within this repeat is able to alter the transactivation potential of RUNX2, acting as an evolutionary 'tuning knob' suggested to influence mammalian skull shape. However, the broader role of the RUNX2 QA repeat throughout vertebrate evolution is unknown. In this perspective, we examine the role of the RUNX2 QA repeat during skeletal development and discuss how its emergence and expansion may have facilitated the evolution of morphological novelty in vertebrates.
Affiliation(s)
- Axel H Newton
- Biosciences 4, The School of Biosciences, The University of Melbourne, Royal Parade, Parkville, VIC, 3052, Australia.
- Anatomy and Developmental Biology, The School of Biomedical Sciences, Monash University, Clayton, VIC, 3800, Australia.
| | - Andrew J Pask
- Biosciences 4, The School of Biosciences, The University of Melbourne, Royal Parade, Parkville, VIC, 3052, Australia
| |
|
34
|
Balzano E, Pelliccia F, Giunta S. Genome (in)stability at tandem repeats. Semin Cell Dev Biol 2020; 113:97-112. [PMID: 33109442 DOI: 10.1016/j.semcdb.2020.10.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/06/2020] [Revised: 09/26/2020] [Accepted: 10/10/2020] [Indexed: 12/12/2022]
Abstract
Repeat sequences account for over half of the human genome and represent a significant source of variation that underlies physiological and pathological states. Yet, their study has been hindered due to limitations in short-read sequencing technology and difficulties in assembly. An important category of repetitive DNA in the human genome is comprised of tandem repeats (TRs), where repetitive units are arranged in a head-to-tail pattern. Compared to other regions of the genome, TRs carry a mutation rate between 10- and 10,000-fold higher. There are several mutagenic mechanisms that can give rise to this propensity toward instability, but their precise contribution remains speculative. Given the high degree of homology between these sequences and their arrangement in tandem, once damaged, TRs have an intrinsic propensity to undergo aberrant recombination with non-allelic exchange and generate harmful rearrangements that may undermine the stability of the entire genome. The dynamic mutagenesis at TRs has been found to underlie individual polymorphism associated with neurodegenerative and neuromuscular disorders, as well as complex genetic diseases like cancer and diabetes. Here, we review our current understanding of the surveillance and repair mechanisms operating within these regions, and we describe how alterations in these protective processes can readily trigger mutational signatures found at TRs, ultimately resulting in the pathological correlation between TR instability and human diseases. Finally, we provide a viewpoint to counter the detrimental effects that TRs pose in light of their selection and conservation, as important drivers of human evolution.
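As an illustration of the head-to-tail arrangement described in this abstract, the following is a minimal sketch that scans a DNA string for perfect tandem repeats with a regular expression. It is not the method of any tool cited in this list; the function name `find_tandem_repeats`, its parameters, and the example sequence are hypothetical, and real genotypers also handle impure repeats and sequencing errors, which this sketch ignores.

```python
import re

def find_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, motif, copies) for perfect head-to-tail repeats.

    Hypothetical helper for illustration only: it reports exact repeats
    and does not merge rotations of the same motif or tolerate mismatches.
    """
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # One motif of length `unit`, followed by at least
        # (min_copies - 1) further exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_copies - 1))
        for match in pattern.finditer(seq):
            copies = len(match.group(0)) // unit
            hits.append((match.start(), match.group(1), copies))
    return hits

# Made-up example: five CAG units flanked by non-repetitive bases.
example = "TTA" + "CAG" * 5 + "TTA"
print(find_tandem_repeats(example))
```

The flanking bases in the example were chosen so that only one rotation of the motif matches; on real sequence, the same repeat can be reported under a rotated motif (for instance GCA instead of CAG), so a practical implementation would normalize motifs before comparing loci.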
Affiliation(s)
- Elisa Balzano
- Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy
| | - Franca Pelliccia
- Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy
| | - Simona Giunta
- The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA; Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy.
| |
|
35
|
Trost B, Engchuan W, Nguyen CM, Thiruvahindrapuram B, Dolzhenko E, Backstrom I, Mirceta M, Mojarad BA, Yin Y, Dov A, Chandrakumar I, Prasolava T, Shum N, Hamdan O, Pellecchia G, Howe JL, Whitney J, Klee EW, Baheti S, Amaral DG, Anagnostou E, Elsabbagh M, Fernandez BA, Hoang N, Lewis MES, Liu X, Sjaarda C, Smith IM, Szatmari P, Zwaigenbaum L, Glazer D, Hartley D, Stewart AK, Eberle MA, Sato N, Pearson CE, Scherer SW, Yuen RKC. Genome-wide detection of tandem DNA repeats that are expanded in autism. Nature 2020; 586:80-86. [PMID: 32717741 PMCID: PMC9348607 DOI: 10.1038/s41586-020-2579-z] [Citation(s) in RCA: 115] [Impact Index Per Article: 28.8] [Received: 11/16/2019] [Accepted: 06/05/2020] [Indexed: 12/31/2022]
Abstract
Tandem DNA repeats vary in the size and sequence of each unit (motif). When expanded, these tandem DNA repeats have been associated with more than 40 monogenic disorders. Their involvement in disorders with complex genetics is largely unknown, as is the extent of their heterogeneity. Here we investigated the genome-wide characteristics of tandem repeats that had motifs with a length of 2-20 base pairs in 17,231 genomes of families containing individuals with autism spectrum disorder (ASD) and population control individuals. We found extensive polymorphism in the size and sequence of motifs. Many of the tandem repeat loci that we detected correlated with cytogenetic fragile sites. At 2,588 loci, gene-associated expansions of tandem repeats that were rare among population control individuals were significantly more prevalent among individuals with ASD than their siblings without ASD, particularly in exons and near splice junctions, and in genes related to the development of the nervous system and cardiovascular system or muscle. Rare tandem repeat expansions had a prevalence of 23.3% in children with ASD compared with 20.7% in children without ASD, which suggests that tandem repeat expansions make a collective contribution to the risk of ASD of 2.6%. These rare tandem repeat expansions included previously undescribed ASD-linked expansions in DMPK and FXN, which are associated with neuromuscular conditions, and in previously unknown loci such as FGF14 and CACNB1. Rare tandem repeat expansions were associated with lower IQ and adaptive ability. Our results show that tandem DNA repeat expansions contribute strongly to the genetic aetiology and phenotypic complexity of ASD.
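The 2.6% figure quoted in this abstract appears to be the simple difference between the two reported prevalences. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the calculation only reproduces the subtraction, not the statistical analysis in the paper.

```python
def prevalence_difference(pct_cases, pct_controls):
    """Difference in prevalence, in percentage points, rounded to one decimal."""
    return round(pct_cases - pct_controls, 1)

# Values taken from the abstract: 23.3% (children with ASD) vs 20.7% (without ASD).
excess = prevalence_difference(23.3, 20.7)
print(f"Excess prevalence: {excess} percentage points")
```

Rounding is applied because binary floating-point subtraction of these two values does not yield exactly 2.6.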
Affiliation(s)
- Brett Trost
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Worrawat Engchuan
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Charlotte M Nguyen
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Bhooma Thiruvahindrapuram
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | | | - Ian Backstrom
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Mila Mirceta
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Bahareh A Mojarad
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Yue Yin
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Alona Dov
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Induja Chandrakumar
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Tanya Prasolava
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Natalie Shum
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Omar Hamdan
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Giovanna Pellecchia
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Jennifer L Howe
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Joseph Whitney
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Eric W Klee
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Center for Individualized Medicine, Mayo Clinic, Rochester, MN, USA
| | - Saurabh Baheti
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - David G Amaral
- MIND Institute and Department of Psychiatry and Behavioral Sciences, University of California Davis School of Medicine, Sacramento, CA, USA
| | - Evdokia Anagnostou
- Holland Bloorview Kids Rehabilitation Hospital, University of Toronto, Toronto, Ontario, Canada
| | - Mayada Elsabbagh
- Montreal Neurological Institute and Azrieli Centre for Autism Research, McGill University, Montreal, Quebec, Canada
| | - Bridget A Fernandez
- Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
| | - Ny Hoang
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - M E Suzanne Lewis
- Medical Genetics, University of British Columbia (UBC), Vancouver, British Columbia, Canada
- BC Children's Hospital Research Institute, Vancouver, British Columbia, Canada
| | - Xudong Liu
- Department of Psychiatry, Queen's University, Kingston, Ontario, Canada
| | - Calvin Sjaarda
- Department of Psychiatry, Queen's University, Kingston, Ontario, Canada
| | - Isabel M Smith
- Department of Pediatrics, Dalhousie University, Halifax, Nova Scotia, Canada
- IWK Health Centre, Halifax, Nova Scotia, Canada
| | - Peter Szatmari
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
- Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Department of Psychiatry, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Lonnie Zwaigenbaum
- Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada
| | - David Glazer
- Verily Life Sciences, South San Francisco, CA, USA
| | | | - A Keith Stewart
- Center for Individualized Medicine, Mayo Clinic, Rochester, MN, USA
- Division of Hematology, Mayo Clinic, Rochester, MN, USA
| | | | - Nozomu Sato
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Christopher E Pearson
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Stephen W Scherer
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- McLaughlin Centre, University of Toronto, Toronto, Ontario, Canada
| | - Ryan K C Yuen
- Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada.
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.
| |
Collapse
|
36
|
Course MM, Gudsnuk K, Smukowski SN, Winston K, Desai N, Ross JP, Sulovari A, Bourassa CV, Spiegelman D, Couthouis J, Yu CE, Tsuang DW, Jayadev S, Kay MA, Gitler AD, Dupre N, Eichler EE, Dion PA, Rouleau GA, Valdmanis PN. Evolution of a Human-Specific Tandem Repeat Associated with ALS. Am J Hum Genet 2020; 107:445-460. [PMID: 32750315 DOI: 10.1016/j.ajhg.2020.07.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/23/2020] [Accepted: 07/08/2020] [Indexed: 12/12/2022] Open
Abstract
Tandem repeats are proposed to contribute to human-specific traits, and more than 40 tandem repeat expansions are known to cause neurological disease. Here, we characterize a human-specific 69 bp variable number tandem repeat (VNTR) in the last intron of WDR7, which exhibits striking variability in both copy number and nucleotide composition, as revealed by long-read sequencing. In addition, greater repeat copy number is significantly enriched in three independent cohorts of individuals with sporadic amyotrophic lateral sclerosis (ALS). Each unit of the repeat forms a stem-loop structure with the potential to produce microRNAs, and the repeat RNA can aggregate when expressed in cells. We leveraged its remarkable sequence variability to align the repeat in 288 samples and uncover its mechanism of expansion. We found that the repeat expands in the 3'-5' direction, in groups of repeat units divisible by two. The expansion patterns we observed were consistent with duplication events, and a replication error called template switching. We also observed that the VNTR is expanded in both Denisovan and Neanderthal genomes but is fixed at one copy or fewer in non-human primates. Evaluating the repeat in 1000 Genomes Project samples reveals that some repeat segments are solely present or absent in certain geographic populations. The large size of the repeat unit in this VNTR, along with our multiplexed sequencing strategy, provides an unprecedented opportunity to study mechanisms of repeat expansion, and a framework for evaluating the roles of VNTRs in human evolution and disease.
|
37
|
Piras IS, Picinelli C, Iennaco R, Baccarin M, Castronovo P, Tomaiuolo P, Cucinotta F, Ricciardello A, Turriziani L, Nanetti L, Mariotti C, Gellera C, Lintas C, Sacco R, Zuccato C, Cattaneo E, Persico AM. Huntingtin gene CAG repeat size affects autism risk: Family-based and case-control association study. Am J Med Genet B Neuropsychiatr Genet 2020; 183:341-351. [PMID: 32652810 DOI: 10.1002/ajmg.b.32806] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/15/2019] [Revised: 04/20/2020] [Accepted: 05/04/2020] [Indexed: 11/10/2022]
Abstract
The Huntingtin (HTT) gene contains a CAG repeat in exon 1, whose expansion beyond 39 repeats consistently leads to Huntington's disease (HD), whereas normal-to-intermediate alleles seemingly modulate brain structure, function and behavior. The role of the CAG repeat in Autism Spectrum Disorder (ASD) was investigated applying both family-based and case-control association designs, with the SCA3 repeat as a negative control. Significant overtransmission of "long" CAG alleles (≥17 repeats) to autistic children and of "short" alleles (≤16 repeats) to their unaffected siblings (all p < 10⁻⁵) was observed in 612 ASD families (548 simplex and 64 multiplex). Surprisingly, both 193 population controls and 1,188 neurological non-HD controls have significantly lower frequencies of "short" CAG alleles compared to 185 unaffected siblings and higher rates of "long" alleles compared to 548 ASD patients from the same families (p < .05-.001). The SCA3 CAG repeat displays no association. "Short" HTT alleles seemingly exert a protective effect from clinically overt autism in families carrying a genetic predisposition for ASD, while "long" alleles may enhance autism risk. Differential penetrance of autism-inducing genetic/epigenetic variants may imply atypical developmental trajectories linked to HTT functions, including excitation/inhibition imbalance, cortical neurogenesis and apoptosis, neuronal migration, synapse formation, connectivity and homeostasis.
Affiliation(s)
- Ignazio Stefano Piras
  - Neurogenomics Division, The Translational Genomics Research Institute, Phoenix, Arizona, USA
- Chiara Picinelli
  - Mafalda Luce Center for Pervasive Developmental Disorders, Milan, Italy
- Raffaele Iennaco
  - Department of Biosciences, Università degli Studi di Milano, Milan, Italy
  - Istituto Nazionale di Genetica Molecolare "Romeo ed Enrica Invernizzi", Milan, Italy
- Marco Baccarin
  - Mafalda Luce Center for Pervasive Developmental Disorders, Milan, Italy
- Paola Castronovo
  - Mafalda Luce Center for Pervasive Developmental Disorders, Milan, Italy
- Pasquale Tomaiuolo
  - Interdepartmental Program "Autism 0-90", "Gaetano Martino" University Hospital, University of Messina, Messina, Italy
- Francesca Cucinotta
  - Interdepartmental Program "Autism 0-90", "Gaetano Martino" University Hospital, University of Messina, Messina, Italy
- Arianna Ricciardello
  - Interdepartmental Program "Autism 0-90", "Gaetano Martino" University Hospital, University of Messina, Messina, Italy
- Laura Turriziani
  - Interdepartmental Program "Autism 0-90", "Gaetano Martino" University Hospital, University of Messina, Messina, Italy
- Lorenzo Nanetti
  - Unit of Medical Genetics and Neurogenetics, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
- Caterina Mariotti
  - Unit of Medical Genetics and Neurogenetics, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
- Cinzia Gellera
  - Unit of Medical Genetics and Neurogenetics, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
- Carla Lintas
  - Unit of Child and Adolescent NeuroPsychiatry & Laboratory of Molecular Psychiatry and Neurogenetics, University Campus Bio-Medico, Rome, Italy
- Roberto Sacco
  - Unit of Child and Adolescent NeuroPsychiatry & Laboratory of Molecular Psychiatry and Neurogenetics, University Campus Bio-Medico, Rome, Italy
- Chiara Zuccato
  - Department of Biosciences, Università degli Studi di Milano, Milan, Italy
  - Istituto Nazionale di Genetica Molecolare "Romeo ed Enrica Invernizzi", Milan, Italy
- Elena Cattaneo
  - Department of Biosciences, Università degli Studi di Milano, Milan, Italy
  - Istituto Nazionale di Genetica Molecolare "Romeo ed Enrica Invernizzi", Milan, Italy
- Antonio M Persico
  - Interdepartmental Program "Autism 0-90", "Gaetano Martino" University Hospital, University of Messina, Messina, Italy
|
38
|
Expanding genes, repeating themes and therapeutic schemes: The neurobiology of tandem repeat disorders. Neurobiol Dis 2020; 144:105053. [PMID: 32810583 DOI: 10.1016/j.nbd.2020.105053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 02/07/2023] Open
|
39
|
Khristich AN, Mirkin SM. On the wrong DNA track: Molecular mechanisms of repeat-mediated genome instability. J Biol Chem 2020; 295:4134-4170. [PMID: 32060097 PMCID: PMC7105313 DOI: 10.1074/jbc.rev119.007678] [Citation(s) in RCA: 148] [Impact Index Per Article: 37.0] [Indexed: 12/15/2022] Open
Abstract
Expansions of simple tandem repeats are responsible for almost 50 human diseases, the majority of which are severe, degenerative, and not currently treatable or preventable. In this review, we first describe the molecular mechanisms of repeat-induced toxicity, which is the connecting link between repeat expansions and pathology. We then survey alternative DNA structures that are formed by expandable repeats and review the evidence that formation of these structures is at the core of repeat instability. Next, we describe the consequences of the presence of long structure-forming repeats at the molecular level: somatic and intergenerational instability, fragility, and repeat-induced mutagenesis. We discuss the reasons for gender bias in intergenerational repeat instability and the tissue specificity of somatic repeat instability. We also review the known pathways in which DNA replication, transcription, DNA repair, and chromatin state interact and thereby promote repeat instability. We then discuss possible reasons for the persistence of disease-causing DNA repeats in the genome. We describe evidence suggesting that these repeats are a payoff for the advantages of having abundant simple-sequence repeats for eukaryotic genome function and evolvability. Finally, we discuss two unresolved fundamental questions: (i) why does repeat behavior differ between model systems and human pedigrees, and (ii) can we use current knowledge on repeat instability mechanisms to cure repeat expansion diseases?
Affiliation(s)
- Sergei M Mirkin
  - Department of Biology, Tufts University, Medford, Massachusetts 02155
|
40
|
Kinney N, Kang L, Eckstrand L, Pulenthiran A, Samuel P, Anandakrishnan R, Varghese RT, Michalak P, Garner HR. Abundance of ethnically biased microsatellites in human gene regions. PLoS One 2019; 14:e0225216. [PMID: 31830051 PMCID: PMC6907796 DOI: 10.1371/journal.pone.0225216] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/30/2019] [Accepted: 10/29/2019] [Indexed: 12/16/2022] Open
Abstract
Microsatellites, a type of short tandem repeat (STR), have been used for decades as putatively neutral markers to study the genetic structure of diverse human populations. However, recent studies have demonstrated that some microsatellites contribute to gene expression, cis heritability, and phenotype. As a corollary, some microsatellites may contribute to differential gene expression and RNA/protein structure stability in distinct human populations. To test this hypothesis, we investigate genotype frequencies, functional relevance, and adaptive potential of microsatellites in five super-populations (ethnicities) drawn from the 1000 Genomes Project. We discover 3,984 ethnically-biased microsatellite loci (EBML); for each EBML at least one ethnicity has genotype frequencies statistically different from the remaining four. South Asian, East Asian, European, and American EBML show significant overlap; on the contrary, the set of African EBML is mostly unique. We cross-reference the 3,984 EBML with 2,060 previously identified expression STRs (eSTRs); repeats known to affect gene expression (64 total) are over-represented. The most significant pathway enrichments are those associated with the matrisome: a broad collection of genes encoding the extracellular matrix and its associated proteins. At least 14 of the EBML have established links to human disease. Analysis of the 3,984 EBML with respect to known selective sweep regions in the genome shows that allelic variation in some of them is likely associated with adaptive evolution.
Affiliation(s)
- Nick Kinney
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
  - Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America
- Lin Kang
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
  - Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America
- Laurel Eckstrand
  - Virginia-Maryland College of Veterinary Medicine, Blacksburg, VA, United States of America
- Arichanah Pulenthiran
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Peter Samuel
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Ramu Anandakrishnan
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Robin T. Varghese
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- P. Michalak
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
  - Virginia-Maryland College of Veterinary Medicine, Blacksburg, VA, United States of America
  - Institute of Evolution, University of Haifa, Haifa, Israel
- Harold R. Garner
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
  - Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America
|
41
|
Dolzhenko E, Deshpande V, Schlesinger F, Krusche P, Petrovski R, Chen S, Emig-Agius D, Gross A, Narzisi G, Bowman B, Scheffler K, van Vugt JJFA, French C, Sanchis-Juan A, Ibáñez K, Tucci A, Lajoie BR, Veldink JH, Raymond FL, Taft RJ, Bentley DR, Eberle MA. ExpansionHunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions. Bioinformatics 2019; 35:4754-4756. [PMID: 31134279 DOI: 10.1101/361162] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/10/2019] [Revised: 04/26/2019] [Accepted: 05/23/2019] [Indexed: 05/25/2023]
Abstract
Summary: We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci. Availability and implementation: ExpansionHunter is implemented in C++ and is available under the Apache License Version 2.0. The source code, documentation, and Linux/macOS binaries are available at https://github.com/Illumina/ExpansionHunter/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Peter Krusche
  - Illumina Cambridge Ltd, Illumina Centre, 19 Granta Park, Great Abington, Cambridge CB21 6DF, UK
- Roman Petrovski
  - Illumina Cambridge Ltd, Illumina Centre, 19 Granta Park, Great Abington, Cambridge CB21 6DF, UK
- Sai Chen
  - Illumina Inc., San Diego, CA 92122, USA
- Giuseppe Narzisi
  - Computational Biology, New York Genome Center, New York, NY 10013, USA
- Joke J F A van Vugt
  - UMC Utrecht Brain Center, Utrecht University, 3508 AB Utrecht, The Netherlands
- Courtney French
  - Department of Medical Genetics, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, UK
- Alba Sanchis-Juan
  - Department of Haematology, University of Cambridge, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, UK
  - NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK
- Kristina Ibáñez
  - Genomics England, Queen Mary University London, London EC1M 6BQ, UK
- Arianna Tucci
  - Genomics England, Queen Mary University London, London EC1M 6BQ, UK
- Jan H Veldink
  - UMC Utrecht Brain Center, Utrecht University, 3508 AB Utrecht, The Netherlands
- F Lucy Raymond
  - Department of Medical Genetics, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, UK
- David R Bentley
  - Illumina Cambridge Ltd, Illumina Centre, 19 Granta Park, Great Abington, Cambridge CB21 6DF, UK
|
42
|
Mousavi N, Shleizer-Burko S, Yanicky R, Gymrek M. Profiling the genome-wide landscape of tandem repeat expansions. Nucleic Acids Res 2019; 47:e90. [PMID: 31194863 PMCID: PMC6735967 DOI: 10.1093/nar/gkz501] [Citation(s) in RCA: 121] [Impact Index Per Article: 24.2] [Received: 03/06/2019] [Revised: 05/15/2019] [Accepted: 05/28/2019] [Indexed: 12/15/2022] Open
Abstract
Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington's Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome harbors hundreds of thousands of TRs, analysis of TR expansions has been mainly limited to known pathogenic loci. A major challenge is that expanded repeats are beyond the read length of most next-generation sequencing (NGS) datasets and are not profiled by existing genome-wide tools. We present GangSTR, a novel algorithm for genome-wide genotyping of both short and expanded TRs. GangSTR extracts information from paired-end reads into a unified model to estimate maximum likelihood TR lengths. We validate GangSTR on real and simulated data and show that GangSTR outperforms alternative methods in both accuracy and speed. We apply GangSTR to a deeply sequenced trio to profile the landscape of TR expansions in a healthy family and validate novel expansions using orthogonal technologies. Our analysis reveals that healthy individuals harbor dozens of long TR alleles not captured by current genome-wide methods. GangSTR will likely enable discovery of novel disease-associated variants not currently accessible from NGS.
Affiliation(s)
- Nima Mousavi
  - Department of Electrical and Computer Engineering, University of California San Diego, 9500 Gilman Drive, MC 0639, La Jolla, CA 92093, USA
- Sharona Shleizer-Burko
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, MC 0639, La Jolla, CA 92093, USA
- Richard Yanicky
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, MC 0639, La Jolla, CA 92093, USA
- Melissa Gymrek
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, MC 0639, La Jolla, CA 92093, USA
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, MC 0639, La Jolla, CA 92093, USA
|
43
|
Gardiner SL, Trompet S, Sabayan B, Boogaard MW, Jukema JW, Slagboom PE, Roos RAC, van der Grond J, Aziz NA. Repeat variations in polyglutamine disease-associated genes and cognitive function in old age. Neurobiol Aging 2019; 84:236.e17-236.e28. [PMID: 31522753 DOI: 10.1016/j.neurobiolaging.2019.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/30/2019] [Revised: 08/03/2019] [Accepted: 08/04/2019] [Indexed: 02/03/2023]
Abstract
Although the heritability of cognitive function in old age is substantial, genome-wide association studies have had limited success in elucidating its genetic basis, leaving a considerable amount of "missing heritability." Aside from single nucleotide polymorphisms, genome-wide association studies are unable to assess other large sources of genetic variation, such as tandem repeat polymorphisms. Therefore, here, we studied the association of cytosine-adenine-guanine (CAG) repeat variations in polyglutamine disease-associated genes (PDAGs) with cognitive function in older adults. In a large cohort consisting of 5786 participants, we found that the CAG repeat numbers in 3 PDAGs (TBP, HTT, and AR) were significantly associated with the decline in cognitive function, which together accounted for 0.49% of the variation. Furthermore, in a magnetic resonance imaging substudy, we found that CAG repeat polymorphisms in 4 PDAGs (ATXN2, CACNA1A, ATXN7, and AR) were associated with different imaging characteristics, including brain stem, putamen, globus pallidus, thalamus, and amygdala volumes. Our findings indicate that tandem repeat polymorphisms are associated with cognitive function in older adults and highlight the importance of PDAGs in elucidating its missing heritability.
Affiliation(s)
- Sarah L Gardiner
  - Department of Neurology, Leiden University Medical Centre, Leiden, the Netherlands
  - Department of Human Genetics, Leiden University Medical Centre, Leiden, the Netherlands
- Stella Trompet
  - Department of Internal Medicine, Section of Gerontology and Geriatrics, Leiden University Medical Centre, Leiden, the Netherlands
- Behnam Sabayan
  - The Ken and Ruth Davee Department of Neurology, Northwestern University, Chicago, IL, USA
- Merel W Boogaard
  - Department of Clinical Genetics, Leiden University Medical Centre, Leiden, the Netherlands
- J Wouter Jukema
  - Department of Cardiology, Leiden University Medical Centre, Leiden, the Netherlands
- P Eline Slagboom
  - Department of Molecular Epidemiology, Leiden University Medical Centre, Leiden, the Netherlands
- Raymund A C Roos
  - Department of Neurology, Leiden University Medical Centre, Leiden, the Netherlands
- Jeroen van der Grond
  - Department of Radiology, Leiden University Medical Centre, Leiden, the Netherlands
- N Ahmad Aziz
  - Population Health Sciences, German Centre for Neurodegenerative Diseases (DZNE), Bonn, Germany
  - Department of Neurology, University of Bonn, Bonn, Germany
|
44
|
Yeshurun S, Hannan AJ. Transgenerational epigenetic influences of paternal environmental exposures on brain function and predisposition to psychiatric disorders. Mol Psychiatry 2019. [PMID: 29520039 DOI: 10.1038/s41380-018-0039-z] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Indexed: 12/13/2022]
Abstract
In recent years, striking new evidence has demonstrated non-genetic inheritance of acquired traits associated with parental environmental exposures. In particular, this transgenerational modulation of phenotypic traits is of direct relevance to psychiatric disorders, including depression, post-traumatic stress disorder, and other anxiety disorders. Here we review the recent progress in this field, with an emphasis on acquired traits of psychiatric illnesses transmitted epigenetically via the male lineage. We discuss the transgenerational effects of paternal exposure to stress vs. positive stimuli, such as exercise, and discuss their impact on the behavioral, affective and cognitive characteristics of their progeny. Furthermore, we review the recent evidence suggesting that these transgenerational effects are mediated by epigenetic mechanisms, including changes in DNA methylation and small non-coding RNAs in the sperm. We discuss the urgent need for more research exploring transgenerational epigenetic effects in animal models and human populations. These future studies may identify epigenetic mechanisms as potential contributors to the 'missing heritability' observed in genome-wide association studies of psychiatric illnesses and other human disorders. This exciting new field of transgenerational epigenomics will facilitate the development of novel strategies to predict, prevent and treat negative epigenetic consequences on offspring health, and psychiatric disorders in particular.
Affiliation(s)
- Shlomo Yeshurun
  - Florey Institute of Neuroscience and Mental Health, Melbourne Brain Centre, University of Melbourne, Parkville, VIC, 3010, Australia
- Anthony J Hannan
  - Florey Institute of Neuroscience and Mental Health, Melbourne Brain Centre, University of Melbourne, Parkville, VIC, 3010, Australia
  - Department of Anatomy and Neuroscience, University of Melbourne, Parkville, VIC, 3010, Australia
|
45
|
Repeat length variations in ATXN1 and AR modify disease expression in Alzheimer's disease. Neurobiol Aging 2019; 73:230.e9-230.e17. [DOI: 10.1016/j.neurobiolaging.2018.09.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/27/2018] [Revised: 08/07/2018] [Accepted: 09/07/2018] [Indexed: 02/06/2023]
|
46
|
Tyebji S, Seizova S, Hannan AJ, Tonkin CJ. Toxoplasmosis: A pathway to neuropsychiatric disorders. Neurosci Biobehav Rev 2018; 96:72-92. [PMID: 30476506 DOI: 10.1016/j.neubiorev.2018.11.012] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 05/25/2018] [Revised: 10/23/2018] [Accepted: 11/22/2018] [Indexed: 12/24/2022]
Abstract
Toxoplasma gondii is an obligate intracellular parasite that resides, in a latent form, in the human central nervous system. Infection with Toxoplasma drastically alters the behaviour of rodents and is associated with the incidence of specific neuropsychiatric conditions in humans. But the question remains: how does this pervasive human pathogen alter behaviour of the mammalian host? This fundamental question is receiving increasing attention as it has far reaching public health implications for a parasite that is very common in human populations. Our current understanding centres on neuronal changes that are elicited directly by this intracellular parasite versus indirect changes that occur due to activation of the immune system within the CNS, or a combination of both. In this review, we explore the interactions between Toxoplasma and its host, the proposed mechanisms and consequences on neuronal function and mental health, and discuss Toxoplasma infection as a public health issue.
Collapse
Affiliation(s)
- Shiraz Tyebji
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, 3052, Australia; Department of Medical Biology, The University of Melbourne, Melbourne, 3052, Australia; Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, 3052, Victoria, Australia.
| | - Simona Seizova
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, 3052, Australia; Department of Medical Biology, The University of Melbourne, Melbourne, 3052, Australia.
| | - Anthony J Hannan
- Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, 3052, Victoria, Australia; Department of Anatomy and Neuroscience, University of Melbourne, Parkville, 3052, Victoria, Australia.
| | - Christopher J Tonkin
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, 3052, Australia; Department of Medical Biology, The University of Melbourne, Melbourne, 3052, Australia.
|
47
|
Saini S, Mitra I, Mousavi N, Fotsing SF, Gymrek M. A reference haplotype panel for genome-wide imputation of short tandem repeats. Nat Commun 2018; 9:4397. [PMID: 30353011 PMCID: PMC6199332 DOI: 10.1038/s41467-018-06694-0] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 03/09/2018] [Accepted: 09/18/2018] [Indexed: 12/14/2022] Open
Abstract
Short tandem repeats (STRs) are involved in dozens of Mendelian disorders and have been implicated in complex traits. However, genotyping arrays used in genome-wide association studies focus on single nucleotide polymorphisms (SNPs) and do not readily allow identification of STR associations. We leverage next-generation sequencing (NGS) from 479 families to create a SNP + STR reference haplotype panel. Our panel enables imputing STR genotypes into SNP array data when NGS is not available for directly genotyping STRs. Imputed genotypes achieve mean concordance of 97% with observed genotypes in an external dataset compared to 71% expected under a naive model. Performance varies widely across STRs, with near perfect concordance at bi-allelic STRs vs. 70% at highly polymorphic repeats. Imputation increases power over individual SNPs to detect STR associations with gene expression. Imputing STRs into existing SNP datasets will enable the first large-scale STR association studies across a range of complex traits.
Affiliation(s)
- Shubham Saini
- Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
| | - Ileena Mitra
- Bioinformatics and Systems Biology Program, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
| | - Nima Mousavi
- Department of Electrical and Computer Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
| | - Stephanie Feupe Fotsing
- Bioinformatics and Systems Biology Program, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Department of Biomedical Informatics, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
| | - Melissa Gymrek
- Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA.
- Department of Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA.
|
48
|
Song JHT, Lowe CB, Kingsley DM. Characterization of a Human-Specific Tandem Repeat Associated with Bipolar Disorder and Schizophrenia. Am J Hum Genet 2018; 103:421-430. [PMID: 30100087 PMCID: PMC6128321 DOI: 10.1016/j.ajhg.2018.07.011] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 05/02/2018] [Accepted: 07/13/2018] [Indexed: 10/28/2022] Open
Abstract
Bipolar disorder (BD) and schizophrenia (SCZ) are highly heritable diseases that affect more than 3% of individuals worldwide. Genome-wide association studies have strongly and repeatedly linked risk for both of these neuropsychiatric diseases to a 100 kb interval in the third intron of the human calcium channel gene CACNA1C. However, the causative mutation is not yet known. We have identified a human-specific tandem repeat in this region that is composed of 30 bp units, often repeated hundreds of times. This large tandem repeat is unstable using standard polymerase chain reaction and bacterial cloning techniques, which may have resulted in its incorrect size in the human reference genome. The large 30-mer repeat region is polymorphic in both size and sequence in human populations. Particular sequence variants of the 30-mer are associated with risk status at several flanking single-nucleotide polymorphisms in the third intron of CACNA1C that have previously been linked to BD and SCZ. The tandem repeat arrays function as enhancers that increase reporter gene expression in a human neural progenitor cell line. Different human arrays vary in the magnitude of enhancer activity, and the 30-mer arrays associated with increased psychiatric disease risk status have decreased enhancer activity. Changes in the structure and sequence of these arrays likely contribute to changes in CACNA1C function during human evolution and may modulate neuropsychiatric disease risk in modern human populations.
Affiliation(s)
- Janet H T Song
- Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Craig B Lowe
- Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA
| | - David M Kingsley
- Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA.
|
49
|
Repeat length variations in polyglutamine disease-associated genes affect body mass index. Int J Obes (Lond) 2018; 43:440-449. [PMID: 30120431 DOI: 10.1038/s41366-018-0161-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/27/2017] [Revised: 05/15/2018] [Accepted: 06/15/2018] [Indexed: 11/08/2022]
Abstract
BACKGROUND: The worldwide prevalence of obesity, a major risk factor for numerous debilitating chronic disorders, is increasing rapidly. Although a substantial amount of the variation in body mass index (BMI) is estimated to be heritable, the largest meta-analysis of genome-wide association studies (GWAS) to date explained only ~2.7% of the variation. To tackle this 'missing heritability' problem of obesity, here we focused on the contribution of DNA repeat length polymorphisms which are not detectable by GWAS.
SUBJECTS AND METHODS: We determined the cytosine-adenine-guanine (CAG) repeat length in the nine known polyglutamine disease-associated genes (ATXN1, ATXN2, ATXN3, CACNA1A, ATXN7, TBP, HTT, ATN1 and AR) in two large cohorts consisting of 12,457 individuals and analyzed their association with BMI, using generalized linear mixed-effect models.
RESULTS: We found a significant association between BMI and the length of CAG repeats in seven polyglutamine disease-associated genes (including ATXN1, ATXN2, ATXN3, CACNA1A, ATXN7, TBP and AR). Importantly, these repeat variations could account for 0.75% of the total BMI variation.
CONCLUSIONS: Our findings incriminate repeat polymorphisms as an important novel class of genetic risk factors of obesity and highlight the role of the brain in its pathophysiology.
|
50
|
Wu WJ, Liu KQ, Li BJ, Dong C, Zhang ZK, Li PH, Huang RH, Wei W, Chen J, Liu HL. Identification of an (AC)n microsatellite in the Six1 gene promoter and its effect on production traits in Pietrain × Duroc × Landrace × Yorkshire pigs. J Anim Sci 2018; 96:17-26. [PMID: 29432614 DOI: 10.1093/jas/skx024] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/05/2017] [Accepted: 02/05/2018] [Indexed: 12/22/2022] Open
Abstract
The Sine oculis homeobox 1 (Six1) gene is important for skeletal muscle growth and fiber specification; therefore, it is considered a promising candidate gene that may influence porcine growth and meat quality traits. Nevertheless, the association of Six1 with these processes and the mechanisms regulating its expression remain unclear. The objectives of this study were to identify variant sites of Six1 in different pig breeds, conduct association analysis to evaluate the relationship between polymorphisms of these variants and porcine production traits in Pietrain × Duroc × Landrace × Yorkshire commercial pigs, and explore the potential regulatory mechanisms of Six1 affecting production traits. A total of 12 variants were identified, including 10 single-nucleotide variations (SNVs), 1 insertion-deletion (Indel), and 1 (AC)n microsatellite. Association analysis demonstrated that the SNV, g.1595A>G, was significantly associated with meat color (redness, a*); individuals with the G allele had greater a* values (P < 0.05). Moreover, our results demonstrated that the (AC)n polymorphism in the Six1 promoter was significantly associated with weaning weight (P < 0.05), carcass weight (P < 0.05), and thoracic and lumbar back fat (P < 0.01). In addition, we found that the (AC)n variant was closely related to Six1 expression levels and demonstrated the effect of this polymorphism on promoter activity by in vitro experiments. Overall, this study provides novel evidence for elucidating the effects of Six1 on porcine production traits as a promising candidate gene and describes two variants associated with these traits, which are potential reference markers for pig molecular breeding. In addition, our data on the relationship between porcine Six1 expression and the polymorphic (AC)n microsatellite in its promoter may facilitate similar studies in other species.
Affiliation(s)
- W J Wu
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - K Q Liu
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - B J Li
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - C Dong
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Z K Zhang
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - P H Li
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - R H Huang
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - W Wei
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - J Chen
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - H L Liu
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
|