1. Wang P, Sheng X, Xia X, Wang F, Li R, Ahmed Z, Chen N, Lei C, Ma Z. The genomic landscape of short tandem repeats in cattle. Anim Genet 2025; 56:e13498. [PMID: 39692037 DOI: 10.1111/age.13498] [Received: 06/05/2024] [Revised: 12/04/2024] [Accepted: 12/05/2024] [Indexed: 12/19/2024]
Abstract
Short tandem repeats (STRs) are abundant and have high mutation rates across cattle genomes; however, comprehensive exploration of cattle STRs is needed. Here, we constructed a comprehensive map of 467 553 polymorphic STRs (pSTRs) from 423 cattle genomes representing 59 breeds worldwide. We observed that pSTRs in coding sequences and 5' untranslated regions (5'UTRs) were under strong selective constraints and exhibited a relatively low level of diversity. Furthermore, we found that these pSTRs underwent more contraction than expansion. Population analysis showed a strong positive correlation (R = 1) between pSTR diversity and single nucleotide polymorphic heterozygosity. We also investigated STR differences between taurine and indicine cattle and detected 2301 highly divergent STRs, which might relate to immune, endocrine and neurodevelopmental pathways. In summary, our large-scale study characterizes the spectrum of STRs in cattle, expands the scale of known cattle STR variation and provides novel insights into differences among various cattle subspecies.
Affiliation(s)
- Pengfei Wang
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
- Xin Sheng
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Academy of Animal Science and Veterinary Medicine, Qinghai University, Xining, China
- Xiaoting Xia
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
- Fuwen Wang
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
- Ruizhe Li
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Academy of Animal Science and Veterinary Medicine, Qinghai University, Xining, China
- Zulfiqar Ahmed
  - Department of Livestock and Poultry Production, Faculty of Veterinary and Animal Sciences, University of Poonch Rawalakot, Azad Jammu and Kashmir, Pakistan
- Ningbo Chen
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
- Chuzhao Lei
  - Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, China
- Zhijie Ma
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xining, China
  - Academy of Animal Science and Veterinary Medicine, Qinghai University, Xining, China
2. Cui Y, Arnold FJ, Li JS, Wu J, Wang D, Philippe J, Colwin MR, Michels S, Chen C, Sallam T, Thompson LM, La Spada AR, Li W. Multi-omic quantitative trait loci link tandem repeat size variation to gene regulation in human brain. Nat Genet 2025:10.1038/s41588-024-02057-2. [PMID: 39809899 DOI: 10.1038/s41588-024-02057-2] [Received: 02/26/2024] [Accepted: 12/10/2024] [Indexed: 01/16/2025]
Abstract
Tandem repeat (TR) size variation is implicated in ~50 neurological disorders, yet its impact on gene regulation in the human brain remains largely unknown. In the present study, we quantified the impact of TR size variation on brain gene regulation across distinct molecular phenotypes, based on 4,412 multi-omics samples from 1,597 donors, including 1,586 newly sequenced ones. We identified ~2.2 million TR molecular quantitative trait loci (TR-xQTLs), linking ~139,000 unique TRs to nearby molecular phenotypes, including many known disease-risk TRs, such as the G2C4 expansion in C9orf72 associated with amyotrophic lateral sclerosis. Fine-mapping revealed ~18,700 TRs as potential causal variants. Our in vitro experiments further confirmed the causal and independent regulatory effects of three TRs. Additional colocalization analysis indicated the potential causal role of TR variation in brain-related phenotypes, highlighted by a 3'-UTR TR in NUDT14 linked to cortical surface area and a TG repeat in PLEKHA1, associated with Alzheimer's disease.
Affiliation(s)
- Ya Cui
  - Division of Computational Biomedicine, Department of Biological Chemistry, University of California, Irvine, Irvine, CA, USA
- Frederick J Arnold
  - Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
- Jason Sheng Li
  - Division of Computational Biomedicine, Department of Biological Chemistry, University of California, Irvine, Irvine, CA, USA
- Jie Wu
  - Departments of Psychiatry and Human Behavior, Neurobiology and Behavior, and Biological Chemistry, University of California, Irvine, Irvine, CA, USA
- Dan Wang
  - Division of Cardiology, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Julien Philippe
  - Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
- Michael R Colwin
  - Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
- Sebastian Michels
  - Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
  - Department of Neurology, University of Ulm, Oberer Eselsberg, Ulm, Germany
- Chaorong Chen
  - Division of Computational Biomedicine, Department of Biological Chemistry, University of California, Irvine, Irvine, CA, USA
- Tamer Sallam
  - Division of Cardiology, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Leslie M Thompson
  - Departments of Psychiatry and Human Behavior, Neurobiology and Behavior, and Biological Chemistry, University of California, Irvine, Irvine, CA, USA
- Albert R La Spada
  - Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
  - UCI Center for Neurotherapeutics, University of California, Irvine, Irvine, CA, USA
- Wei Li
  - Division of Computational Biomedicine, Department of Biological Chemistry, University of California, Irvine, Irvine, CA, USA
3. Vorstman J, Sebat J, Bourque VR, Jacquemont S. Integrative genetic analysis: cornerstone of precision psychiatry. Mol Psychiatry 2025; 30:229-236. [PMID: 39215185 DOI: 10.1038/s41380-024-02706-2] [Received: 10/20/2023] [Revised: 08/13/2024] [Accepted: 08/19/2024] [Indexed: 09/04/2024]
Abstract
The role of genetic testing in the domain of neurodevelopmental and psychiatric disorders (NPDs) is gradually changing from providing etiological explanation for the presence of NPD phenotypes to also identifying young individuals at high risk of developing NPDs before their clinical manifestation. In clinical practice, the latter implies a shift towards the availability of individual genetic information predicting a certain liability to develop an NPD (e.g., autism, intellectual disability, psychosis etc.). The shift from mostly a posteriori explanation to increasingly a priori risk prediction is the by-product of the systematic implementation of whole exome or genome sequencing as part of routine diagnostic work-ups during the neonatal and prenatal periods. This rapid uptake of genetic testing early in development has far-reaching consequences for psychiatry: Whereas until recently individuals would come to medical attention because of signs of abnormal developmental and/or behavioral symptoms, increasingly, individuals are presented based on genetic liability for NPD outcomes before NPD symptoms emerge. This novel clinical scenario, while challenging, also creates opportunities for research on prevention interventions and precision medicine approaches. Here, we review why optimization of individual risk prediction is a key prerequisite for precision medicine in the sphere of NPDs, as well as the technological and statistical methods required to achieve this ambition.
Affiliation(s)
- Jacob Vorstman
  - Department of Psychiatry, The Hospital for Sick Children, Toronto, ON, Canada
  - Department of Psychiatry, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
  - Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, ON, Canada
- Jonathan Sebat
  - Department of Psychiatry, Department of Cellular & Molecular Medicine, Beyster Center of Psychiatric Genomics, University of California San Diego, San Diego, CA, USA
- Vincent-Raphaël Bourque
  - Centre de Recherche du Centre Hospitalier Universitaire Sainte-Justine, Montréal, QC, Canada
  - Department of Psychiatry, McGill University, Montréal, QC, Canada
- Sébastien Jacquemont
  - Centre de Recherche du Centre Hospitalier Universitaire Sainte-Justine, Montréal, QC, Canada
  - Département de Pédiatrie, Université de Montréal, Montréal, QC, Canada
4. Provatas K, Chantzi N, Patsakis M, Nayak A, Mouratidis I, Georgakopoulos-Soares I. Microsatellites explorer: A database of short tandem repeats across genomes. Comput Struct Biotechnol J 2024; 23:3817-3826. [PMID: 39525087 PMCID: PMC11550718 DOI: 10.1016/j.csbj.2024.10.041] [Received: 08/26/2024] [Revised: 10/24/2024] [Accepted: 10/24/2024] [Indexed: 11/16/2024]
Abstract
Short tandem repeats (STRs) are widespread repetitive elements with a number of biological functions and are among the most rapidly mutating regions in the genome. Their distribution varies significantly between taxonomic groups in the tree of life, and they are highly polymorphic within the human population. Advances in sequencing technologies coupled with decreasing costs have enabled the generation of an ever-growing number of complete genomes. Additionally, the arrival of accurate long reads has facilitated the generation of Telomere-to-Telomere (T2T) assemblies of complete genomes. Nevertheless, there is no comprehensive database that encompasses the STRs found per genome across different organisms and for different human genomes across diverse ancestries. Here we introduce Microsatellites Explorer, a database of STRs found in the genomes of 117,253 organisms across all major taxonomic groups, 15 T2T genome assemblies of different organisms, and 94 human haplotypes from the human pangenome. The database currently hosts 406,758,798 STR sequences, serving as a centralized user-friendly repository to perform searches, interactive visualizations, and download existing STR data for independent analysis. Microsatellites Explorer is implemented as a web portal for browsing, analyzing and downloading STR data. Microsatellites Explorer is publicly available at https://www.microsatellitesexplorer.com.
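To make the notion of an STR in the abstract above concrete, the following is a minimal Python sketch that scans a DNA string for perfect tandem repeats. It is an illustration only and is not the pipeline used to build Microsatellites Explorer, which the abstract does not describe. The function name `find_perfect_strs`, the `STR` record, and the default thresholds (motif length up to 6 bp, at least 3 copies) are assumptions chosen for the example.

```python
import re
from typing import Iterator, NamedTuple


class STR(NamedTuple):
    """One perfect tandem repeat tract (hypothetical record type)."""
    start: int   # 0-based start of the tract
    end: int     # 0-based exclusive end of the tract
    motif: str   # repeat unit as it appears at the tract start
    copies: int  # number of consecutive copies of the motif


def find_perfect_strs(seq: str, max_unit: int = 6, min_copies: int = 3) -> Iterator[STR]:
    """Yield perfect (mismatch-free) tandem repeats with motif length 1..max_unit.

    Assumed thresholds: max_unit and min_copies are illustrative defaults,
    not values taken from any published tool.
    """
    seq = seq.upper()
    # The lazy group captures the shortest motif that works at a position;
    # the backreference \1 then requires at least (min_copies - 1) more copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    for match in pattern.finditer(seq):
        motif = match.group(1)
        tract_length = match.end() - match.start()
        yield STR(match.start(), match.end(), motif, tract_length // len(motif))
```

For instance, `list(find_perfect_strs("TTACACACACGG"))` returns a single tract with motif `AC`, 4 copies, spanning positions 2 to 10 (end exclusive). Real STR callers also tolerate interruptions and mismatches, which this sketch deliberately ignores.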
Affiliation(s)
- Kimonas Provatas
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Nikol Chantzi
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Michail Patsakis
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Akshatha Nayak
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Ioannis Mouratidis
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
5. Olson DR, Wheeler TJ. ULTRA-effective labeling of tandem repeats in genomic sequence. Bioinform Adv 2024; 4:vbae149. [PMID: 39575229 PMCID: PMC11580682 DOI: 10.1093/bioadv/vbae149] [Received: 06/04/2024] [Revised: 09/19/2024] [Accepted: 10/07/2024] [Indexed: 11/24/2024]
Abstract
In the age of long-read sequencing, genomics researchers now have access to accurate repetitive DNA sequence (including satellites) that, due to the limitations of short-read sequencing, could previously be observed only as unmappable fragments. Tools that annotate repetitive sequence are now more important than ever, so that we can better understand newly uncovered repetitive sequences, and also so that we can mitigate errors in bioinformatic software caused by those repetitive sequences. To that end, we introduce the 1.0 release of our tool for identifying and annotating locally repetitive sequence, ULTRA Locates Tandemly Repetitive Areas (ULTRA). ULTRA is fast enough to use as part of an efficient annotation pipeline, produces state-of-the-art reliable coverage of repetitive regions containing many mutations, and provides interpretable statistics and labels for repetitive regions. Availability and implementation: ULTRA is released under an open source license, and is available for download at https://github.com/TravisWheelerLab/ULTRA.
Affiliation(s)
- Daniel R Olson
  - Department of Computer Science, University of Montana, Missoula, MT 59812, United States
- Travis J Wheeler
  - Department of Computer Science, University of Montana, Missoula, MT 59812, United States
  - Department of Pharmacy Practice & Science, R. Ken Coit College of Pharmacy, University of Arizona, Tucson, AZ 85721, United States
6. Chen Z, Morris HR, Polke J, Wood NW, Gandhi S, Ryten M, Houlden H, Tucci A. Repeat expansion disorders. Pract Neurol 2024:pn-2023-003938. [PMID: 39349043 DOI: 10.1136/pn-2023-003938] [Accepted: 08/25/2024] [Indexed: 10/02/2024]
Abstract
An increasing number of repeat expansion disorders have been found to cause both rare and common neurological disease. This is exemplified in recent discoveries of novel repeat expansions underlying a significant proportion of several late-onset neurodegenerative disorders, such as CANVAS (cerebellar ataxia, neuropathy and vestibular areflexia syndrome) and spinocerebellar ataxia type 27B. Most of the 60 described repeat expansion disorders to date are associated with neurological disease, providing substantial challenges for diagnosis, but also opportunities for management in a clinical neurology setting. Commonalities in clinical presentation, overarching diagnostic features and similarities in the approach to genetic testing justify considering these disorders collectively based on their unifying causative mechanism. In this review, we discuss the characteristics and diagnostic challenges of repeat expansion disorders for the neurologist and provide examples to highlight their clinical heterogeneity. With the ready availability of clinical-grade whole-genome sequencing for molecular diagnosis, we discuss the current approaches to testing for repeat expansion disorders and application in clinical practice.
Affiliation(s)
- Zhongbo Chen
  - Department of Clinical and Movement Neuroscience, University College London Queen Square Institute of Neurology, London, UK
  - The Francis Crick Institute, London, UK
- Huw R Morris
  - Department of Clinical and Movement Neuroscience, University College London Queen Square Institute of Neurology, London, UK
- James Polke
  - The Neurogenetics Laboratory, National Hospital for Neurology and Neurosurgery, University College London Hospitals NHS Foundation Trust, London, UK
- Nicholas W Wood
  - Department of Clinical and Movement Neuroscience, University College London Queen Square Institute of Neurology, London, UK
- Sonia Gandhi
  - Department of Clinical and Movement Neuroscience, University College London Queen Square Institute of Neurology, London, UK
  - The Francis Crick Institute, London, UK
- Mina Ryten
  - UK Dementia Research Institute at University of Cambridge, Cambridge, UK
- Henry Houlden
  - Department of Neuromuscular Disease, University College London Queen Square Institute of Neurology, London, UK
- Arianna Tucci
  - William Harvey Institute, Queen Mary University of London, London, UK
7. Hu Z, Lin G, Zhang M, Piao S, Fan J, Liu J, Liu P, Fu S, Sun W, Li L, Qiu X, Zhang J, Yang Y, Zhou C. Mechanistic Characterization of De Novo Generation of Variable Number Tandem Repeats in Circular Plasmids during Site-Directed Mutagenesis and Optimization for Coding Gene Application. Adv Biol (Weinh) 2024; 8:e2400084. [PMID: 38880850 DOI: 10.1002/adbi.202400084] [Received: 02/21/2024] [Revised: 04/21/2024] [Indexed: 06/18/2024]
Abstract
Site-directed mutagenesis for creating point mutations sometimes gives rise to plasmids carrying local variable number tandem repeats (VNTRs), which are arbitrarily regarded as polymerase chain reaction (PCR)-related artifacts. Here, it is reported that the alternative end-joining mechanism, rather than PCR artifacts, largely accounts for VNTR formation and expansion. While generating a point mutation in the GPLD1 gene, an unexpected formation of VNTRs with the 31 bp mutagenesis primers as the repeat unit is observed in the pcDNA3.1-GPLD1 plasmid. The 31 bp VNTRs are formed in 24.75% of the resulting clones, with copy number varying from 2 to 13. All repeat units are aligned in the same orientation as the GPLD1 gene. 43.54% of the repeat junctions harbor nucleotide mutations, while the rest do not. It is demonstrated that short primers spanning the 3' part of the mutagenesis primers are essential for the initial creation of the 2-copy tandem repeats (TRs) in circular plasmids. Dimerization of the mutagenesis primers by alternative end-joining in the correct orientation is required for further expansion of the 2-copy TRs. Lastly, a half-double priming strategy is established, which verifies the findings and offers a simple method for creating VNTRs in coding genes in circular plasmids without junction mutations.
Affiliation(s)
- Ziqi Hu
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
- Guochao Lin
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
- Mingzhu Zhang
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
- Shengwen Piao
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
- Jiankun Fan
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
- Jichao Liu
  - The Second Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
- Peng Liu
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
- Songbin Fu
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
  - Key Laboratory of Preservation of Human Genetic Resources and Disease Control in China, Harbin Medical University, Ministry of Education, China
- Wenjing Sun
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
  - Key Laboratory of Preservation of Human Genetic Resources and Disease Control in China, Harbin Medical University, Ministry of Education, China
- Li Li
  - The Second Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
- Xiaohong Qiu
  - The Second Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
- Jinwei Zhang
  - The Second Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
- Yu Yang
  - The Second Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
- Chunshui Zhou
  - The Laboratory of Medical Genetics, Harbin Medical University, Harbin, 150081, China
  - The Second Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
  - Key Laboratory of Preservation of Human Genetic Resources and Disease Control in China, Harbin Medical University, Ministry of Education, China
8. Uguen K, Michaud JL, Génin E. Short Tandem Repeats in the era of next-generation sequencing: from historical loci to population databases. Eur J Hum Genet 2024; 32:1037-1044. [PMID: 38982300 PMCID: PMC11369099 DOI: 10.1038/s41431-024-01666-z] [Received: 02/22/2024] [Revised: 06/20/2024] [Accepted: 06/27/2024] [Indexed: 07/11/2024]
Abstract
In this study, we explore the landscape of short tandem repeats (STRs) within the human genome through the lens of evolving technologies to detect genomic variations. STRs, which encompass approximately 3% of our genomic DNA, are crucial for understanding human genetic diversity, disease mechanisms, and evolutionary biology. The advent of high-throughput sequencing methods has revolutionized our ability to accurately map and analyze STRs, highlighting their significance in genetic disorders, forensic science, and population genetics. We review the current available methodologies for STR analysis, the challenges in interpreting STR variations across different populations, and the implications of STRs in medical genetics. Our findings underscore the urgent need for comprehensive STR databases that reflect the genetic diversity of global populations, facilitating the interpretation of STR data in clinical diagnostics, genetic research, and forensic applications. This work sets the stage for future studies aimed at harnessing STR variations to elucidate complex genetic traits and diseases, reinforcing the importance of integrating STRs into genetic research and clinical practice.
Affiliation(s)
- Kevin Uguen
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France
  - Service de Génétique Médicale et Biologie de la Reproduction, CHU de Brest, Brest, France
  - CHU Sainte-Justine Azrieli Research Centre, Montréal, QC, Canada
- Jacques L Michaud
  - CHU Sainte-Justine Azrieli Research Centre, Montréal, QC, Canada
  - Department of Pediatrics, Université de Montréal, Montréal, QC, Canada
  - Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
9. Ivancevic A, Simpson DM, Joyner OM, Bagby SM, Nguyen LL, Bitler BG, Pitts TM, Chuong EB. Endogenous retroviruses mediate transcriptional rewiring in response to oncogenic signaling in colorectal cancer. Sci Adv 2024; 10:eado1218. [PMID: 39018396 PMCID: PMC466953 DOI: 10.1126/sciadv.ado1218] [Received: 01/17/2024] [Accepted: 06/13/2024] [Indexed: 07/19/2024]
Abstract
Cancer cells exhibit rewired transcriptional regulatory networks that promote tumor growth and survival. However, the mechanisms underlying the formation of these pathological networks remain poorly understood. Through a pan-cancer epigenomic analysis, we found that primate-specific endogenous retroviruses (ERVs) are a rich source of enhancers displaying cancer-specific activity. In colorectal cancer and other epithelial tumors, oncogenic MAPK/AP1 signaling drives the activation of enhancers derived from the primate-specific ERV family LTR10. Functional studies in colorectal cancer cells revealed that LTR10 elements regulate tumor-specific expression of multiple genes associated with tumorigenesis, such as ATG12 and XRCC4. Within the human population, individual LTR10 elements exhibit germline and somatic structural variation resulting from a highly mutable internal tandem repeat region, which affects AP1 binding activity. Our findings reveal that ERV-derived enhancers contribute to transcriptional dysregulation in response to oncogenic signaling and shape the evolution of cancer-specific regulatory networks.
Affiliation(s)
- Atma Ivancevic
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
- David M. Simpson
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
- Olivia M. Joyner
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
- Stacey M. Bagby
  - Division of Medical Oncology, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Lily L. Nguyen
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
  - Division of Reproductive Sciences, Department of Obstetrics and Gynecology, University of Colorado School of Medicine, Aurora, CO, USA
- Ben G. Bitler
  - Division of Reproductive Sciences, Department of Obstetrics and Gynecology, University of Colorado School of Medicine, Aurora, CO, USA
- Todd M. Pitts
  - Division of Medical Oncology, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Edward B. Chuong
  - BioFrontiers Institute and Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, USA
10. Plavskin Y, de Biase MS, Ziv N, Janská L, Zhu YO, Hall DW, Schwarz RF, Tranchina D, Siegal ML. Spontaneous single-nucleotide substitutions and microsatellite mutations have distinct distributions of fitness effects. PLoS Biol 2024; 22:e3002698. [PMID: 38950062 PMCID: PMC11244821 DOI: 10.1371/journal.pbio.3002698] [Received: 07/04/2023] [Revised: 07/12/2024] [Accepted: 06/04/2024] [Indexed: 07/03/2024]
Abstract
The fitness effects of new mutations determine key properties of evolutionary processes. Beneficial mutations drive evolution, yet selection is also shaped by the frequency of small-effect deleterious mutations, whose combined effect can burden otherwise adaptive lineages and alter evolutionary trajectories and outcomes in clonally evolving organisms such as viruses, microbes, and tumors. The small effect sizes of these important mutations have made accurate measurements of their rates difficult. In microbes, assessing the effect of mutations on growth can be especially instructive, as this complex phenotype is closely linked to fitness in clonally evolving organisms. Here, we perform high-throughput time-lapse microscopy on cells from mutation-accumulation strains to precisely infer the distribution of mutational effects on growth rate in the budding yeast, Saccharomyces cerevisiae. We show that mutational effects on growth rate are overwhelmingly negative, highly skewed towards very small effect sizes, and frequent enough to suggest that deleterious hitchhikers may impose a significant burden on evolving lineages. By using lines that accumulated mutations in either wild-type or slippage repair-defective backgrounds, we further disentangle the effects of 2 common types of mutations, single-nucleotide substitutions and simple sequence repeat indels, and show that they have distinct effects on yeast growth rate. Although the average effect of a simple sequence repeat mutation is very small (approximately 0.3%), many do alter growth rate, implying that this class of frequent mutations has an important evolutionary impact.
Affiliation(s)
- Yevgeniy Plavskin
- Center for Genomics and Systems Biology, New York University, New York, New York, United States of America
- Department of Biology, New York University, New York, New York, United States of America
| | - Maria Stella de Biase
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Humboldt-Universität zu Berlin, Department of Biology, Berlin, Germany
| | - Naomi Ziv
- Center for Genomics and Systems Biology, New York University, New York, New York, United States of America
- Department of Biology, New York University, New York, New York, United States of America
| | - Libuše Janská
- Center for Genomics and Systems Biology, New York University, New York, New York, United States of America
- Department of Biology, New York University, New York, New York, United States of America
| | - Yuan O. Zhu
- Department of Genetics, Stanford University, Stanford, California, United States of America
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - David W. Hall
- Department of Genetics, University of Georgia, Athens, Georgia, United States of America
| | - Roland F. Schwarz
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Institute for Computational Cancer Biology, Center for Integrated Oncology (CIO), Cancer Research Center Cologne Essen (CCCE), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Berlin Institute for the Foundations of Learning and Data (BIFOLD), Berlin, Germany
| | - Daniel Tranchina
- Department of Biology, New York University, New York, New York, United States of America
- Courant Math Institute, New York University, New York, New York, United States of America
| | - Mark L. Siegal
- Center for Genomics and Systems Biology, New York University, New York, New York, United States of America
- Department of Biology, New York University, New York, New York, United States of America
|
11
|
van der Sanden B, Neveling K, Pang AWC, Shukor S, Gallagher MD, Burke SL, Kamsteeg EJ, Hastie A, Hoischen A. Optical Genome Mapping for Applications in Repeat Expansion Disorders. Curr Protoc 2024; 4:e1094. [PMID: 38966883 DOI: 10.1002/cpz1.1094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/06/2024]
Abstract
Short tandem repeat (STR) expansions are associated with more than 60 genetic disorders. The size and stability of these expansions correlate with the severity and age of onset of the disease. Therefore, being able to accurately detect the absolute length of STRs is important. Current diagnostic assays include laborious lab experiments, including repeat-primed PCR and Southern blotting, that still cannot precisely determine the exact length of very long repeat expansions. Optical genome mapping (OGM) is a cost-effective and easy-to-use alternative to traditional cytogenetic techniques and allows the comprehensive detection of chromosomal aberrations and structural variants >500 bp in length, including insertions, deletions, duplications, inversions, translocations, and copy number variants. Here, we provide methodological guidance for preparing samples and performing OGM as well as running the analysis pipelines and using the specific repeat expansion workflows to determine the exact repeat length of repeat expansions expanded beyond 500 bp. Together these protocols provide all details needed to analyze the length and stability of any repeat expansion with an expected repeat size difference from the expected wild-type allele of >500 bp. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC.
- Basic Protocol 1: Genomic ultra-high-molecular-weight DNA isolation, labeling, and staining
- Basic Protocol 2: Data generation and genome mapping using the Bionano Saphyr® System
- Basic Protocol 3: Manual De Novo Assembly workflow
- Basic Protocol 4: Local guided assembly workflow
- Basic Protocol 5: EnFocus Fragile X workflow
- Basic Protocol 6: Molecule distance script workflow
Affiliation(s)
- Bart van der Sanden
- Department of Human Genetics, Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Kornelia Neveling
- Department of Human Genetics, Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, the Netherlands
| | | | - Syukri Shukor
- Bionano Genomics Clinical and Scientific Affairs, San Diego, California
| | | | - Stephanie L Burke
- Bionano Genomics Clinical and Scientific Affairs, San Diego, California
| | - Erik-Jan Kamsteeg
- Department of Human Genetics, Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, the Netherlands
| | - Alex Hastie
- Bionano Genomics Clinical and Scientific Affairs, San Diego, California
| | - Alexander Hoischen
- Department of Human Genetics, Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, the Netherlands
- Department of Internal Medicine, Radboud Expertise Center for Immunodeficiency and Autoinflammation and Radboud Center for Infectious Disease (RCI), Radboud University Medical Center, Nijmegen, the Netherlands
|
12
|
Plavskin Y, de Biase MS, Ziv N, Janská L, Zhu YO, Hall DW, Schwarz RF, Tranchina D, Siegal ML. Spontaneous single-nucleotide substitutions and microsatellite mutations have distinct distributions of fitness effects. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.07.04.547687. [PMID: 37461506 PMCID: PMC10349969 DOI: 10.1101/2023.07.04.547687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2023]
Abstract
The fitness effects of new mutations determine key properties of evolutionary processes. Beneficial mutations drive evolution, yet selection is also shaped by the frequency of small-effect deleterious mutations, whose combined effect can burden otherwise adaptive lineages and alter evolutionary trajectories and outcomes in clonally evolving organisms such as viruses, microbes, and tumors. The small effect sizes of these important mutations have made accurate measurements of their rates difficult. In microbes, assessing the effect of mutations on growth can be especially instructive, as this complex phenotype is closely linked to fitness in clonally evolving organisms. Here, we perform high-throughput time-lapse microscopy on cells from mutation-accumulation strains to precisely infer the distribution of mutational effects on growth rate in the budding yeast, Saccharomyces cerevisiae. We show that mutational effects on growth rate are overwhelmingly negative, highly skewed towards very small effect sizes, and frequent enough to suggest that deleterious hitchhikers may impose a significant burden on evolving lineages. By using lines that accumulated mutations in either wild-type or slippage repair-defective backgrounds, we further disentangle the effects of two common types of mutations, single-nucleotide substitutions and simple sequence repeat indels, and show that they have distinct effects on yeast growth rate. Although the average effect of a simple sequence repeat mutation is very small (~0.3%), many do alter growth rate, implying that this class of frequent mutations has an important evolutionary impact.
|
13
|
Hiatt L, Weisburd B, Dolzhenko E, VanNoy GE, Kurtas EN, Rehm HL, Quinlan A, Dashnow H. STRchive: a dynamic resource detailing population-level and locus-specific insights at tandem repeat disease loci. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.05.21.24307682. [PMID: 38826469 PMCID: PMC11142282 DOI: 10.1101/2024.05.21.24307682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Approximately 3% of the human genome consists of repetitive elements called tandem repeats (TRs), which include short tandem repeats (STRs) of 1-6bp motifs and variable number tandem repeats (VNTRs) of 7+bp motifs. TR variants contribute to several dozen mono- and polygenic diseases but remain understudied and "enigmatic," particularly relative to single nucleotide variants. It remains comparatively challenging to interpret the clinical significance of TR variants. Although existing resources provide portions of necessary data for interpretation at disease-associated loci, it is currently difficult or impossible to efficiently invoke the additional details critical to proper interpretation, such as motif pathogenicity, disease penetrance, and age of onset distributions. It is also often unclear how to apply population information to analyses. We present STRchive (S-T-archive, http://strchive.org/), a dynamic resource consolidating information on TR disease loci in humans from research literature, up-to-date clinical resources, and large-scale genomic databases, with the goal of streamlining TR variant interpretation at disease-associated loci. We apply STRchive, including pathogenic thresholds, motif classification, and clinical phenotypes, to a gnomAD cohort of ∼18.5k individuals genotyped at 60 disease-associated loci. Through detailed literature curation, we demonstrate that the majority of TR diseases affect children despite being thought of as adult diseases. Additionally, we show that pathogenic genotypes can be found within gnomAD which do not necessarily overlap with known disease prevalence, and leverage STRchive to interpret locus-specific findings therein. We apply a diagnostic blueprint empowered by STRchive to relevant clinical vignettes, highlighting possible pitfalls in TR variant interpretation. As a living resource, STRchive is maintained by experts, takes community contributions, and will evolve as understanding of TR diseases progresses.
|
14
|
Ohlei O, Paul K, Nielsen SS, Gmelin D, Dobricic V, Altmann V, Schilling M, Bronstein JM, Franke A, Wittig M, Parkkinen L, Hansen J, Checkoway H, Ritz B, Bertram L, Lill CM. Genome-wide meta-analysis of short-tandem repeats for Parkinson's disease risk using genotype imputation. Brain Commun 2024; 6:fcae146. [PMID: 38863574 PMCID: PMC11166220 DOI: 10.1093/braincomms/fcae146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/20/2024] [Accepted: 02/26/2024] [Indexed: 06/13/2024] Open
Abstract
Idiopathic Parkinson's disease is determined by a combination of genetic and environmental factors. Recently, the first genome-wide association study on short-tandem repeats in Parkinson's disease reported on eight suggestive short-tandem repeat-based risk loci (α = 5.3 × 10⁻⁶), of which four were novel, i.e. they had not been implicated in Parkinson's disease risk by genome-wide association analyses of single-nucleotide polymorphisms before. Here, we tested these eight candidate short-tandem repeats in a large, independent Parkinson's disease case-control dataset (n = 4757). Furthermore, we combined the results from both studies by meta-analysis resulting in the largest Parkinson's disease genome-wide association study of short-tandem repeats to date (n = 43 844). Lastly, we investigated whether leading short-tandem repeat risk variants exert functional effects on gene expression regulation based on methylation quantitative trait locus data in human 'post-mortem' brain (n = 142). None of the eight previously reported short-tandem repeats were significantly associated with Parkinson's disease in our independent dataset after multiple testing correction (α = 6.25 × 10⁻³). However, we observed modest support for short-tandem repeats near CCAR2 and NCOR1 in the updated meta-analyses of all available data. While the genome-wide meta-analysis did not reveal additional study-wide significant (α = 6.3 × 10⁻⁷) short-tandem repeat signals, we identified seven novel suggestive Parkinson's disease short-tandem repeat risk loci (α = 5.3 × 10⁻⁶). Of these, especially a short-tandem repeat near MEIOSIN showed consistent evidence for association across datasets. CCAR2, NCOR1 and one novel suggestive locus identified here (LINC01012) emerged from colocalization analyses showing evidence for a shared causal short-tandem repeat variant affecting both Parkinson's disease risk and cis DNA methylation in brain. Larger studies, ideally using short-tandem repeats called from whole-sequencing data, are needed to more fully investigate their role in Parkinson's disease.
Affiliation(s)
- Olena Ohlei
- Lübeck Interdisciplinary Platform for Genome Analytics (LIGA), University of Lübeck, 23562 Lübeck, Germany
| | - Kimberly Paul
- Department of Neurology, UCLA David Geffen School of Medicine, Los Angeles, CA 90095, USA
| | - Susan Searles Nielsen
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - David Gmelin
- Lübeck Interdisciplinary Platform for Genome Analytics (LIGA), University of Lübeck, 23562 Lübeck, Germany
| | - Valerija Dobricic
- Lübeck Interdisciplinary Platform for Genome Analytics (LIGA), University of Lübeck, 23562 Lübeck, Germany
| | - Vivian Altmann
- Lübeck Interdisciplinary Platform for Genome Analytics (LIGA), University of Lübeck, 23562 Lübeck, Germany
- Forensic Genetics Division, Instituto-Geral de Perícias do Rio Grande do Sul, Porto Alegre, RS 90230-010, Brazil
| | - Marcel Schilling
- Lübeck Interdisciplinary Platform for Genome Analytics (LIGA), University of Lübeck, 23562 Lübeck, Germany
- Gene Regulation of Cell Identity, Regenerative Medicine Program, Bellvitge Institute for Biomedical Research (IDIBELL), L’Hospitalet del Llobregat, Barcelona 0890x, Spain
- Department of Epidemiology, University of California Los Angeles (UCLA), Fielding School of Public Health, Los Angeles, CA 90095, USA
| | - Jeff M Bronstein
- Department of Neurology, UCLA David Geffen School of Medicine, Los Angeles, CA 90095, USA
- Brain Research Institute, University of California Los Angeles (UCLA), Los Angeles, CA 90095, USA
| | - Andre Franke
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, 24105 Kiel, Germany
| | - Michael Wittig
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, 24105 Kiel, Germany
| | - Laura Parkkinen
- Nuffield Department of Clinical Neurosciences, Oxford Parkinson’s Disease Centre, University of Oxford, Oxford OX1 3PT, UK
| | - Johnni Hansen
- Danish Cancer Institute, Danish Cancer Society, 2100 Copenhagen, Denmark
| | - Harvey Checkoway
- Herbert Wertheim School of Public Health, University of California San Diego, La Jolla, CA 92093, USA
| | - Beate Ritz
- Department of Neurology, UCLA David Geffen School of Medicine, Los Angeles, CA 90095, USA
- Brain Research Institute, University of California Los Angeles (UCLA), Los Angeles, CA 90095, USA
| | - Lars Bertram
- Lübeck Interdisciplinary Platform for Genome Analytics (LIGA), University of Lübeck, 23562 Lübeck, Germany
| | - Christina M Lill
- Institute of Epidemiology and Social Medicine, University of Münster, 48149 Münster, Germany
- School of Public Health, Imperial College, Ageing Epidemiology Research Unit, London SW71, UK
|
15
|
Oketch JW, Wain LV, Hollox EJ. A comparison of software for analysis of rare and common short tandem repeat (STR) variation using human genome sequences from clinical and population-based samples. PLoS One 2024; 19:e0300545. [PMID: 38558075 PMCID: PMC10984476 DOI: 10.1371/journal.pone.0300545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 02/27/2024] [Indexed: 04/04/2024] Open
Abstract
Short tandem repeat (STR) variation is an often overlooked source of variation between genomes. STRs comprise about 3% of the human genome and are highly polymorphic. Some cause Mendelian disease, and others affect gene expression. Their contribution to common disease is not well-understood, but recent software tools designed to genotype STRs using short read sequencing data will help address this. Here, we compare software that genotypes common STRs and rarer STR expansions genome-wide, with the aim of applying them to population-scale genomes. By using the Genome-In-A-Bottle (GIAB) consortium and 1000 Genomes Project short-read sequencing data, we compare performance in terms of sequence length, depth, computing resources needed, genotyping accuracy and number of STRs genotyped. To ensure broad applicability of our findings, we also measure genotyping performance against a set of genomes from clinical samples with known STR expansions, and a set of STRs commonly used for forensic identification. We find that HipSTR, ExpansionHunter and GangSTR perform well in genotyping common STRs, including the CODIS 13 core STRs used for forensic analysis. GangSTR and ExpansionHunter outperform HipSTR for genotyping call rate and memory usage. ExpansionHunter denovo (EHdn), STRling and GangSTR outperformed STRetch for detecting expanded STRs, and EHdn and STRling used considerably less processor time compared to GangSTR. Analysis on shared genomic sequence data provided by the GIAB consortium allows future performance comparisons of new software approaches on a common set of data, facilitating comparisons and allowing researchers to choose the best software that fulfils their needs.
Affiliation(s)
- John W. Oketch
- Department of Genetics and Genome Biology, University of Leicester, Leicester, United Kingdom
| | - Louise V. Wain
- Department of Population Health Sciences, University of Leicester, Leicester, United Kingdom
- National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, United Kingdom
| | - Edward J. Hollox
- Department of Genetics and Genome Biology, University of Leicester, Leicester, United Kingdom
|
16
|
Berumen J, Orozco L, Gallardo-Rincón H, Juárez-Torres E, Barrera E, Cruz-López M, Benuto RE, Ramos-Martinez E, Marin-Madina M, Alvarado-Silva A, Valladares-Salgado A, Peralta-Romero JJ, García-Ortiz H, Martinez-Juarez LA, Montoya A, Alvarez-Hernández DA, Alegre-Diaz J, Kuri-Morales P, Tapia-Conyer R. Association of tyrosine hydroxylase 01 (TH01) microsatellite and insulin gene (INS) variable number of tandem repeat (VNTR) with type 2 diabetes and fasting insulin secretion in Mexican population. J Endocrinol Invest 2024; 47:571-583. [PMID: 37624484 PMCID: PMC10904573 DOI: 10.1007/s40618-023-02175-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Accepted: 08/05/2023] [Indexed: 08/26/2023]
Abstract
PURPOSE: A variable number of tandem repeats (VNTR) in the insulin gene (INS) control region may be involved in type 2 diabetes (T2D). The TH01 microsatellite is near INS and may regulate it. We investigated whether the TH01 microsatellite and INS VNTR, assessed via the surrogate marker single nucleotide polymorphism rs689, are associated with T2D and serum insulin levels in a Mexican population.
METHODS: We analyzed a main case-control study (n = 1986) that used univariate and multivariate logistic regression models to calculate the risk conferred by TH01 and rs689 loci for T2D development; rs689 results were replicated in other case-control (n = 1188) and cross-sectional (n = 1914) studies.
RESULTS: TH01 alleles 6, 8, 9, and 9.3 and allele A of rs689 were independently associated with T2D, with differences between sex and age at diagnosis. TH01 alleles with ≥ 8 repeats conferred an increased risk for T2D in males compared with ≤ 7 repeats (odds ratio, ≥ 1.46; 95% confidence interval, 1.1-1.95). In females, larger alleles conferred a 1.5-fold higher risk for T2D when diagnosed ≥ 46 years but conferred protection when diagnosed ≤ 45 years. Similarly, rs689 allele A was associated with T2D in these groups. In males, larger TH01 alleles and the rs689 A allele were associated with a significant decrease in median fasting plasma insulin concentration with age in T2D cases; the reverse occurred in controls.
CONCLUSION: Larger TH01 alleles and rs689 A allele may potentiate insulin synthesis in males without T2D, a process disabled in those with T2D.
Affiliation(s)
- J Berumen
- Facultad de Medicina, Unidad de Investigación en Medicina Experimental, Universidad Nacional Autónoma de México, 06720, Mexico City, México.
| | - L Orozco
- Laboratorio de Inmunogenómica y Enfermedades Metabólicas, Instituto Nacional de Medicina Genómica, Secretaria de Salud, 14610, Mexico City, México
| | - H Gallardo-Rincón
- Departamento de Soluciones Operativas, Fundación Carlos Slim, 11529, Mexico City, Mexico.
- Centro Universitario de Ciencias de la Salud, Universidad de Guadalajara, Sierra Mojada 950, 44340, Guadalajara, Jalisco, México.
| | - E Juárez-Torres
- Laboratorio Huella Génica, Unidad de Diabetes, 06600, Mexico City, Mexico
| | - E Barrera
- Facultad de Medicina, Unidad de Investigación en Medicina Experimental, Universidad Nacional Autónoma de México, 06720, Mexico City, México
| | - M Cruz-López
- Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, 06720, Mexico City, México
| | - R E Benuto
- Laboratorio Huella Génica, Unidad de Diabetes, 06600, Mexico City, Mexico
| | - E Ramos-Martinez
- Facultad de Medicina, Unidad de Investigación en Medicina Experimental, Universidad Nacional Autónoma de México, 06720, Mexico City, México
| | - M Marin-Madina
- Laboratorio Huella Génica, Unidad de Diabetes, 06600, Mexico City, Mexico
| | - A Alvarado-Silva
- Laboratorio Huella Génica, Unidad de Diabetes, 06600, Mexico City, Mexico
| | - A Valladares-Salgado
- Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, 06720, Mexico City, México
| | - J J Peralta-Romero
- Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, 06720, Mexico City, México
| | - H García-Ortiz
- Laboratorio de Inmunogenómica y Enfermedades Metabólicas, Instituto Nacional de Medicina Genómica, Secretaria de Salud, 14610, Mexico City, México
| | - L A Martinez-Juarez
- Departamento de Soluciones Operativas, Fundación Carlos Slim, 11529, Mexico City, Mexico
- Center for Humanitarian Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - A Montoya
- Departamento de Soluciones Operativas, Fundación Carlos Slim, 11529, Mexico City, Mexico
| | - D A Alvarez-Hernández
- Departamento de Soluciones Operativas, Fundación Carlos Slim, 11529, Mexico City, Mexico
| | - J Alegre-Diaz
- Facultad de Medicina, Unidad de Investigación en Medicina Experimental, Universidad Nacional Autónoma de México, 06720, Mexico City, México
| | - P Kuri-Morales
- Proyecto OriGen, Instituto Tecnologico y de Estudios Superiores de Monterrey, Monterrey, México
| | - R Tapia-Conyer
- Facultad de Medicina, Universidad Nacional Autónoma de México, Coyoacán, 04510, Mexico City, México
|
17
|
Anderson R, Das MR, Chang Y, Farenhem K, Schmitz CO, Jain A. CAG repeat expansions create splicing acceptor sites and produce aberrant repeat-containing RNAs. Mol Cell 2024; 84:702-714.e10. [PMID: 38295802 PMCID: PMC10923110 DOI: 10.1016/j.molcel.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 11/07/2023] [Accepted: 01/08/2024] [Indexed: 02/04/2024]
Abstract
Expansions of CAG trinucleotide repeats cause several rare neurodegenerative diseases. The disease-causing repeats are translated in multiple reading frames and without an identifiable initiation codon. The molecular mechanism of this repeat-associated non-AUG (RAN) translation is not known. We find that expanded CAG repeats create new splice acceptor sites. Splicing of proximal donors to the repeats produces unexpected repeat-containing transcripts. Upon splicing, depending on the sequences surrounding the donor, CAG repeats may become embedded in AUG-initiated open reading frames. Canonical AUG-initiated translation of these aberrant RNAs may account for proteins that have been attributed to RAN translation. Disruption of the relevant splice donors or the in-frame AUG initiation codons is sufficient to abrogate RAN translation. Our findings provide a molecular explanation for the abnormal translation products observed in CAG trinucleotide repeat expansion disorders and add to the repertoire of mechanisms by which repeat expansion mutations disrupt cellular functions.
Affiliation(s)
- Rachel Anderson
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA
| | - Michael R Das
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA
| | - Yeonji Chang
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
| | - Kelsey Farenhem
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA
| | - Cameron O Schmitz
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA
| | - Ankur Jain
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA.
|
18
|
Bayat H, Mirahmadi M, Azarshin Z, Ohadi H, Delbari A, Ohadi M. CRISPR/Cas9-mediated deletion of a GA-repeat in human GPM6B leads to disruption of neural cell differentiation from NT2 cells. Sci Rep 2024; 14:2136. [PMID: 38273037 PMCID: PMC10810867 DOI: 10.1038/s41598-024-52675-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Accepted: 01/22/2024] [Indexed: 01/27/2024] Open
Abstract
The human neuron-specific gene, GPM6B (Glycoprotein membrane 6B), is considered a key gene in neural cell functionality. This gene contains an exceptionally long and strictly monomorphic short tandem repeat (STR) of 9-repeats, (GA)9. STRs in regulatory regions may affect the expression of nearby genes. We used a CRISPR-based tool to delete this GA-repeat in NT2 cells and analyzed the consequence of this deletion on GPM6B expression. Subsequently, the edited cells were induced to differentiate into neural cells using retinoic acid (RA) treatment. Deletion of the GA-repeat significantly decreased the expression of GPM6B at the RNA (p < 0.05) and protein (40%) levels. Compared to the control cells, the edited cells showed a dramatic decrease in the astrocyte and neural cell markers, including GFAP (0.77-fold), TUBB3 (0.57-fold), and MAP2 (0.2-fold). Subsequent sorting of the edited cells showed an increased number of NES (p < 0.01), but a decreased number of GFAP (p < 0.001), TUBB3 (p < 0.05), and MAP2 (p < 0.01), compared to the control cells. In conclusion, CRISPR/Cas9-mediated deletion of a GA-repeat in human GPM6B led to decreased expression of this gene, which in turn disrupted differentiation of NT2 cells into neural cells.
Affiliation(s)
- Hadi Bayat
- Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Postal Code: 1985713834, Iran
- Department of Molecular Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Postal Box: 331-14115, Tehran, Iran
| | - Maryam Mirahmadi
- Department of Medical Genetics, Faculty of Medical Sciences, Tarbiat Modares University, Postal Box: 331-14115, Tehran, Iran
- Department of Exomine, PardisGene Company, Tehran, Postal Code: 1917635816, Iran
| | - Zohreh Azarshin
- Department of Molecular Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Postal Box: 331-14115, Tehran, Iran
| | - Hamid Ohadi
- School of Physics and Astronomy, University of St Andrews, St Andrews, KY16 9SS, UK
| | - Ahmad Delbari
- Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Postal Code: 1985713834, Iran
| | - Mina Ohadi
- Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Postal Code: 1985713834, Iran.
|
19
|
Hong EP, Ramos EM, Aziz NA, Massey TH, McAllister B, Lobanov S, Jones L, Holmans P, Kwak S, Orth M, Ciosi M, Lomeikaite V, Monckton DG, Long JD, Lucente D, Wheeler VC, Gillis T, MacDonald ME, Sequeiros J, Gusella JF, Lee JM. Modification of Huntington's disease by short tandem repeats. Brain Commun 2024; 6:fcae016. [PMID: 38449714 PMCID: PMC10917446 DOI: 10.1093/braincomms/fcae016] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/26/2023] [Revised: 12/20/2023] [Accepted: 01/22/2024] [Indexed: 03/08/2024] Open
Abstract
Expansions of glutamine-coding CAG trinucleotide repeats cause a number of neurodegenerative diseases, including Huntington's disease and several spinocerebellar ataxias. In general, age-at-onset of the polyglutamine diseases is inversely correlated with the size of the respective inherited expanded CAG repeat. Expanded CAG repeats are also somatically unstable in certain tissues, and age-at-onset of Huntington's disease corrected for individual HTT CAG repeat length (i.e. residual age-at-onset) is modified by repeat instability-related DNA maintenance/repair genes, as demonstrated by recent genome-wide association studies. Modification of one polyglutamine disease (e.g. Huntington's disease) by the repeat length of another (e.g. ATXN3, CAG expansions in which cause spinocerebellar ataxia 3) has also been hypothesized. Consequently, we determined whether age-at-onset in Huntington's disease is modified by the CAG repeats of other polyglutamine disease genes. We found that the measured CAG repeat sizes of other polyglutamine disease genes were polymorphic in Huntington's disease participants but did not influence Huntington's disease age-at-onset. Additional analysis focusing specifically on ATXN3 in a larger sample set (n = 1388) confirmed the lack of association between Huntington's disease residual age-at-onset and ATXN3 CAG repeat length. Additionally, neither our Huntington's disease onset modifier genome-wide association studies single nucleotide polymorphism data nor imputed short tandem repeat data supported the involvement of other polyglutamine disease genes in modifying Huntington's disease. By contrast, our genome-wide association studies based on imputed short tandem repeats revealed significant modification signals for other genomic regions. Together, our short tandem repeat genome-wide association studies show that modification of Huntington's disease is associated with short tandem repeats that do not involve other polyglutamine disease-causing genes, refining the landscape of Huntington's disease modification and highlighting the importance of rigorous data analysis, especially in genetic studies testing candidate modifiers.
Affiliation(s)
- Eun Pyo Hong
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
| | - Eliana Marisa Ramos
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
| | - N Ahmad Aziz
- Population & Clinical Neuroepidemiology, German Center for Neurodegenerative Diseases, 53127 Bonn, Germany
- Department of Neurology, Faculty of Medicine, University of Bonn, Bonn D-53113, Germany
| | - Thomas H Massey
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Branduff McAllister
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Sergey Lobanov
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Lesley Jones
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Peter Holmans
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Seung Kwak
- Molecular System Biology, CHDI Foundation, Princeton, NJ 08540, USA
| | - Michael Orth
- University Hospital of Old Age Psychiatry and Psychotherapy, Bern University, CH-3000 Bern 60, Switzerland
| | - Marc Ciosi
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Vilija Lomeikaite
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Darren G Monckton
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Jeffrey D Long
- Department of Psychiatry, Carver College of Medicine and Department of Biostatistics, College of Public Health, University of Iowa, Iowa City, IA 52242, USA
| | - Diane Lucente
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Vanessa C Wheeler
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
| | - Tammy Gillis
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Marcy E MacDonald
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
| | - Jorge Sequeiros
- UnIGENe, IBMC—Institute for Molecular and Cell Biology, i3S—Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto 420-135, Portugal
- ICBAS School of Medicine and Biomedical Sciences, University of Porto, Porto 420-135, Portugal
| | - James F Gusella
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Jong-Min Lee
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
| |
|
20
|
Birnbaum R. Rediscovering tandem repeat variation in schizophrenia: challenges and opportunities. Transl Psychiatry 2023; 13:402. [PMID: 38123544 PMCID: PMC10733427 DOI: 10.1038/s41398-023-02689-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 11/23/2023] [Accepted: 11/27/2023] [Indexed: 12/23/2023] Open
Abstract
Tandem repeats (TRs) are prevalent throughout the genome, constituting at least 3% of it, and are often highly polymorphic. The high mutation rate of TRs, which can be orders of magnitude higher than that of single-nucleotide polymorphisms and indels, indicates that they are likely to make significant contributions to phenotypic variation, yet their contribution to schizophrenia has been largely ignored by recent genome-wide association studies (GWAS). Tandem repeat expansions are already known causative factors for over 50 disorders, while common tandem repeat variation is increasingly being identified as significantly associated with complex disease and gene regulation. The current review summarizes key background concepts of tandem repeat variation as it pertains to disease risk, elucidating their potential for schizophrenia association. An overview of next-generation sequencing-based methods that may be applied for TR genome-wide identification is provided, and some key methodological challenges in TR analyses are delineated.
Affiliation(s)
- Rebecca Birnbaum
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| |
|
21
|
Rajagopal S, Donaldson J, Flower M, Hensman Moss DJ, Tabrizi SJ. Genetic modifiers of repeat expansion disorders. Emerg Top Life Sci 2023; 7:325-337. [PMID: 37861103 PMCID: PMC10754329 DOI: 10.1042/etls20230015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 09/20/2023] [Accepted: 10/09/2023] [Indexed: 10/21/2023]
Abstract
Repeat expansion disorders (REDs) are monogenic diseases caused by a sequence of repetitive DNA expanding above a pathogenic threshold. A common feature of the REDs is a strong genotype-phenotype correlation in which a major determinant of age at onset (AAO) and disease progression is the length of the inherited repeat tract. Over a disease-gene carrier's life, the length of the repeat can expand in somatic cells, through the process of somatic expansion which is hypothesised to drive disease progression. Despite being monogenic, individual REDs are phenotypically variable, and exploring what genetic modifying factors drive this phenotypic variability has illuminated key pathogenic mechanisms that are common to this group of diseases. Disease phenotypes are affected by the cognate gene in which the expansion is found, the location of the repeat sequence in coding or non-coding regions and by the presence of repeat sequence interruptions. Human genetic data, mouse models and in vitro models have implicated the disease-modifying effect of DNA repair pathways via the mechanisms of somatic mutation of the repeat tract. As such, developing an understanding of these pathways in the context of expanded repeats could lead to future disease-modifying therapies for REDs.
Affiliation(s)
- Sangeerthana Rajagopal
- UCL Huntington's Disease Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, Queen Square, London WC1N 3BG, U.K
- UK Dementia Research Institute, University College London, London WC1N 3BG, U.K
| | - Jasmine Donaldson
- UCL Huntington's Disease Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, Queen Square, London WC1N 3BG, U.K
- UK Dementia Research Institute, University College London, London WC1N 3BG, U.K
| | - Michael Flower
- UCL Huntington's Disease Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, Queen Square, London WC1N 3BG, U.K
- UK Dementia Research Institute, University College London, London WC1N 3BG, U.K
| | - Davina J Hensman Moss
- UCL Huntington's Disease Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, Queen Square, London WC1N 3BG, U.K
- UK Dementia Research Institute, University College London, London WC1N 3BG, U.K
- St George's University of London, London SW17 0RE, U.K
| | - Sarah J Tabrizi
- UCL Huntington's Disease Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, Queen Square, London WC1N 3BG, U.K
- UK Dementia Research Institute, University College London, London WC1N 3BG, U.K
| |
|
22
|
Hannan AJ. Expanding horizons of tandem repeats in biology and medicine: Why 'genomic dark matter' matters. Emerg Top Life Sci 2023; 7:ETLS20230075. [PMID: 38088823 PMCID: PMC10754335 DOI: 10.1042/etls20230075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 12/30/2023]
Abstract
Approximately half of the human genome includes repetitive sequences, and these DNA sequences (as well as their transcribed repetitive RNA and translated amino-acid repeat sequences) are known as the repeatome. Within this repeatome there are a couple of million tandem repeats, dispersed throughout the genome. These tandem repeats have been estimated to constitute ∼8% of the entire human genome. These tandem repeats can be located throughout exons, introns and intergenic regions, thus potentially affecting the structure and function of tandemly repetitive DNA, RNA and protein sequences. Over more than three decades, more than 60 monogenic human disorders have been found to be caused by tandem-repeat mutations. These monogenic tandem-repeat disorders include Huntington's disease, a variety of ataxias, amyotrophic lateral sclerosis and frontotemporal dementia, as well as many other neurodegenerative diseases. Furthermore, tandem-repeat disorders can include fragile X syndrome, related fragile X disorders, as well as other neurological and psychiatric disorders. However, these monogenic tandem-repeat disorders, which were discovered via their dominant or recessive modes of inheritance, may represent the 'tip of the iceberg' with respect to tandem-repeat contributions to human disorders. A previous proposal that tandem repeats may contribute to the 'missing heritability' of various common polygenic human disorders has recently been supported by a variety of new evidence. This includes genome-wide studies that associate tandem-repeat mutations with autism, schizophrenia, Parkinson's disease and various types of cancers. In this article, I will discuss how tandem-repeat mutations and polymorphisms could contribute to a wide range of common disorders, along with some of the many major challenges of tandem-repeat biology and medicine. 
Finally, I will discuss the potential of tandem repeats to be therapeutically targeted, so as to prevent and treat an expanding range of human disorders.
Affiliation(s)
- Anthony J Hannan
- Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, Victoria 3010, Australia
- Department of Anatomy and Physiology, University of Melbourne, Parkville, Victoria 3010, Australia
| |
|
23
|
Bhati M, Mapel XM, Lloret-Villas A, Pausch H. Structural variants and short tandem repeats impact gene expression and splicing in bovine testis tissue. Genetics 2023; 225:iyad161. [PMID: 37655920 PMCID: PMC10627265 DOI: 10.1093/genetics/iyad161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 06/05/2023] [Accepted: 08/24/2023] [Indexed: 09/02/2023] Open
Abstract
Structural variants (SVs) and short tandem repeats (STRs) are significant sources of genetic variation. However, the impacts of these variants on gene regulation have not been investigated in cattle. Here, we genotyped and characterized 19,408 SVs and 374,821 STRs in 183 bovine genomes and investigated their impact on molecular phenotypes derived from testis transcriptomes. We found that 71% of STRs were multiallelic. The vast majority (95%) of STRs and SVs were in intergenic and intronic regions. Only 37% of SVs and 40% of STRs were in high linkage disequilibrium (LD) (R² > 0.8) with surrounding SNPs/insertions and deletions (Indels), indicating that SNP-based association testing and genomic prediction are blind to a nonnegligible portion of genetic variation. We showed that both SVs and STRs were more than 2-fold enriched among expression and splicing QTL (e/sQTL) relative to SNPs/Indels and were often associated with differential expression and splicing of multiple genes. Deletions and duplications had larger impacts on splicing and expression than any other type of SV. Exonic duplications predominantly increased gene expression either through alternative splicing or other mechanisms, whereas expression- and splicing-associated STRs primarily resided in intronic regions and exhibited bimodal effects on the molecular phenotypes investigated. Most e/sQTL resided within 100 kb of the affected genes or splicing junctions. We pinpoint candidate causal STRs and SVs associated with the expression of SLC13A4 and TTC7B and alternative splicing of a lncRNA and CAPP1. We provide a catalog of STRs and SVs for taurine cattle and show that these variants contribute substantially to gene expression and splicing variation.
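The linkage-disequilibrium criterion in this abstract (a squared correlation above 0.8 between an STR and surrounding SNPs/Indels) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each STR genotype is summarized as the sum of its two allele repeat lengths and each SNP as an alternate-allele dosage (0, 1 or 2). The function names, the toy data and this encoding are hypothetical and are not taken from the paper.

```python
from statistics import mean

def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic variant: correlation undefined, treated as untagged
    return sxy * sxy / (sxx * syy)

def max_r2_with_snps(str_dosage, snp_dosages):
    """Highest r^2 between one STR and any of the nearby SNPs."""
    return max((pearson_r2(str_dosage, s) for s in snp_dosages), default=0.0)

# Toy data for six animals (hypothetical values).
str_len_sum = [20, 22, 24, 20, 22, 24]   # sum of the two STR allele lengths
nearby_snps = [
    [0, 1, 2, 0, 1, 2],                  # tracks the STR perfectly
    [0, 0, 1, 2, 1, 0],                  # only loosely related
]

best = max_r2_with_snps(str_len_sum, nearby_snps)
tagged = best > 0.8                      # threshold quoted in the abstract
```

An STR for which `tagged` is false would be one of the variants that SNP-based association testing cannot see through LD.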
Affiliation(s)
- Meenu Bhati
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
| | - Xena Marie Mapel
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
| | | | - Hubert Pausch
- Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland
| |
|
24
|
Link V, Zavaleta YJA, Reyes RJ, Ding L, Wang J, Rohlfs RV, Edge MD. Microsatellites used in forensics are in regions enriched for trait-associated variants. iScience 2023; 26:107992. [PMID: 37841589 PMCID: PMC10570123 DOI: 10.1016/j.isci.2023.107992] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/08/2023] [Revised: 08/10/2023] [Accepted: 09/18/2023] [Indexed: 10/17/2023] Open
Abstract
The 20 short tandem repeat (STR) loci of the Combined DNA Index System (CODIS) are the basis of the vast majority of forensic genetics in the United States. One argument for permissive rules about the collection of CODIS genotypes is that the CODIS loci are thought to contain little information about ancestry or traits. However, in the past 20 years, a growing field has identified hundreds of thousands of genotype-trait associations. Here, we conduct a survey of the landscape of such associations surrounding the CODIS loci as compared with non-CODIS STRs. Although this study cannot establish or quantify associations between CODIS genotypes and phenotypes, we find that the regions around the CODIS loci are enriched for both known pathogenic variants (> 90th percentile) and for trait-associated SNPs identified in genome-wide association studies (GWAS) (≥ 95th percentile in 10 kb and 100 kb flanking regions), compared with other random sets of autosomal tetranucleotide-repeat STRs.
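The percentile statements in this abstract compare an observed set of loci against random sets of STRs. A minimal Python sketch of that kind of comparison follows. It assumes a hypothetical per-STR count of trait-associated SNPs in the flanking window; the function names, the background values and the observed total are invented for illustration and do not reproduce the authors' pipeline.

```python
import random

def empirical_percentile(observed, null_values):
    """Percentage of null values that are less than or equal to the observed value."""
    if not null_values:
        raise ValueError("null_values must not be empty")
    return 100.0 * sum(v <= observed for v in null_values) / len(null_values)

def null_distribution(background_counts, set_size, n_sets, seed=0):
    """Total count for each of n_sets random STR sets drawn without replacement."""
    rng = random.Random(seed)
    return [sum(rng.sample(background_counts, set_size)) for _ in range(n_sets)]

# Hypothetical background: flanking-SNP counts for 100 non-forensic STRs.
background = [0, 1, 2, 3, 4] * 20

# 1000 random sets of 20 loci, matching the size of the forensic panel.
null = null_distribution(background, set_size=20, n_sets=1000)

observed_total = 55  # hypothetical total for the loci of interest
pct = empirical_percentile(observed_total, null)
```

A value of `pct` at or above 95 would correspond to the "≥ 95th percentile" wording used for the GWAS comparison.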
Affiliation(s)
- Vivian Link
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | | | - Rochelle-Jan Reyes
- Department of Biology, San Francisco State University, San Francisco, CA, USA
| | - Linda Ding
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Judy Wang
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Rori V. Rohlfs
- Department of Biology, San Francisco State University, San Francisco, CA, USA
- Department of Data Science and Institute of Ecology and Evolution, University of Oregon, Eugene, OR, USA
| | - Michael D. Edge
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| |
|
25
|
Anderson R, Das M, Chang Y, Farenhem K, Jain A. CAG repeat expansions create splicing acceptor sites and produce aberrant repeat-containing RNAs. bioRxiv 2023:2023.10.16.562581. [PMID: 37904984 PMCID: PMC10614865 DOI: 10.1101/2023.10.16.562581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Expansions of CAG trinucleotide repeats cause several rare neurodegenerative diseases. The disease-causing repeats are translated in multiple reading frames, without an identifiable initiation codon. The molecular mechanism of this repeat-associated non-AUG (RAN) translation is not known. We find that expanded CAG repeats create new splice acceptor sites. Splicing of proximal donors to the repeats produces unexpected repeat-containing transcripts. Upon splicing, depending on the sequences surrounding the donor, CAG repeats may become embedded in AUG-initiated open reading frames. Canonical AUG-initiated translation of these aberrant RNAs accounts for proteins that are attributed to RAN translation. Disruption of the relevant splice donors or the in-frame AUG initiation codons is sufficient to abrogate RAN translation. Our findings provide a molecular explanation for the abnormal translation products observed in CAG trinucleotide repeat expansion disorders and add to the repertoire of mechanisms by which repeat expansion mutations disrupt cellular functions.
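The statement that CAG repeats are translated in multiple reading frames can be made concrete with a short Python sketch. It uses only the three codons that occur in a pure CAG tract under the standard genetic code (CAG = Gln, AGC = Ser, GCA = Ala). The function name and the five-unit tract are illustrative assumptions, and the sketch says nothing about how translation is initiated.

```python
# Codons that arise when a pure CAG tract is read in each of its three frames
# (standard genetic code, one-letter amino-acid symbols).
CODONS = {"CAG": "Q", "AGC": "S", "GCA": "A"}

def translate_frame(seq, frame):
    """Translate seq starting at offset `frame`, ignoring any incomplete final codon."""
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return "".join(CODONS[c] for c in codons)

repeat = "CAG" * 5  # hypothetical short tract of five repeat units

# One homopolymeric peptide per reading frame.
frames = {frame: translate_frame(repeat, frame) for frame in range(3)}
```

Reading the same tract in frames 0, 1 and 2 gives polyglutamine, polyserine and polyalanine respectively.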
Affiliation(s)
- Rachel Anderson
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
- Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA
| | - Michael Das
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
- Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA
| | - Yeonji Chang
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
| | - Kelsey Farenhem
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
- Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA
| | - Ankur Jain
- Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
- Department of Biology, Massachusetts Institute of Technology, 31 Ames Street, Cambridge, MA 02139, USA
| |
|
26
|
Wang N, Khan S, Elo LL. VarSCAT: A computational tool for sequence context annotations of genomic variants. PLoS Comput Biol 2023; 19:e1010727. [PMID: 37566612 PMCID: PMC10446208 DOI: 10.1371/journal.pcbi.1010727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 08/23/2023] [Accepted: 07/20/2023] [Indexed: 08/13/2023] Open
Abstract
The sequence contexts of genomic variants play important roles in understanding the biological significance of variants and potential sequencing-related variant calling issues. However, methods for assessing the diverse sequence contexts of genomic variants such as tandem repeats and unambiguous annotations have been limited. Herein, we describe the Variant Sequence Context Annotation Tool (VarSCAT) for annotating the sequence contexts of genomic variants, including breakpoint ambiguities, flanking bases of variants, wildtype/mutated DNA sequences, variant nomenclatures, distances between adjacent variants, tandem repeat regions, and custom annotation with user customizable options. Our analyses demonstrate that VarSCAT is more versatile and customizable than the currently available methods or strategies for annotating variants in short tandem repeat (STR) regions or insertions and deletions (indels) with breakpoint ambiguity. Variant sequence context annotations of high-confidence human variant sets with VarSCAT revealed that more than 75% of all human individual germline and clinically relevant indels have breakpoint ambiguities. Moreover, we illustrate that more than 80% of human individual germline small variants in STR regions are indels and that the sizes of these indels correlated with STR motif sizes. VarSCAT is available from https://github.com/elolab/VarSCAT.
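Breakpoint ambiguity, as mentioned in this abstract, arises when a deletion inside a repeat can be placed at several positions that all yield the same mutated sequence. The Python sketch below illustrates the idea for deletions only. It is not VarSCAT's implementation; the function name, the 0-based coordinates and the toy reference sequence are assumptions made for this example.

```python
def deletion_ambiguity(ref, start, length):
    """Return (leftmost, rightmost) 0-based start positions at which deleting
    `length` bases from `ref` gives the same sequence as deleting at `start`."""
    if length <= 0 or start < 0 or start + length > len(ref):
        raise ValueError("deletion must lie within the reference")
    left = start
    # Shifting left by one base is equivalent when the base entering the
    # deleted window equals the base leaving it.
    while left > 0 and ref[left - 1] == ref[left + length - 1]:
        left -= 1
    right = start
    # Same test in the other direction.
    while right + length < len(ref) and ref[right] == ref[right + length]:
        right += 1
    return left, right

ref = "GGCACACACATT"                       # hypothetical reference with a (CA)4 tract
span = deletion_ambiguity(ref, start=4, length=2)

# Every start position in the span yields the identical mutated sequence.
alleles = {ref[:s] + ref[s + 2:] for s in range(span[0], span[1] + 1)}
```

Left-aligning a call corresponds to reporting `span[0]`, and right-aligning to reporting `span[1]`; a deletion outside any repeat returns a span of width zero.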
Affiliation(s)
- Ning Wang
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- InFLAMES Research Flagship Center, University of Turku, Turku, Finland
| | - Sofia Khan
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
| | - Laura L. Elo
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- InFLAMES Research Flagship Center, University of Turku, Turku, Finland
- Institute of Biomedicine, University of Turku, Turku, Finland
| |
|
27
|
Liu Y, Li J, Wu Q. Short Tandem Repeats of Human Genome Are Intrinsically Unstable in Cultured Cells in vivo. Gene 2023:147539. [PMID: 37279866 DOI: 10.1016/j.gene.2023.147539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Revised: 05/18/2023] [Accepted: 06/02/2023] [Indexed: 06/08/2023]
Abstract
Short tandem repeats (STRs) are a class of abundant structural or functional elements in the human genome and exhibit a polymorphic nature of repeat length and genetic variation within human populations. Interestingly, STR expansions underlie about 60 neurological disorders. However, "stutter" artifacts or noise make it difficult to investigate the pathogenesis of STR expansions. Here, we systematically investigated STR instability in cultured human cells using GC-rich CAG and AT-rich ATTCT tandem repeats as examples. We found that triplicate bidirectional Sanger sequencing with PCR amplification under proper conditions can reliably assess STR length. In addition, we found that next-generation sequencing with paired-end reads bidirectionally covering STR regions can accurately and reliably assay STR length. Finally, we found that STRs are intrinsically unstable in cultured human cell populations and during single-cell cloning. Our data suggest a general method for accurately and reliably assessing STR length and have important implications for investigating the pathogenesis of STR expansion diseases.
Affiliation(s)
- Yuzhe Liu
- Center for Comparative Biomedicine, Ministry of Education Key Laboratory of Systems Biomedicine, State Key Laboratory of Oncogenes and Related Genes, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; WLA Laboratory, Shanghai 201203, China
| | - Jinhuan Li
- Center for Comparative Biomedicine, Ministry of Education Key Laboratory of Systems Biomedicine, State Key Laboratory of Oncogenes and Related Genes, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; WLA Laboratory, Shanghai 201203, China
| | - Qiang Wu
- Center for Comparative Biomedicine, Ministry of Education Key Laboratory of Systems Biomedicine, State Key Laboratory of Oncogenes and Related Genes, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; WLA Laboratory, Shanghai 201203, China.
| |
|
28
|
Shi Y, Niu Y, Zhang P, Luo H, Liu S, Zhang S, Wang J, Li Y, Liu X, Song T, Xu T, He S. Characterization of genome-wide STR variation in 6487 human genomes. Nat Commun 2023; 14:2092. [PMID: 37045857 PMCID: PMC10097659 DOI: 10.1038/s41467-023-37690-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/04/2022] [Accepted: 03/27/2023] [Indexed: 04/14/2023] Open
Abstract
Short tandem repeats (STRs) are abundant and highly mutable in the human genome. Many STR loci have been associated with a range of human genetic disorders. However, most population-scale studies on STR variation in humans have focused on European ancestry cohorts or are limited by sequencing depth. Here, we depicted a comprehensive map of 366,013 polymorphic STRs (pSTRs) constructed from 6487 deeply sequenced genomes, comprising 3983 Chinese samples (~31.5x, NyuWa) and 2504 samples from the 1000 Genomes Project (~33.3x, 1KGP). We found that STR mutations were affected by motif length, chromosome context and epigenetic features. We identified 3273 and 1117 pSTRs whose repeat numbers were associated with gene expression and 3'UTR alternative polyadenylation, respectively. We also implemented population analysis, investigated population differentiated signatures, and genotyped 60 known disease-causing STRs. Overall, this study further extends the scale of STR variation in humans and propels our understanding of the semantics of STRs.
Affiliation(s)
- Yirong Shi
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yiwei Niu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Peng Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Huaxia Luo
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Shuai Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Sijia Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Jiajia Wang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Yanyan Li
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Xinyue Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Tingrui Song
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, 250117, Shandong, China.
| | - Shunmin He
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.
| |
|
29
|
Zhang G, Andersen EC. Interplay Between Polymorphic Short Tandem Repeats and Gene Expression Variation in Caenorhabditis elegans. Mol Biol Evol 2023; 40:msad067. [PMID: 36999565 PMCID: PMC10075192 DOI: 10.1093/molbev/msad067] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/26/2022] [Revised: 02/20/2023] [Accepted: 03/29/2023] [Indexed: 04/01/2023] Open
Abstract
Short tandem repeats (STRs) have orders of magnitude higher mutation rates than single nucleotide variants (SNVs) and have been proposed to accelerate evolution in many organisms. However, only a few studies have addressed the impact of STR variation on phenotypic variation at both the organismal and molecular levels. Potential driving forces underlying the high mutation rates of STRs also remain largely unknown. Here, we leverage the recently generated expression and STR variation data among wild Caenorhabditis elegans strains to conduct a genome-wide analysis of how STRs affect gene expression variation. We identify thousands of expression STRs (eSTRs) showing regulatory effects and demonstrate that they explain missing heritability beyond SNV-based expression quantitative trait loci. We illustrate specific regulatory mechanisms such as how eSTRs affect splicing sites and alternative splicing efficiency. We also show that differential expression of antioxidant genes and oxidative stresses might affect STR mutations systematically using both wild strains and mutation accumulation lines. Overall, we reveal the interplay between STRs and gene expression variation by providing novel insights into regulatory mechanisms of STRs and highlighting that oxidative stress could lead to higher STR mutation rates.
Affiliation(s)
- Gaotian Zhang
- Department of Molecular Biosciences, Northwestern University, Evanston, IL
| | - Erik C Andersen
- Department of Molecular Biosciences, Northwestern University, Evanston, IL
| |
|
30
|
Wright SE, Todd PK. Native functions of short tandem repeats. eLife 2023; 12:e84043. [PMID: 36940239 PMCID: PMC10027321 DOI: 10.7554/elife.84043] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/11/2022] [Accepted: 03/08/2023] [Indexed: 03/21/2023] Open
Abstract
Over a third of the human genome is comprised of repetitive sequences, including more than a million short tandem repeats (STRs). While studies of the pathologic consequences of repeat expansions that cause syndromic human diseases are extensive, the potential native functions of STRs are often ignored. Here, we summarize a growing body of research into the normal biological functions for repetitive elements across the genome, with a particular focus on the roles of STRs in regulating gene expression. We propose reconceptualizing the pathogenic consequences of repeat expansions as aberrancies in normal gene regulation. From this altered viewpoint, we predict that future work will reveal broader roles for STRs in neuronal function and as risk alleles for more common human neurological diseases.
Affiliation(s)
- Shannon E Wright
- Department of Neurology, University of Michigan–Ann Arbor, Ann Arbor, United States
- Neuroscience Graduate Program, University of Michigan–Ann Arbor, Ann Arbor, United States
- Department of Neuroscience, Picower Institute, Cambridge, United States
| | - Peter K Todd
- Department of Neurology, University of Michigan–Ann Arbor, Ann Arbor, United States
- VA Ann Arbor Healthcare System, Ann Arbor, United States
| |
|
31
|
Behboudi R, Nouri-Baygi M, Naghibzadeh M. RPTRF: A rapid perfect tandem repeat finder tool for DNA sequences. Biosystems 2023; 226:104869. [PMID: 36858110 DOI: 10.1016/j.biosystems.2023.104869] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/01/2022] [Revised: 01/23/2023] [Accepted: 02/23/2023] [Indexed: 03/02/2023]
Abstract
The sequencing of eukaryotic genomes has shown that tandem repeats are abundant in their sequences. In addition to affecting some cellular processes, tandem repeats in the genome may be associated with specific diseases and have been the key to resolving criminal cases. Any tool developed for detecting tandem repeats must be accurate, fast, and useable in thousands of laboratories worldwide, including those with not very advanced computing capabilities. The proposed method, the Rapid Perfect Tandem Repeat Finder (RPTRF), minimizes the need for excess character comparison processing by indexing the input file and significantly helps to accelerate and prepare the output without artifacts by using an interval tree in the filtering section. The experiments demonstrated that the RPTRF is very fast in discovering all perfect tandem repeats of all categories of any genomic sequences. Although the detection of imperfect TRs is not the focus of the RPTRF, comparisons show that it even outperforms some other tools (in five selected gold standards) designed explicitly for this purpose. The implemented tool and how to use it are available on GitHub.
Affiliation(s)
- Reza Behboudi
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
| | - Mostafa Nouri-Baygi
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
| | - Mahmoud Naghibzadeh
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
| |
|
32
|
Malekshoar M, Azimi SA, Kaki A, Mousazadeh L, Motaei J, Vatankhah M. CRISPR-Cas9 Targeted Enrichment and Next-Generation Sequencing for Mutation Detection. J Mol Diagn 2023; 25:249-262. [PMID: 36841425 DOI: 10.1016/j.jmoldx.2023.01.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Revised: 01/08/2023] [Accepted: 01/27/2023] [Indexed: 02/27/2023] Open
Abstract
Despite the rapid application of next-generation sequencing (NGS) technologies, target sequencing in regions of the genome is often required to diagnose many genetic diseases. Target enrichment can be an effective factor in reducing the cost and duration of sequencing. Recently, several clustered regularly interspaced short palindromic repeats (CRISPR)-based methods (amplification-free sequencing) have been developed for target enrichment in combination with one of the NGS platforms. CRISPR-based target enrichment strategies act as an auxiliary tool to improve NGS analytical performance, thereby indirectly facilitating nucleic acid detection. The direct DNA cleavage approach by CRISPR-Cas at genome-specific sites enhances the possibility of separating native large fragments from disease-related genomic regions. The CRISPR-Cas can isolate the target region without any amplification; subsequently, long-read sequencing technologies were also implemented. These methods, as promising tools, have the ability to assess genetic and epigenetic composition for clinical application and treatment responses in cancer precision medicine. By modifying CRISPR-based enrichment protocols, it was possible to identify different types of mutations, including structural variants, short tandem repeats, fusion genes, and mobile elements. The Cas9 can specifically eliminate wild-type sequences, and it also enables the enrichment and detection of small amounts of tumor DNA fragments among the highly heterogeneous fragments of wild-type DNA.
Affiliation(s)
- Mehrdad Malekshoar
- Anesthesiology, Critical Care and Pain Management Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
| | - Sajad Ataei Azimi
- Department of Hematology-Oncology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Arastoo Kaki
- Department of Medical Genetics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
| | - Leila Mousazadeh
- Department of Medical Biotechnology, School of Advanced Medical Sciences, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Jamshid Motaei
- Department of Medical Genetics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Vatankhah
- Anesthesiology, Critical Care and Pain Management Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
| |
|
33
|
Wang H, Wang LS, Schellenberg G, Lee WP. The role of structural variations in Alzheimer's disease and other neurodegenerative diseases. Front Aging Neurosci 2023; 14:1073905. [PMID: 36846102 PMCID: PMC9944073 DOI: 10.3389/fnagi.2022.1073905] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/19/2022] [Accepted: 12/31/2022] [Indexed: 02/10/2023] Open
Abstract
Dozens of single nucleotide polymorphisms (SNPs) related to Alzheimer's disease (AD) have been discovered by large-scale genome-wide association studies (GWASs). However, only a small portion of the genetic component of AD can be explained by SNPs observed from GWAS. Structural variation (SV) can be a major contributor to the missing heritability of AD; however, SV in AD remains largely unexplored because the accurate detection of SVs from the widely used array-based and short-read technologies is still far from perfect. Here, we briefly summarized the strengths and weaknesses of available SV detection methods. We reviewed the current landscape of SV analysis in AD and SVs that have been found associated with AD. Particularly, the importance of currently less explored SVs, including insertions, inversions, short tandem repeats, and transposable elements, in neurodegenerative diseases was highlighted.
Affiliation(s)
- Hui Wang
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Li-San Wang
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Gerard Schellenberg
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Wan-Ping Lee
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| |
|
34
|
Rare tandem repeat expansions associate with genes involved in synaptic and neuronal signaling functions in schizophrenia. Mol Psychiatry 2023; 28:475-482. [PMID: 36380236 PMCID: PMC9812781 DOI: 10.1038/s41380-022-01857-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 02/22/2022] [Revised: 10/14/2022] [Accepted: 10/24/2022] [Indexed: 11/17/2022]
Abstract
Tandem repeat expansions (TREs) are associated with over 60 monogenic disorders and have recently been implicated in complex disorders such as cancer and autism spectrum disorder. The role of TREs in schizophrenia is now emerging. In this study, we have performed a genome-wide investigation of TREs in schizophrenia. Using genome sequence data from 1154 Swedish schizophrenia cases and 934 ancestry-matched population controls, we have detected genome-wide rare (<0.1% population frequency) TREs that have motifs with a length of 2-20 base pairs. We find that the proportion of individuals carrying rare TREs is significantly higher in the schizophrenia group. There is a significantly higher burden of rare TREs in schizophrenia cases than in controls in genic regions, particularly in postsynaptic genes, in genes overlapping brain expression quantitative trait loci, and in brain-expressed genes that are differentially expressed between schizophrenia cases and controls. We demonstrate that TRE-associated genes are more constrained and primarily impact synaptic and neuronal signaling functions. These results have been replicated in an independent Canadian sample that consisted of 252 schizophrenia cases of European ancestry and 222 ancestry-matched controls. Our results support the involvement of rare TREs in schizophrenia etiology.
|
35
|
Wang X, Budowle B, Ge J. USAT: a bioinformatic toolkit to facilitate interpretation and comparative visualization of tandem repeat sequences. BMC Bioinformatics 2022; 23:497. [PMID: 36402991 PMCID: PMC9675219 DOI: 10.1186/s12859-022-05021-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/10/2022] [Accepted: 10/29/2022] [Indexed: 11/21/2022] Open
Abstract
Background: Tandem repeats (TRs), highly variable genomic variants, are widely used in individual identification, disease diagnostics, and evolutionary studies. Recent advances in sequencing technologies and bioinformatic tools facilitate genome-wide calling of TR haplotypes. Both length-based and sequence-based TR alleles are used in different applications. However, sequence-based TR alleles could provide the highest precision in characterizing TR haplotypes. An easy-to-use bioinformatic tool is needed to identify differences at the single-nucleotide level between or among TR haplotypes. Results: In this study, we developed a Universal STR Allele Toolkit (USAT) for TR haplotype analysis, which takes TR haplotype output from existing tools to perform allele size conversion, sequence comparison of haplotypes, figure plotting, comparison for allele distribution, and interactive visualization. An exemplary application of USAT for analysis of the CODIS core STR loci for DNA forensics with benchmarking human individuals demonstrated the capabilities of USAT. USAT has user-friendly graphic interfaces and runs fast in major computing operating systems with parallel computing enabled. Conclusion: USAT is a user-friendly bioinformatics software for interpretation, visualization, and comparisons of TRs.
Affiliation(s)
- Xuewen Wang
- Center for Human Identification, Health Science Center, University of North Texas, Fort Worth, TX, USA
| | - Bruce Budowle
- Center for Human Identification, Health Science Center, University of North Texas, Fort Worth, TX, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, USA
| | - Jianye Ge
- Center for Human Identification, Health Science Center, University of North Texas, Fort Worth, TX, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, USA
| |
|
36
|
Masnovo C, Lobo AF, Mirkin SM. Replication dependent and independent mechanisms of GAA repeat instability. DNA Repair (Amst) 2022; 118:103385. [PMID: 35952488 PMCID: PMC9675320 DOI: 10.1016/j.dnarep.2022.103385] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/02/2022] [Revised: 07/28/2022] [Accepted: 07/30/2022] [Indexed: 11/20/2022]
Abstract
Trinucleotide repeat instability is a driver of human disease. Large expansions of (GAA)n repeats in the first intron of the FXN gene are the cause of Friedreich's ataxia (FRDA), a progressive degenerative disorder which cannot yet be prevented or treated. (GAA)n repeat instability arises both during replication-dependent processes, such as cell division and intergenerational transmission, and in terminally differentiated somatic tissues. Here, we provide a brief historical overview of the discovery of (GAA)n repeat expansions and their association with FRDA, followed by recent advances in the identification of triplex H-DNA formation and replication fork stalling. The main body of this review focuses on the last decade of progress in understanding the mechanism of (GAA)n repeat instability during DNA replication and/or DNA repair. We propose that the discovery of additional mechanisms of (GAA)n repeat instability can be achieved via both comparative approaches to other repeat expansion diseases and genome-wide association studies. Finally, we discuss the advances towards FRDA prevention or amelioration that specifically target (GAA)n repeat expansions.
Affiliation(s)
- Chiara Masnovo
- Department of Biology, Tufts University, Medford, MA 02155, USA
| | - Ayesha F Lobo
- Department of Biology, Tufts University, Medford, MA 02155, USA
| | - Sergei M Mirkin
- Department of Biology, Tufts University, Medford, MA 02155, USA
| |
|
37
|
Establishment of a co-analysis system for personal identification and body fluid identification: a preliminary report. Int J Legal Med 2022; 136:1565-1575. [PMID: 36076078 DOI: 10.1007/s00414-022-02886-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2021] [Accepted: 08/24/2022] [Indexed: 10/14/2022]
Abstract
Analysis of genetic markers can provide clues for case investigation. Short tandem repeat (STR) detection and analysis are widely used for both personal identification and parentage testing. However, DNA analysis currently cannot provide sufficient information for body fluid identification. Tissue or cell sources of samples can be identified by detecting body fluid-specific mRNA markers, which have been studied thoroughly. Integrating STR profiling and mRNA expression patterns can provide more information than conventional methods for investigations and the reconstruction of crime scenes; this can be achieved by DNA/RNA co-extraction technology, which is economical, efficient, and suitable for low-template samples. Here, we propose a co-analysis system based on the PowerPlex 16 (PP16) kit. This system can simultaneously amplify 25 markers, including 15 STRs, one non-STR amelogenin, and nine mRNA markers (three blood-specific, two saliva-specific, two semen-specific, and two housekeeping gene markers). The specificity and sensitivity of the co-analysis system were determined, and aged and degraded samples were used to validate the stability of the co-analysis system. Finally, different DNA/RNA ratios and various carriers were evaluated. The results showed that the DNA/RNA co-analysis system correctly identified different types of body fluid stains. The STR profiles obtained using the co-analysis system were identical to those obtained using the PP16 kit, which demonstrates that the mRNA primers used did not affect STR profiling. Complete STR and mRNA profiles could be obtained from 1/8 portions of buccal swabs, 1/16 portions of swabs of blood and semen samples, 0.1 cm² of blood samples, 0.25 cm² of semen samples, and 1.0 cm² of saliva samples. Additionally, our findings indicate that complete STR and mRNA profiles can be obtained with this system from blood and semen samples when the DNA/RNA ratio is 1:1/32. This study suggests that the co-analysis system could be used for simultaneous personal identification and body fluid identification.
|
38
|
Kang YH, Hyun JE, Hwang CY. The number of mitochondrial DNA mutations as a genetic feature for hair cycle arrest (alopecia X) in Pomeranian dogs. Vet Dermatol 2022; 33:545-552. [PMID: 36000586 DOI: 10.1111/vde.13114] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/26/2022] [Revised: 04/24/2022] [Accepted: 05/04/2022] [Indexed: 11/30/2022]
Abstract
BACKGROUND: Hair cycle arrest (HCA) is a noninflammatory alopecic disease affecting various dog breeds, particularly Pomeranian dogs. This disease is probably a hereditary disorder considering the strong breed predisposition. Despite efforts to identify the pathogenesis of this disease, an underlying specific cause is unknown. OBJECTIVE: To identify candidate gene mutations for HCA in Pomeranian dogs. ANIMALS: Four Pomeranian dogs diagnosed with HCA and four unaffected Pomeranian dogs. MATERIALS AND METHODS: Whole blood was used for DNA extraction. Whole-genome sequencing (WGS) was performed, and variants were analysed using the Genome Analysis Toolkit (GATK) and SnpEff. All reads were aligned to the reference genome, Dog10K_Boxer_Tasha. Sanger sequencing was performed to define the complex mutations. RESULTS: A total of 113 variants of mitochondrial DNA were found to be effective gene mutations in the eight dogs. The affected dogs showed significantly increased effective mutations (average 57 variants) compared with unaffected dogs (average eight variants; p < 0.05). There was no significant difference in the number of chromosomal DNA mutations between the two groups. CONCLUSION AND CLINICAL IMPORTANCE: We suggest that an increased number of mitochondrial gene mutations is a feature of HCA in Pomeranian dogs.
Affiliation(s)
- Yeong-Hun Kang
- Laboratory of Veterinary Dermatology and the Research Institute for Veterinary Science, College of Veterinary Medicine, Seoul National University, Seoul, South Korea
| | - Jae-Eun Hyun
- Institute of Animal Medicine, College of Veterinary Medicine, Gyeongsang National University, Jinju, South Korea
| | - Cheol-Yong Hwang
- Laboratory of Veterinary Dermatology and the Research Institute for Veterinary Science, College of Veterinary Medicine, Seoul National University, Seoul, South Korea
| |
|
39
|
Halman A, Dolzhenko E, Oshlack A. STRipy: A graphical application for enhanced genotyping of pathogenic short tandem repeats in sequencing data. Hum Mutat 2022; 43:859-868. [PMID: 35395114 PMCID: PMC9541159 DOI: 10.1002/humu.24382] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 08/24/2021] [Revised: 12/01/2021] [Accepted: 04/06/2022] [Indexed: 11/22/2022]
Abstract
Expansions of short tandem repeats (STRs) have been implicated as the causal variant in over 50 diseases known to date. There are several tools which can genotype STRs from high-throughput sequencing (HTS) data. However, running these tools out of the box only allows around half of the known disease-causing loci to be genotyped. Furthermore, the allele lengths estimated at these loci are often underestimated, with maximum lengths limited to either the read or fragment length, which is less than the pathogenic cutoff for some diseases. Although analysis tools can be customized to genotype extra loci, this requires proficiency in bioinformatics to set up, limiting their widespread usage by other researchers and clinicians. To address these issues, we have developed new software called STRipy, which is able to target all known disease-causing STRs from HTS data. We created an intuitive graphical interface for STRipy and significantly simplified the detection of STR expansions. Moreover, we genotyped all disease loci for over two and a half thousand samples to provide population-wide distributions to assist with interpretation of results. We believe the simplicity and breadth of STRipy will increase the genotyping of STRs in sequencing data, resulting in further diagnoses of rare STR diseases.
Affiliation(s)
- Andreas Halman
- Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, Victoria, Australia
- Murdoch Children's Research Institute, Royal Children's Hospital, Parkville, Victoria, Australia
- Florey Department of Neuroscience and Mental Health, The University of Melbourne, Parkville, Victoria, Australia
- School of Natural Sciences and Health, Tallinn University, Tallinn, Estonia
| | | | - Alicia Oshlack
- Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, Victoria, Australia
- School of BioSciences, University of Melbourne, Parkville, Victoria, Australia
| |
|
40
|
Wen D, Shi J, Liu Y, He W, Qu W, Wang C, Xing H, Cao Y, Li J, Zha L. DNA methylation analysis for smoking status prediction in the Chinese population based on the methylation-sensitive single-nucleotide primer extension method. Forensic Sci Int 2022; 339:111412. [DOI: 10.1016/j.forsciint.2022.111412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Revised: 07/25/2022] [Accepted: 07/27/2022] [Indexed: 11/04/2022]
|
41
|
Liu Y, Cui W, Jin X, Wang K, Mei S, Zheng X, Zhu B. Forensic Efficiency Estimation of a Homemade Six-Color Fluorescence Multiplex Panel and In-Depth Anatomy of the Population Genetic Architecture in Two Tibetan Groups. Front Genet 2022; 13:880346. [PMID: 35692824 PMCID: PMC9184685 DOI: 10.3389/fgene.2022.880346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Accepted: 04/06/2022] [Indexed: 11/28/2022] Open
Abstract
The genetic information of the Chinese Tibetan group has been a long-standing research hotspot among population geneticists and archaeologists. Herein, 309 unrelated individuals from two Tibetan groups living in Qinghai Province, China (CTQ), and Tibet Autonomous Region, China (CTT), were successfully genotyped using a new homemade six-color fluorescence multiplex panel, which contained 59 autosomal deletion/insertion polymorphisms (au-DIPs), two mini short tandem repeats (miniSTRs), two Y-chromosomal DIPs, and one Amelogenin. The cumulative probability of matching and combined power of exclusion values for this new panel in CTQ and CTT groups were 1.9253E-27 and 0.99999729, as well as 1.5061E-26 and 0.99999895, respectively. Subsequently, comprehensive population genetic analyses of Tibetan groups and reference populations were carried out based on the 59 au-DIPs. The multitudinous statistical analysis results supported that Tibetan groups have close genetic affinities with East Asian populations. These findings showed that this homemade system would be a powerful tool for forensic individual identification and paternity testing in Chinese Tibetan groups and give us an important insight for further perfecting the genetic landscape of Tibetan groups.
Affiliation(s)
- Yanfang Liu
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China; Laboratory of Fundamental Nursing Research, School of Nursing, Guangdong Medical University, Dongguan, China
| | - Wei Cui
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
| | - Xiaoye Jin
- Department of Forensic Medicine, Guizhou Medical University, Guiyang, China
| | - Kang Wang
- Ningbo Health Gene Technologies Co., Ltd., Ningbo, China
| | - Shuyan Mei
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
| | - Xingkai Zheng
- Ningbo Health Gene Technologies Co., Ltd., Ningbo, China
| | - Bofeng Zhu
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China; Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, Xi'an, China; Microbiome Medicine Center, Department of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
| |
|
42
|
Chiu R, Rajan-Babu IS, Birol I, Friedman JM. Linked-read sequencing for detecting short tandem repeat expansions. Sci Rep 2022; 12:9352. [PMID: 35672336 PMCID: PMC9174224 DOI: 10.1038/s41598-022-13024-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/09/2021] [Accepted: 05/19/2022] [Indexed: 11/09/2022] Open
Abstract
Detection of short tandem repeat (STR) expansions with standard short-read sequencing is challenging due to the difficulty in mapping multicopy repeat sequences. In this study, we explored how the long-range sequence information of barcode linked-read sequencing (BLRS) can be leveraged to improve repeat-read detection. We also devised a novel algorithm using BLRS barcodes for distance estimation and evaluated its application for STR genotyping. Both approaches were designed for genotyping large expansions (> 1 kb) that cannot be sized accurately by existing methods. Using simulated and experimental data of genomes with STR expansions from multiple BLRS platforms, we validated the utility of barcode and phasing information in attaining better STR genotypes compared to standard short-read sequencing. Although the coverage bias of extremely GC-rich STRs is an important limitation of BLRS, BLRS is an effective strategy for genotyping many other STR loci.
Affiliation(s)
- Readman Chiu
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 4S6, Canada
| | - Indhu-Shree Rajan-Babu
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V5Z 4H4, Canada; Department of Medical and Molecular Genetics, King's College London, Strand, London, WC2R 2LS, UK
| | - Inanc Birol
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 4S6, Canada; Department of Medical Genetics, University of British Columbia, Vancouver, BC, V5Z 4H4, Canada
| | - Jan M Friedman
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V5Z 4H4, Canada; BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada
| |
|
43
|
Sharpe JL, Harper NS, West RJH. Identification and Monitoring of Nucleotide Repeat Expansions Using Southern Blotting in Drosophila Models of C9orf72 Motor Neuron Disease and Frontotemporal Dementia. Bio Protoc 2022; 12:e4424. [PMID: 35813024 PMCID: PMC9183971 DOI: 10.21769/bioprotoc.4424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Revised: 03/28/2022] [Accepted: 03/29/2022] [Indexed: 12/29/2022] Open
Abstract
Repeat expansion diseases, including fragile X syndrome, Huntington's disease, and C9orf72-related motor neuron disease and frontotemporal dementia, are a group of disorders associated with polymorphic expansions of tandem repeat nucleotide sequences. These expansions are highly repetitive and often hundreds to thousands of repeats in length, making accurate identification and determination of repeat length via PCR or sequencing challenging. Here we describe a protocol for monitoring repeat length in Drosophila models carrying 1,000 repeat C9orf72-related dipeptide repeat transgenes using Southern blotting. This protocol has been used regularly to check the length of these lines for over 100 generations with robust and repeatable results and can be implemented for monitoring any repeat expansion in Drosophila.
Affiliation(s)
- Joanne L. Sharpe
- Division of Neuroscience and Experimental Psychology, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom; Sheffield Institute for Translational Neuroscience, The University of Sheffield, Sheffield, United Kingdom; Neuroscience Institute, The University of Sheffield, Sheffield, United Kingdom
| | - Nikki S. Harper
- Division of Neuroscience and Experimental Psychology, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
| | - Ryan J. H. West
- Sheffield Institute for Translational Neuroscience, The University of Sheffield, Sheffield, United Kingdom; Neuroscience Institute, The University of Sheffield, Sheffield, United Kingdom
| |
|
44
|
Liu Z, Zhao G, Xiao Y, Zeng S, Yuan Y, Zhou X, Fang Z, He R, Li B, Zhao Y, Pan H, Wang Y, Yu G, Peng IF, Wang D, Meng Q, Xu Q, Sun Q, Yan X, Shen L, Jiang H, Xia K, Wang J, Guo J, Liang F, Li J, Tang B. Profiling the Genome-Wide Landscape of Short Tandem Repeats by Long-Read Sequencing. Front Genet 2022; 13:810595. [PMID: 35601492 PMCID: PMC9117641 DOI: 10.3389/fgene.2022.810595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Accepted: 03/30/2022] [Indexed: 11/23/2022] Open
Abstract
Background: Short tandem repeats (STRs) are highly variable elements that play a pivotal role in multiple genetic diseases and the regulation of gene expression. Long-read sequencing (LRS) offers a potential solution to genome-wide STR analysis. However, characterizing STRs in human genomes using LRS on a large population scale has not been reported. Methods: We conducted a large LRS-based STR analysis in 193 unrelated samples of the Chinese population and performed genome-wide profiling of STR variation in the human genome. The repeat dynamic index (RDI) was introduced to evaluate the variability of STRs. We sourced expression data from the Genotype-Tissue Expression (GTEx) project to explore the tissue specificity of genes related to highly variable STRs. Enrichment analyses were also conducted to identify potential functional roles of the highly variable STRs. Results: This study reports a large-scale analysis of human STR variation by LRS and offers a reference STR database based on the LRS dataset. We found that the disease-associated STRs (dSTRs) and STRs associated with the expression of nearby genes (eSTRs) were highly variable in the general population. Moreover, tissue-specific expression analysis showed that genes related to those highly variable STRs presented the highest expression level in brain tissues, and pathway enrichment analysis found that those STRs are involved in synaptic function-related pathways. Conclusion: Our study profiled the genome-wide landscape of STRs using LRS and highlighted the highly variable STRs in the human genome, which provides a valuable resource for studying the role of STRs in human disease and complex traits.
Affiliation(s)
- Zhenhua Liu
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Guihu Zhao
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | | | - Sheng Zeng
- Department of Geriatrics, The Second Xiangya Hospital, Central South University, Changsha, China
| | - Yanchun Yuan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Xun Zhou
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Zhenghuan Fang
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Runcheng He
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Bin Li
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Yuwen Zhao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Hongxu Pan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Yige Wang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | | | | | | | - Qingtuan Meng
- Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital of University of South China, Hengyang, China
| | - Qian Xu
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Qiying Sun
- Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
| | - Xinxiang Yan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Lu Shen
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Hong Jiang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
| | - Kun Xia
- Centre for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
| | - Junling Wang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Jifeng Guo
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
| | - Fan Liang
- GrandOmics Biosciences, Beijing, China
- *Correspondence: Beisha Tang; Jinchen Li; Fan Liang
| | - Jinchen Li
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, China
- Centre for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- *Correspondence: Beisha Tang; Jinchen Li; Fan Liang
| | - Beisha Tang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital of University of South China, Hengyang, China
- Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, China
- *Correspondence: Beisha Tang; Jinchen Li; Fan Liang
| |
|
45
|
Wu Z, Gong H, Zhou Z, Jiang T, Lin Z, Li J, Xiao S, Yang B, Huang L. Mapping short tandem repeats for liver gene expression traits helps prioritize potential causal variants for complex traits in pigs. J Anim Sci Biotechnol 2022; 13:8. [PMID: 35034641 PMCID: PMC8762894 DOI: 10.1186/s40104-021-00658-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/15/2021] [Accepted: 11/25/2021] [Indexed: 12/31/2022]
Abstract
Background: Short tandem repeats (STRs) were recently found to have significant impacts on gene expression and diseases in humans, but their roles in gene expression and complex traits in pigs remain unexplored. This study investigates the effects of STRs on gene expression in liver tissues based on the whole-genome sequences and RNA-Seq data of a discovery cohort of 260 F6 individuals and a validation population of 296 F7 individuals from a heterogeneous population generated from crosses among eight pig breeds. Results: We identified 5203 and 5868 significant expression STRs (eSTRs, FDR < 1%) in the F6 and F7 populations, respectively, most of which could be reciprocally validated (π1 = 0.92). The eSTRs explained 27.5% of the cis-heritability of gene expression traits on average. We further identified 235 and 298 fine-mapped STRs through the Bayesian fine-mapping approach in the F6 and F7 pigs, respectively, which were significantly enriched in intron, ATAC peak, compartment A and H3K4me3 regions. We identified 20 fine-mapped STRs located in 100 kb windows upstream and downstream of published complex trait-associated SNPs, which colocalized with epigenetic markers such as H3K27ac and ATAC peaks. These included eSTRs of the CLPB, PGLS, PSMD6 and DHDH genes, which are linked with genome-wide association study (GWAS) SNPs for blood-related traits, leg conformation, growth-related traits, and meat quality traits, respectively. Conclusions: This study provides insights into the effects of STRs on gene expression traits. The identified eSTRs are valuable resources for prioritizing causal STRs for complex traits in pigs.
Affiliation(s)
- Zhongzi Wu
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Huanfa Gong
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Zhimin Zhou
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Tao Jiang
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Ziqi Lin
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Jing Li
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Shijun Xiao
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Bin Yang
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| | - Lusheng Huang
- State Key Laboratory for Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang, China
| |
|
46
|
Han J, Munro JE, Kocoski A, Barry AE, Bahlo M. Population-level genome-wide STR discovery and validation for population structure and genetic diversity assessment of Plasmodium species. PLoS Genet 2022; 18:e1009604. [PMID: 35007277 PMCID: PMC8782505 DOI: 10.1371/journal.pgen.1009604] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/20/2021] [Revised: 01/21/2022] [Accepted: 12/14/2021] [Indexed: 11/18/2022]
Abstract
Short tandem repeats (STRs) are highly informative genetic markers that have been used extensively in population genetics analysis. They are an important source of genetic diversity and can also have functional impact. Despite the availability of bioinformatic methods that permit large-scale genome-wide genotyping of STRs from whole genome sequencing data, they have not previously been applied to sequencing data from large collections of malaria parasite field samples. Here, we have genotyped STRs using HipSTR in published whole-genome sequence data from more than 3,000 Plasmodium falciparum and 174 Plasmodium vivax samples collected across the globe. High levels of noise and variability in the resultant callset necessitated the development of a novel method for quality control of STR genotype calls. A set of high-quality STR loci (6,768 from P. falciparum and 3,496 from P. vivax) was used to study Plasmodium genetic diversity, population structures and genomic signatures of selection, and these were compared to genome-wide single nucleotide polymorphism (SNP) genotyping data. In addition, the genome-wide information about genetic variation and other characteristics of STRs in P. falciparum and P. vivax has been made available in an interactive web-based R Shiny application, PlasmoSTR (https://github.com/bahlolab/PlasmoSTR).
Affiliation(s)
- Jiru Han
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Australia
| | - Jacob E. Munro
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Australia
| | - Anthony Kocoski
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Mathematics and Statistics, The University of Melbourne, Melbourne, Australia
| | - Alyssa E. Barry
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Australia
- Disease Elimination Program, Burnet Institute, Melbourne, Australia
- IMPACT Institute for Innovation in Mental and Physical Health and Clinical Translation, Deakin University, Geelong, Australia
| | - Melanie Bahlo
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Australia
| |
|
47
|
Xiao X, Zhang CY, Zhang Z, Hu Z, Li M, Li T. Revisiting tandem repeats in psychiatric disorders from perspectives of genetics, physiology, and brain evolution. Mol Psychiatry 2022; 27:466-475. [PMID: 34650204 DOI: 10.1038/s41380-021-01329-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/22/2021] [Revised: 09/16/2021] [Accepted: 09/28/2021] [Indexed: 01/28/2023]
Abstract
Genome-wide association studies (GWASs) have revealed substantial genetic components comprised of single nucleotide polymorphisms (SNPs) in the heritable risk of psychiatric disorders. However, genetic risk factors not covered by GWAS also play pivotal roles in these illnesses. Tandem repeats, which are likely functional but frequently overlooked by GWAS, may account for an important proportion of the "missing heritability" of psychiatric disorders. Despite difficulties in characterizing and quantifying tandem repeats in the genome, studies have been carried out in an attempt to describe the impact of tandem repeats on gene regulation and human phenotypes. In this review, we have introduced recent research progress regarding the genomic distribution and regulatory mechanisms of tandem repeats. We have also summarized the current knowledge of the genetic architecture and biological underpinnings of psychiatric disorders brought by studies of tandem repeats. These findings suggest that tandem repeats, in candidate psychiatric risk genes or in different levels of linkage disequilibrium (LD) with psychiatric GWAS SNPs and haplotypes, may modulate biological phenotypes related to psychiatric disorders (e.g., cognitive function and brain physiology) through regulating alternative splicing, promoter activity, enhancer activity and so on. In addition, many tandem repeats undergo tight natural selection in the human lineage, and likely exert crucial roles in human brain evolution. Taken together, the putative roles of tandem repeats in the pathogenesis of psychiatric disorders are strongly implicated, and using examples from the previous literature, we wish to call for further attention to tandem repeats in the post-GWAS era of psychiatric disorders.
Affiliation(s)
- Xiao Xiao
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Chu-Yi Zhang
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Zhuohua Zhang
- Institute of Molecular Precision Medicine and Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China; Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Zhonghua Hu
- Institute of Molecular Precision Medicine and Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China; Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China; Department of Critical Care Medicine, Xiangya Hospital, Central South University, Changsha, Hunan, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan, China; Hunan Key Laboratory of Animal Models for Human Diseases, School of Life Sciences, Central South University, Changsha, Hunan, China; Eye Center of Xiangya Hospital and Hunan Key Laboratory of Ophthalmology, Central South University, Changsha, Hunan, China; National Clinical Research Center on Mental Disorders, Changsha, Hunan, China
| | - Ming Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
| | - Tao Li
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China; Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Guangzhou, China
| |
|
48
|
Affiliation(s)
- Christel Depienne
- Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| |
|
49
|
Schröder C, Horsthemke B, Depienne C. GC-rich repeat expansions: associated disorders and mechanisms. MED GENET-BERLIN 2021; 33:325-335. [PMID: 38835438 PMCID: PMC11006399 DOI: 10.1515/medgen-2021-2099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/29/2021] [Accepted: 11/12/2021] [Indexed: 06/06/2024]
Abstract
Noncoding repeat expansions are a well-known cause of genetic disorders mainly affecting the central nervous system. Missed by most standard technologies used in routine diagnosis, pathogenic noncoding repeat expansions have to be searched for using specific techniques such as repeat-primed PCR or specific bioinformatics tools applied to genome data, such as ExpansionHunter. In this review, we focus on GC-rich repeat expansions, which represent at least one third of all noncoding repeat expansions described so far. GC-rich expansions are mainly located in regulatory regions (promoter, 5' untranslated region, first intron) of genes and can lead to either a toxic gain-of-function mediated by RNA toxicity and/or repeat-associated non-AUG (RAN) translation, or a loss-of-function of the associated gene, depending on their size and their methylation status. We herein review the clinical and molecular characteristics of disorders associated with these difficult-to-detect expansions.
Affiliation(s)
- Christopher Schröder
- Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| | - Bernhard Horsthemke
- Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| | - Christel Depienne
- Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
| |
|
50
|
Malik I, Kelley CP, Wang ET, Todd PK. Molecular mechanisms underlying nucleotide repeat expansion disorders. Nat Rev Mol Cell Biol 2021; 22:589-607. [PMID: 34140671 PMCID: PMC9612635 DOI: 10.1038/s41580-021-00382-6] [Citation(s) in RCA: 172] [Impact Index Per Article: 43.0] [Accepted: 04/30/2021] [Indexed: 02/05/2023]
Abstract
The human genome contains over one million short tandem repeats. Expansion of a subset of these repeat tracts underlies over fifty human disorders, including common genetic causes of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (C9orf72), polyglutamine-associated ataxias and Huntington disease, myotonic dystrophy, and intellectual disability disorders such as Fragile X syndrome. In this Review, we discuss the four major mechanisms by which expansion of short tandem repeats causes disease: loss of function through transcription repression, RNA-mediated gain of function through gelation and sequestration of RNA-binding proteins, gain of function of canonically translated repeat-harbouring proteins, and repeat-associated non-AUG translation of toxic repeat peptides. Somatic repeat instability amplifies these mechanisms and influences both disease age of onset and tissue specificity of pathogenic features. We focus on the crosstalk between these disease mechanisms, and argue that they often synergize to drive pathogenesis. We also discuss the emerging native functions of repeat elements and how their dynamics might contribute to disease at a larger scale than currently appreciated. Lastly, we propose that lynchpins tying these disease mechanisms and native functions together offer promising therapeutic targets with potential shared applications across this class of human disorders.
Affiliation(s)
- Indranil Malik
- Department of Neurology, University of Michigan, Ann Arbor, MI, USA
| | - Chase P Kelley
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics, Genetics Institute, University of Florida, Gainesville, FL, USA
- Genetics and Genomics Graduate Program, University of Florida, Gainesville, FL, USA
| | - Eric T Wang
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics, Genetics Institute, University of Florida, Gainesville, FL, USA
| | - Peter K Todd
- Department of Neurology, University of Michigan, Ann Arbor, MI, USA
- VA Ann Arbor Healthcare System, Ann Arbor, MI, USA
| |
|