1
|
Hou Q, Ji W, An K, Tan Y, Liu P, Su J. Genomic microsatellite characterization and development of polymorphic microsatellites in Eospalax baileyi. Sci Rep 2025; 15:524. [PMID: 39747356 PMCID: PMC11696105 DOI: 10.1038/s41598-024-84631-6] [Received: 07/07/2024] [Accepted: 12/25/2024] [Indexed: 01/04/2025] Open
Abstract
Microsatellite markers are cost-effective, rapid, and efficient, and they offer great advantages in large-sample kinship analysis and population structure studies. However, microsatellite loci are seriously underdeveloped in non-model organisms. The plateau zokor (Eospalax baileyi) is a key subterranean species of the Tibetan Plateau whose effective management has long been challenging. In this study, we analyzed the distribution characteristics and functions of microsatellites in the plateau zokor genome, as well as their polymorphic sites. Mononucleotide and dinucleotide repeats were the most abundant types in the genome. The number and abundance of microsatellites were highest in the intergenic regions and lowest in the coding regions. The coding sequences containing microsatellites were annotated to 52 major functional genes and assigned 19,358 Gene Ontology entries. In the Kyoto Encyclopedia of Genes and Genomes analysis, signal transduction was the most enriched pathway. Thirteen pairs of polymorphic loci were successfully amplified, with the number of alleles ranging from 3 to 8, observed heterozygosity ranging from 0.059 to 0.810, and expected heterozygosity ranging from 0.469 to 0.854. These microsatellite markers provide a cornerstone for studies on the identification of parentage and population genetics of plateau zokors.
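The abstract above reports observed and expected heterozygosity per locus. As a rough illustration of how these two statistics are commonly computed, here is a minimal Python sketch. It assumes diploid genotypes and uses the uncorrected formula 1 - sum(p_i^2) for expected heterozygosity; the function name and the genotype data are hypothetical, and this is not necessarily the software or estimator the authors used.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals with two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n
    # Allele frequencies pooled over both allele copies of every individual.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    # Expected heterozygosity under Hardy-Weinberg proportions: 1 - sum(p_i^2).
    # No small-sample correction is applied in this sketch.
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return h_obs, h_exp


# Hypothetical genotypes (allele sizes in bp), for illustration only.
example = [(150, 154), (150, 150), (154, 158), (150, 154)]
h_obs, h_exp = heterozygosity(example)
print(h_obs, h_exp)
```

A locus where observed heterozygosity falls well below the expected value may point to null alleles or inbreeding, which is one reason both values are usually reported together.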
Affiliation(s)
- Qiqi Hou
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
| | - Weihong Ji
- Faculty of Science, University of Auckland, Auckland, New Zealand
| | - Kang An
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
| | - Yuchen Tan
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
| | - Penghui Liu
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
| | - Junhu Su
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China.
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China.
- Gansu Qilianshan Grassland Ecosystem Observation and Research Station, Wuwei, 733200, China.
| |
|
2
|
Reinar WB, Krabberød AK, Lalun VO, Butenko MA, Jakobsen KS. Short tandem repeats delineate gene bodies across eukaryotes. Nat Commun 2024; 15:10902. [PMID: 39738068 PMCID: PMC11686069 DOI: 10.1038/s41467-024-55276-w] [Received: 04/26/2024] [Accepted: 12/05/2024] [Indexed: 01/01/2025] Open
Abstract
Short tandem repeats (STRs) have emerged as important and hypermutable sites where genetic variation correlates with gene expression in plant and animal systems. Recently, it has been shown that a broad range of transcription factors (TFs) are affected by STRs near or in the DNA target binding site. Despite this, the distribution of STR motif repetitiveness in eukaryote genomes is still largely unknown. Here, we identify monomer and dimer STR motif repetitiveness in 5.1 billion 10-bp windows upstream of translation starts and downstream of translation stops in 25 million genes spanning 1270 species across the eukaryotic Tree of Life. We report that all surveyed genomes have gene-proximal shifts in motif repetitiveness. Within genomes, variation in gene-proximal repetitiveness landscapes correlated with the function of genes; genes with housekeeping functions were depleted in upstream and downstream repetitiveness. Furthermore, the repetitiveness landscapes correlated with TF binding sites, indicating that gene function has evolved in conjunction with cis-regulatory STRs and TFs that recognize repetitive sites. These results suggest that the hypermutability inherent to STRs is canalized along the genome sequence and contributes to regulatory and eco-evolutionary dynamics in all eukaryotes.
Affiliation(s)
- William B Reinar
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway.
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, Oslo, Norway.
| | - Anders K Krabberød
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, Oslo, Norway
| | - Vilde O Lalun
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, Oslo, Norway
| | - Melinka A Butenko
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, Oslo, Norway
| | - Kjetill S Jakobsen
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway.
| |
|
3
|
Qi X, Ullah A, Yu W, Jin X, Liu H. Estimating the Genetic Risk of First-Degree Relatives for Chronic Diseases Using the Short Tandem Repeat Score as Model of Polygenic Inheritance. Biochem Genet 2024:10.1007/s10528-024-11003-0. [PMID: 39733222 DOI: 10.1007/s10528-024-11003-0] [Received: 03/08/2024] [Accepted: 12/10/2024] [Indexed: 12/30/2024]
Abstract
This study aims to establish a genetic risk assessment model based on a score of short tandem repeats (STRs) of polygenic inheritance. A total of 396 children and their biological parents were recruited for STR genotyping. The numbers of tandem repeats of the two alleles at one STR locus were assumed to be a quantitative genetic strength for disease incidence. The sums across 19 STR loci were considered a quantitative genetic strength per individual. Various thresholds of the STRs between paternal, maternal, and childhood data were recorded. For example, at a threshold of 25%, samples in the first quarter were coded as 1 and all other samples as 0. The consistency rate for heredity (CH) was calculated from the difference in the morbidity of children between parents with and without disease groups. The ratio of observed CH to expected CH was defined as the heredity index (HI). Actual pedigree data (finger-crossing test) confirmed the accuracy of the STR score. The genetic risk of first-degree relatives could be estimated using easily acquired data (incidence in an unrelated population). Our findings can provide a polygenic genetic model for estimating the incidence and genetic risk of chronic disease in first-degree relatives.
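The scoring scheme described in this abstract (summing repeat numbers over loci, then coding samples by a threshold) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say whether the "first quarter" means the highest or lowest scores, so the sketch assumes the highest; the function names, locus names, and genotypes are all hypothetical.

```python
def str_score(genotype):
    """Sum the repeat counts of both alleles over all loci for one individual.

    genotype: dict mapping locus name -> (repeats_allele_1, repeats_allele_2).
    """
    return sum(a + b for a, b in genotype.values())


def binarize_by_threshold(scores, fraction=0.25):
    """Code the highest-scoring `fraction` of samples as 1 and the rest as 0.

    Assumption: "first quarter" is read as the top 25% of scores.
    Ties are broken by original sample order.
    """
    k = int(len(scores) * fraction)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(order[:k])
    return [1 if i in top else 0 for i in range(len(scores))]


# Hypothetical genotypes at two made-up loci (the study used 19 loci).
individuals = [
    {"locus_A": (15, 16), "locus_B": (6, 9)},
    {"locus_A": (14, 14), "locus_B": (7, 7)},
    {"locus_A": (17, 18), "locus_B": (9, 9)},
    {"locus_A": (15, 15), "locus_B": (8, 9)},
]
scores = [str_score(g) for g in individuals]
flags = binarize_by_threshold(scores, fraction=0.25)
print(scores, flags)
```

The same binarization would be applied separately to paternal, maternal, and child scores before comparing them, as the abstract describes for its threshold analysis.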
Affiliation(s)
- Xia Qi
- College of Medical Laboratory, Dalian Medical University, Dalian, 116044, People's Republic of China
| | - Anwar Ullah
- College of Medical Laboratory, Dalian Medical University, Dalian, 116044, People's Republic of China
| | - Weijian Yu
- College of Medical Laboratory, Dalian Medical University, Dalian, 116044, People's Republic of China
| | - Xiaojun Jin
- College of Medical Laboratory, Dalian Medical University, Dalian, 116044, People's Republic of China
| | - Hui Liu
- College of Medical Laboratory, Dalian Medical University, Dalian, 116044, People's Republic of China.
| |
|
4
|
Safaa HM, Helal M, Yasser S, Raafat Z, Ayman H, Mostafa H, Bozhilova-Sakova M, Elsayed DAA. Genome-Wide In Silico Analysis of Microsatellite Loci in Rabbits. Animals (Basel) 2024; 14:3659. [PMID: 39765563 PMCID: PMC11672705 DOI: 10.3390/ani14243659] [Received: 11/17/2024] [Revised: 12/07/2024] [Accepted: 12/17/2024] [Indexed: 01/11/2025] Open
Abstract
This study aimed to characterize microsatellites in the rabbit genome using an in silico approach and to develop and validate microsatellite markers. Blood samples were collected from 15 Baladi rabbits and 18 New Zealand White (NZW) rabbits. The GMATA software was used to define SSRs in the extracted sequences. Twelve primer pairs were used to validate the loci identified and the primers developed. The total number of microsatellite loci detected across all chromosomes was 1,136,253. Di-nucleotide microsatellite repeats dominated, exceeding 88% of the detected microsatellites in all chromosomes. No microsatellites were detected in mitochondrial DNA. The highest relative microsatellite abundance was obtained for chromosome 19, followed by 13 and 6. The highest estimated SSR density was obtained for chromosome 14, and the lowest was for mitochondrial DNA, followed by chromosome 13. The polymorphism was 81.63% and 75.51% for Baladi and NZW rabbits, respectively. The number of detected alleles ranged from two to seven per locus, and polymorphic information content ranged from 35% to 71%. The AMOVA analysis showed that the total variance of all levels of population structure was 15.734. The results confirmed higher genetic diversity in Baladi compared with NZW rabbits.
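Polymorphic information content (PIC), reported in the abstract above, is usually computed per locus from allele frequencies. The following Python sketch uses the widely cited Botstein et al. (1980) formula, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. Whether the authors used exactly this formula is an assumption, and the function name and example frequencies are hypothetical.

```python
from itertools import combinations


def polymorphic_information_content(freqs):
    """Return PIC for one locus given its allele frequencies (summing to 1)."""
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical allele frequencies for a three-allele locus.
pic = polymorphic_information_content([0.5, 0.3, 0.2])
print(round(pic, 4))
```

PIC is always at most the expected heterozygosity of the same locus, because the cross term only subtracts from 1 - sum(p_i^2).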
Affiliation(s)
- Hosam M. Safaa
- Department of Biology, College of Science, University of Bisha, P.O. Box 551, Bisha 61922, Saudi Arabia
| | - Mostafa Helal
- Department of Animal Production, Faculty of Agriculture, Cairo University, Giza 12613, Egypt
| | - Seif Yasser
- Biotechnology Program, Faculty of Agriculture, Cairo University, Giza 12613, Egypt; (S.Y.); (Z.R.); (H.A.); (H.M.)
| | - Zahra Raafat
- Biotechnology Program, Faculty of Agriculture, Cairo University, Giza 12613, Egypt; (S.Y.); (Z.R.); (H.A.); (H.M.)
| | - Habiba Ayman
- Biotechnology Program, Faculty of Agriculture, Cairo University, Giza 12613, Egypt; (S.Y.); (Z.R.); (H.A.); (H.M.)
| | - Hasnaa Mostafa
- Biotechnology Program, Faculty of Agriculture, Cairo University, Giza 12613, Egypt; (S.Y.); (Z.R.); (H.A.); (H.M.)
| | | | - Dalia A. A. Elsayed
- Department of Poultry Breeding, Animal Production Research Institute, Agriculture Research Center, Dokki, Giza 12618, Egypt;
| |
|
5
|
Haasl RJ, Payseur BA. Fitness landscapes of human microsatellites. PLoS Genet 2024; 20:e1011524. [PMID: 39775235 PMCID: PMC11734926 DOI: 10.1371/journal.pgen.1011524] [Received: 07/12/2024] [Revised: 01/15/2025] [Accepted: 12/03/2024] [Indexed: 01/11/2025] Open
Abstract
Advances in DNA sequencing technology and computation now enable genome-wide scans for natural selection to be conducted on unprecedented scales. By examining patterns of sequence variation among individuals, biologists are identifying genes and variants that affect fitness. Despite this progress, most population genetic methods for characterizing selection assume that variants mutate in a simple manner and at a low rate. Because these assumptions are violated by repetitive sequences, selection remains uncharacterized for an appreciable percentage of the genome. To meet this challenge, we focus on microsatellites, repetitive variants that mutate orders of magnitude faster than single nucleotide variants, can harbor substantial variation, and are known to influence biological function in some cases. We introduce four general models of natural selection that are each characterized by just two parameters, are easily simulated, and are specifically designed for microsatellites. Using a random forests approach to approximate Bayesian computation, we fit these models to carefully chosen microsatellites genotyped in 200 humans from a diverse collection of eight populations. Altogether, we reconstruct detailed fitness landscapes for 43 microsatellites we classify as targets of selection. Microsatellite fitness surfaces are diverse, including a range of selection strengths, contributions from dominance, and variation in the number and size of optimal alleles. Microsatellites that are subject to selection include loci known to cause trinucleotide expansion disorders and modulate gene expression, as well as intergenic loci with no obvious function. The heterogeneity in fitness landscapes we report suggests that genome-scale analyses like those used to assess selection targeting single nucleotide variants run the risk of oversimplifying the evolutionary dynamics of microsatellites. Moreover, our fitness landscapes provide a valuable visualization of the selective dynamics navigated by microsatellites.
Affiliation(s)
- Ryan J. Haasl
- Department of Biology, University of Wisconsin-Platteville, Platteville, Wisconsin, United States of America
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| | - Bret A. Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| |
|
6
|
Provatas K, Chantzi N, Patsakis M, Nayak A, Mouratidis I, Georgakopoulos-Soares I. Microsatellites explorer: A database of short tandem repeats across genomes. Comput Struct Biotechnol J 2024; 23:3817-3826. [PMID: 39525087 PMCID: PMC11550718 DOI: 10.1016/j.csbj.2024.10.041] [Received: 08/26/2024] [Revised: 10/24/2024] [Accepted: 10/24/2024] [Indexed: 11/16/2024] Open
Abstract
Short tandem repeats (STRs) are widespread repetitive elements with a number of biological functions, and they are among the most rapidly mutating regions in the genome. Their distribution varies significantly between taxonomic groups in the tree of life, and they are highly polymorphic within the human population. Advances in sequencing technologies coupled with decreasing costs have enabled the generation of an ever-growing number of complete genomes. Additionally, the arrival of accurate long reads has facilitated the generation of Telomere-to-Telomere (T2T) assemblies of complete genomes. Nevertheless, there is no comprehensive database that encompasses the STRs found per genome across different organisms and for different human genomes across diverse ancestries. Here we introduce Microsatellites Explorer, a database of STRs found in the genomes of 117,253 organisms across all major taxonomic groups, 15 T2T genome assemblies of different organisms, and 94 human haplotypes from the human pangenome. The database currently hosts 406,758,798 STR sequences, serving as a centralized user-friendly repository to perform searches, interactive visualizations, and download existing STR data for independent analysis. Microsatellites Explorer is implemented as a web-portal for browsing, analyzing and downloading STR data. Microsatellites Explorer is publicly available at https://www.microsatellitesexplorer.com.
Affiliation(s)
- Kimonas Provatas
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
| | - Nikol Chantzi
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
| | - Michail Patsakis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
| | - Akshatha Nayak
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
| | - Ioannis Mouratidis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
| | - Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
| |
|
7
|
Yilmaz F, Karageorgiou C, Kim K, Pajic P, Scheer K, Beck CR, Torregrossa AM, Lee C, Gokcumen O. Reconstruction of the human amylase locus reveals ancient duplications seeding modern-day variation. Science 2024; 386:eadn0609. [PMID: 39418342 PMCID: PMC11707797 DOI: 10.1126/science.adn0609] [Received: 11/27/2023] [Revised: 05/27/2024] [Accepted: 09/24/2024] [Indexed: 10/19/2024]
Abstract
Previous studies suggested that the copy number of the human salivary amylase gene, AMY1, correlates with starch-rich diets. However, evolutionary analyses are hampered by the absence of accurate, sequence-resolved haplotype variation maps. We identified 30 structurally distinct haplotypes at nucleotide resolution among 98 present-day humans, revealing that the coding sequences of AMY1 copies are evolving under negative selection. Genomic analyses of these haplotypes in archaic hominins and ancient human genomes suggest that a common three-copy haplotype, dating as far back as 800,000 years ago, has seeded rapidly evolving rearrangements through recurrent nonallelic homologous recombination. Additionally, haplotypes with more than three AMY1 copies have significantly increased in frequency among European farmers over the past 4000 years, potentially as an adaptive response to increased starch digestion.
Affiliation(s)
- Feyza Yilmaz
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | - Kwondo Kim
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Petar Pajic
- Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
| | - Kendra Scheer
- Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
| | | | - Christine R. Beck
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- University of Connecticut, Institute for Systems Genomics, Storrs, CT, USA
- The University of Connecticut Health Center, Farmington, CT, USA
| | - Ann-Marie Torregrossa
- Department of Psychology, University at Buffalo, Buffalo, NY, USA
- University at Buffalo Center for Ingestive Behavior Research, University at Buffalo, Buffalo, NY, USA
| | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Omer Gokcumen
- Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
| |
|
8
|
Redelings BD, Holmes I, Lunter G, Pupko T, Anisimova M. Insertions and Deletions: Computational Methods, Evolutionary Dynamics, and Biological Applications. Mol Biol Evol 2024; 41:msae177. [PMID: 39172750 PMCID: PMC11385596 DOI: 10.1093/molbev/msae177] [Received: 04/10/2024] [Revised: 07/02/2024] [Accepted: 07/09/2024] [Indexed: 08/24/2024] Open
Abstract
Insertions and deletions constitute the second most important source of natural genomic variation. Insertions and deletions make up as much as 25% of genomic variants in humans and are involved in complex evolutionary processes including genomic rearrangements, adaptation, and speciation. Recent advances in long-read sequencing technologies allow detailed inference of insertion and deletion variation in species and populations. Yet, despite their importance, evolutionary studies have traditionally ignored or mishandled insertions and deletions due to a lack of comprehensive methodologies and statistical models of insertion and deletion dynamics. Here, we discuss methods for describing insertion and deletion variation and modeling insertions and deletions over evolutionary time. We provide practical advice for tackling insertions and deletions in genomic sequences and illustrate our discussion with examples of insertion- and deletion-induced effects in human and other natural populations and their contribution to evolutionary processes. We outline promising directions for future developments in statistical methodologies that would allow researchers to analyze insertion and deletion variation and its effects in large genomic data sets and to incorporate insertions and deletions in evolutionary inference.
Affiliation(s)
| | - Ian Holmes
- Department of Bioengineering, University of California, Berkeley, CA 94720, USA
- Calico Life Sciences LLC, South San Francisco, CA 94080, USA
| | - Gerton Lunter
- Department of Epidemiology, University Medical Center Groningen, University of Groningen, Groningen 9713 GZ, The Netherlands
| | - Tal Pupko
- The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
| | - Maria Anisimova
- Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| |
|
9
|
Uguen K, Michaud JL, Génin E. Short Tandem Repeats in the era of next-generation sequencing: from historical loci to population databases. Eur J Hum Genet 2024; 32:1037-1044. [PMID: 38982300 PMCID: PMC11369099 DOI: 10.1038/s41431-024-01666-z] [Received: 02/22/2024] [Revised: 06/20/2024] [Accepted: 06/27/2024] [Indexed: 07/11/2024] Open
Abstract
In this study, we explore the landscape of short tandem repeats (STRs) within the human genome through the lens of evolving technologies to detect genomic variations. STRs, which encompass approximately 3% of our genomic DNA, are crucial for understanding human genetic diversity, disease mechanisms, and evolutionary biology. The advent of high-throughput sequencing methods has revolutionized our ability to accurately map and analyze STRs, highlighting their significance in genetic disorders, forensic science, and population genetics. We review the currently available methodologies for STR analysis, the challenges in interpreting STR variations across different populations, and the implications of STRs in medical genetics. Our findings underscore the urgent need for comprehensive STR databases that reflect the genetic diversity of global populations, facilitating the interpretation of STR data in clinical diagnostics, genetic research, and forensic applications. This work sets the stage for future studies aimed at harnessing STR variations to elucidate complex genetic traits and diseases, reinforcing the importance of integrating STRs into genetic research and clinical practice.
Affiliation(s)
- Kevin Uguen
- Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France.
- Service de Génétique Médicale et Biologie de la Reproduction, CHU de Brest, Brest, France.
- CHU Sainte-Justine Azrieli Research Centre, Montréal, QC, Canada.
| | - Jacques L Michaud
- CHU Sainte-Justine Azrieli Research Centre, Montréal, QC, Canada
- Department of Pediatrics, Université de Montréal, Montréal, QC, Canada
- Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
| | | |
|
10
|
Porubsky D, Dashnow H, Sasani TA, Logsdon GA, Hallast P, Noyes MD, Kronenberg ZN, Mokveld T, Koundinya N, Nolan C, Steely CJ, Guarracino A, Dolzhenko E, Harvey WT, Rowell WJ, Grigorev K, Nicholas TJ, Oshima KK, Lin J, Ebert P, Watkins WS, Leung TY, Hanlon VCT, McGee S, Pedersen BS, Goldberg ME, Happ HC, Jeong H, Munson KM, Hoekzema K, Chan DD, Wang Y, Knuth J, Garcia GH, Fanslow C, Lambert C, Lee C, Smith JD, Levy S, Mason CE, Garrison E, Lansdorp PM, Neklason DW, Jorde LB, Quinlan AR, Eberle MA, Eichler EE. A familial, telomere-to-telomere reference for human de novo mutation and recombination from a four-generation pedigree. bioRxiv 2024:2024.08.05.606142. [PMID: 39149261 PMCID: PMC11326147 DOI: 10.1101/2024.08.05.606142] [Indexed: 08/17/2024]
Abstract
Using five complementary short- and long-read sequencing technologies, we phased and assembled >95% of each diploid human genome in a four-generation, 28-member family (CEPH 1463) allowing us to systematically assess de novo mutations (DNMs) and recombination. From this family, we estimate an average of 192 DNMs per generation, including 75.5 de novo single-nucleotide variants (SNVs), 7.4 non-tandem repeat indels, 79.6 de novo indels or structural variants (SVs) originating from tandem repeats, 7.7 centromeric de novo SVs and SNVs, and 12.4 de novo Y chromosome events per generation. STRs and VNTRs are the most mutable with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations, documenting de novo SVs, and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length, and sequence identity. We show a strong paternal bias (75-81%) for all forms of germline DNM, yet we estimate that 17% of de novo SNVs are postzygotic in origin with no paternal bias. We place all this variation in the context of a high-resolution recombination map (~3.5 kbp breakpoint resolution). We observe a strong maternal recombination bias (1.36 maternal:paternal ratio) with a consistent reduction in the number of crossovers with increasing paternal (r=0.85) and maternal (r=0.65) age. However, we observe no correlation between meiotic crossover locations and de novo SVs, arguing against non-allelic homologous recombination as a predominant mechanism. The use of multiple orthogonal technologies, near-telomere-to-telomere phased genome assemblies, and a multi-generation family to assess transmission has created the most comprehensive, publicly available "truth set" of all classes of genomic variants. The resource can be used to test and benchmark new algorithms and technologies to understand the most fundamental processes underlying human genetic variation.
Affiliation(s)
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Harriet Dashnow
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Thomas A Sasani
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Present address: Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Michelle D Noyes
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | | | - Nidhi Koundinya
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | - Cody J Steely
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Department of Internal Medicine, University of Kentucky College of Medicine, Lexington, KY, USA
| | - Andrea Guarracino
- Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | | | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - William J Rowell
- Department of Internal Medicine, University of Kentucky College of Medicine, Lexington, KY, USA
| | - Kirill Grigorev
- Blue Marble Space Institute of Science, Seattle, WA, USA
- Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
| | - Thomas J Nicholas
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Keisuke K Oshima
- Present address: Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Jiadong Lin
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Peter Ebert
- Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
| | - W Scott Watkins
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Tiffany Y Leung
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, Canada
| | | | - Sean McGee
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Brent S Pedersen
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Michael E Goldberg
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Hannah C Happ
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Present address: Altos Labs, San Diego, CA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Daniel D Chan
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, Canada
| | - Yanni Wang
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC, Canada
| | - Jordan Knuth
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Gage H Garcia
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | | | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Joshua D Smith
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Shawn Levy
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
| | - Erik Garrison
- Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | | | - Deborah W Neklason
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Lynn B Jorde
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Aaron R Quinlan
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | | | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
11
|
Vegezzi E, Ishiura H, Bragg DC, Pellerin D, Magrinelli F, Currò R, Facchini S, Tucci A, Hardy J, Sharma N, Danzi MC, Zuchner S, Brais B, Reilly MM, Tsuji S, Houlden H, Cortese A. Neurological disorders caused by novel non-coding repeat expansions: clinical features and differential diagnosis. Lancet Neurol 2024; 23:725-739. [PMID: 38876750 DOI: 10.1016/s1474-4422(24)00167-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 04/04/2024] [Accepted: 04/09/2024] [Indexed: 06/16/2024]
Abstract
Nucleotide repeat expansions in the human genome are a well-known cause of neurological disease. In the past decade, advances in DNA sequencing technologies have led to a better understanding of the role of non-coding DNA, that is, the DNA that is not transcribed into proteins. These techniques have also enabled the identification of pathogenic non-coding repeat expansions that cause neurological disorders. Mounting evidence shows that adult patients with familial or sporadic presentations of epilepsy, cognitive dysfunction, myopathy, neuropathy, ataxia, or movement disorders can be carriers of non-coding repeat expansions. The description of the clinical, epidemiological, and molecular features of these recently identified non-coding repeat expansion disorders should guide clinicians in the diagnosis and management of these patients, and help in the genetic counselling for patients and their families.
Affiliation(s)
| | - Hiroyuki Ishiura
- Department of Neurology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - D Cristopher Bragg
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - David Pellerin
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK; Department of Neurology and Neurosurgery, Montreal Neurological Hospital and Institute, McGill University, Montreal, QC, Canada
| | - Francesca Magrinelli
- Department of Clinical and Movement Neurosciences, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK
| | - Riccardo Currò
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK; Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
| | - Stefano Facchini
- IRCCS Mondino Foundation, Pavia, Italy; Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK
| | - Arianna Tucci
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK; William Harvey Research Institute, Queen Mary University of London, London, UK
| | - John Hardy
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK
| | - Nutan Sharma
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Matt C Danzi
- Department of Human Genetics and Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Stephan Zuchner
- Department of Human Genetics and Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Bernard Brais
- Department of Neurology and Neurosurgery, Montreal Neurological Hospital and Institute, McGill University, Montreal, QC, Canada
| | - Mary M Reilly
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK
| | - Shoji Tsuji
- Department of Neurology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Institute of Medical Genomics, International University of Health and Welfare, Chiba, Japan
| | - Henry Houlden
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK
| | - Andrea Cortese
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, UK; Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy.
|
12
|
Rajan-Babu IS, Dolzhenko E, Eberle MA, Friedman JM. Sequence composition changes in short tandem repeats: heterogeneity, detection, mechanisms and clinical implications. Nat Rev Genet 2024; 25:476-499. [PMID: 38467784 DOI: 10.1038/s41576-024-00696-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/19/2024] [Indexed: 03/13/2024]
Abstract
Short tandem repeats (STRs) are a class of repetitive elements, composed of tandem arrays of 1-6 base pair sequence motifs, that comprise a substantial fraction of the human genome. STR expansions can cause a wide range of neurological and neuromuscular conditions, known as repeat expansion disorders, whose age of onset, severity, penetrance and/or clinical phenotype are influenced by the length of the repeats and their sequence composition. The presence of non-canonical motifs, depending on the type, frequency and position within the repeat tract, can alter clinical outcomes by modifying somatic and intergenerational repeat stability, gene expression and mutant transcript-mediated and/or protein-mediated toxicities. Here, we review the diverse structural conformations of repeat expansions, technological advances for the characterization of changes in sequence composition, their clinical correlations and the impact on disease mechanisms.
Affiliation(s)
- Indhu-Shree Rajan-Babu
- Department of Medical Genetics, The University of British Columbia, and Children's & Women's Hospital, Vancouver, British Columbia, Canada.
| | | | | | - Jan M Friedman
- Department of Medical Genetics, The University of British Columbia, and Children's & Women's Hospital, Vancouver, British Columbia, Canada
- BC Children's Hospital Research Institute, Vancouver, British Columbia, Canada
|
13
|
Liang Y, Hao J, Wang J, Zhang G, Su Y, Liu Z, Wang T. Statistical Genomics Analysis of Simple Sequence Repeats from the Paphiopedilum Malipoense Transcriptome Reveals Control Knob Motifs Modulating Gene Expression. Adv Sci (Weinh) 2024; 11:e2304848. [PMID: 38647414 PMCID: PMC11200097 DOI: 10.1002/advs.202304848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 02/26/2024] [Indexed: 04/25/2024]
Abstract
Simple sequence repeats (SSRs) are found in nonrandom distributions in genomes and are thought to impact gene expression. The distribution patterns of 48 295 SSRs of Paphiopedilum malipoense are mined and characterized based on the first full-length transcriptome and comprehensive transcriptome dataset from 12 organs. Statistical genomics analyses are used to investigate how SSRs in transcripts affect gene expression. The results demonstrate the correlations between SSR distributions, characteristics, and expression level. Nine expression-modulating motifs (expMotifs) are identified and a model is proposed to explain the effect of their key features, potency, and gene function on an intra-transcribed region scale. The expMotif-transcribed region combination is the most predominant contributor to the expression-modulating effect of SSRs, and some intra-transcribed regions are critical for this effect. Genes containing the same type of expMotif-SSR elements in the same transcribed region are likely linked in function, regulation, or evolution aspects. This study offers novel evidence to understand how SSRs regulate gene expression and provides potential regulatory elements for plant genetic engineering.
Affiliation(s)
- Yingyi Liang
- College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
| | - Jing Hao
- College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
| | - Jieyu Wang
- College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou 510642, China
| | - Guoqiang Zhang
- Key Laboratory of National Forestry and Grassland Administration for Orchid Conservation and Utilization at College of Landscape Architecture and Art, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Yingjuan Su
- School of Life Sciences, Sun Yat‐sen University, Guangzhou 510275, China
- Research Institute of Sun Yat‐sen University in Shenzhen, Shenzhen 518107, China
| | - Zhong‐Jian Liu
- Key Laboratory of National Forestry and Grassland Administration for Orchid Conservation and Utilization at College of Landscape Architecture and Art, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Ting Wang
- College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
|
14
|
Avellaneda LL, Johnson DT, Gutierrez R, Thompson L, Sage KA, Sturm SA, Houston RM, LaRue BL. Development of a novel five-dye panel for human identification insertion/deletion (INDEL) polymorphisms. J Forensic Sci 2024; 69:814-824. [PMID: 38291825 DOI: 10.1111/1556-4029.15475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 01/15/2024] [Accepted: 01/18/2024] [Indexed: 02/01/2024]
Abstract
DNA analysis of forensic case samples relies on short tandem repeats (STRs), a key component of the combined DNA index system (CODIS) used to identify individuals. However, limitations arise when dealing with challenging samples, prompting the exploration of alternative markers such as single nucleotide polymorphisms (SNPs) and insertion/deletion polymorphisms (INDELs). Unlike SNPs, INDELs can be differentiated easily by size, making them compatible with electrophoresis methods. It is possible to design small INDEL amplicons (<200 bp) to enhance recovery from degraded samples. To this end, a set of INDEL Human Identification Markers (HID) was curated from the 1000 Genomes Project, employing criteria including a fixation index (FST) ≤ 0.06, minor allele frequency (MAF) > 0.2, and high allele frequency divergence. A panel of 33 INDEL-HIDs was optimized and validated following the Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines, utilizing a five-dye multiplex electrophoresis system. A small sample set (n = 79 unrelated individuals) was genotyped to assess the assay's performance. The validation studies exhibited reproducibility, inhibition tolerance, ability to detect a two-person mixture from a 4:1 to 1:6 ratio, robustness with challenging samples, and sensitivity down to 125 pg of DNA. In summary, the 33-loci INDEL-HID panel exhibited robust recovery with low-template and degraded samples and proved effective for individualization within a small sample set.
Affiliation(s)
- Lucio L Avellaneda
- Department of Forensic Science, Sam Houston State University, Huntsville, Texas, USA
| | - Damani T Johnson
- Department of Forensic Science, Sam Houston State University, Huntsville, Texas, USA
| | - Ryan Gutierrez
- Department of Forensic Science, Sam Houston State University, Huntsville, Texas, USA
| | - Lindsey Thompson
- Institute of Applied Genetics, Department of Molecular and Medical Genetics, University of North Texas Health Science Center, Fort Worth, Texas, USA
| | - Kelly A Sage
- Institute of Applied Genetics, Department of Molecular and Medical Genetics, University of North Texas Health Science Center, Fort Worth, Texas, USA
| | - Sarah A Sturm
- Institute of Applied Genetics, Department of Molecular and Medical Genetics, University of North Texas Health Science Center, Fort Worth, Texas, USA
| | - Rachel M Houston
- Department of Forensic Science, Sam Houston State University, Huntsville, Texas, USA
| | - Bobby L LaRue
- Department of Forensic Science, Sam Houston State University, Huntsville, Texas, USA
- Institute of Applied Genetics, Department of Molecular and Medical Genetics, University of North Texas Health Science Center, Fort Worth, Texas, USA
|
15
|
Liu P, Wilson P, Redquest B, Keobouasone S, Manseau M. Seq2Sat and SatAnalyzer toolkit: Towards comprehensive microsatellite genotyping from sequencing data. Mol Ecol Resour 2024; 24:e13929. [PMID: 38289068 DOI: 10.1111/1755-0998.13929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 12/22/2023] [Accepted: 01/09/2024] [Indexed: 03/06/2024]
Abstract
Accurate and efficient microsatellite loci genotyping is an essential process in population genetics that is also used in various demographic analyses. Protocols for next-generation sequencing of microsatellite loci enable high-throughput and cross-compatible allele scoring, common issues that are not addressed by conventional capillary-based approaches. To improve this process, we have developed an all-in-one software, called Seq2Sat (sequence to microsatellite), in C++ to support automated microsatellite genotyping. It directly takes raw reads of microsatellite amplicons and conducts read quality control before inferring genotypes based on depth-of-read, read ratio, sequence composition and length. We have also developed a module for sex identification based on sex chromosome-specific locus amplicons. To allow for greater user access and complement autoscoring, we developed SatAnalyzer (microsatellite analyzer), a user-friendly web-based platform that conducts reads-to-report analyses by calling Seq2Sat for genotype autoscoring and produces interactive genotype graphs for manual editing. SatAnalyzer also allows users to troubleshoot multiplex optimization by analysing read quality and distribution across loci and samples in support of high-quality library preparation. To evaluate its performance, we benchmarked our toolkit Seq2Sat/SatAnalyzer against a conventional capillary gel method and existing microsatellite genotyping software, MEGASAT, using two datasets. Results showed that SatAnalyzer can achieve >99.70% genotyping accuracy and Seq2Sat is ~5 times faster than MEGASAT despite many more informative tables and figures being generated. Seq2Sat and SatAnalyzer are freely available on github (https://github.com/ecogenomicscanada/Seq2Sat) and dockerhub (https://hub.docker.com/r/rocpengliu/satanalyzer).
Affiliation(s)
- Peng Liu
- Science and Technology, Environment and Climate Change Canada, Ottawa, Ontario, Canada
| | - Paul Wilson
- Biology Department, Trent University, Peterborough, Ontario, Canada
| | | | - Sonesinh Keobouasone
- Science and Technology, Environment and Climate Change Canada, Ottawa, Ontario, Canada
| | - Micheline Manseau
- Science and Technology, Environment and Climate Change Canada, Ottawa, Ontario, Canada
|
16
|
McComish BJ, Charleston MA, Parks M, Baroni C, Salvatore MC, Li R, Zhang G, Millar CD, Holland BR, Lambert DM. Ancient and Modern Genomes Reveal Microsatellites Maintain a Dynamic Equilibrium Through Deep Time. Genome Biol Evol 2024; 16:evae017. [PMID: 38412309 PMCID: PMC10972684 DOI: 10.1093/gbe/evae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 12/22/2023] [Accepted: 01/23/2024] [Indexed: 02/29/2024] Open
Abstract
Microsatellites are widely used in population genetics, but their evolutionary dynamics remain poorly understood. It is unclear whether microsatellite loci drift in length over time. This is important because the mutation processes that underlie these important genetic markers are central to the evolutionary models that employ microsatellites. We identify more than 27 million microsatellites using a novel and unique dataset of modern and ancient Adélie penguin genomes along with data from 63 published chordate genomes. We investigate microsatellite evolutionary dynamics over 2 timescales: one based on Adélie penguin samples dating to ∼46.5 ka and the other dating to the diversification of chordates aged more than 500 Ma. We show that the process of microsatellite allele length evolution is at dynamic equilibrium; while there is length polymorphism among individuals, the length distribution for a given locus remains stable. Many microsatellites persist over very long timescales, particularly in exons and regulatory sequences. These often retain length variability, suggesting that they may play a role in maintaining phenotypic variation within populations.
Affiliation(s)
- Bennet J McComish
- School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
- Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS 7001, Australia
| | | | - Matthew Parks
- Australian Research Centre for Human Evolution, Griffith University, Nathan, QLD 4111, Australia
- Department of Biology, University of Central Oklahoma, Edmond, OK 73034, USA
| | - Carlo Baroni
- Dipartimento di Scienze della Terra, University of Pisa, Pisa, Italy
- CNR-IGG, Institute of Geosciences and Earth Resources, Pisa, Italy
| | - Maria Cristina Salvatore
- Dipartimento di Scienze della Terra, University of Pisa, Pisa, Italy
- CNR-IGG, Institute of Geosciences and Earth Resources, Pisa, Italy
| | - Ruiqiang Li
- Novogene Bioinformatics Technology Co. Ltd., Beijing 100083, China
| | - Guojie Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
- Department of Biology, Centre for Social Evolution, University of Copenhagen, Copenhagen DK-2100, Denmark
| | - Craig D Millar
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Barbara R Holland
- School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
| | - David M Lambert
- Australian Research Centre for Human Evolution, Griffith University, Nathan, QLD 4111, Australia
|
17
|
Verbiest MA, Lundström O, Xia F, Baudis M, Bilgin Sonay T, Anisimova M. Short tandem repeat mutations regulate gene expression in colorectal cancer. Sci Rep 2024; 14:3331. [PMID: 38336885 PMCID: PMC10858039 DOI: 10.1038/s41598-024-53739-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 02/04/2024] [Indexed: 02/12/2024] Open
Abstract
Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression data to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. Future extensions of these findings could uncover new STR-based targets in the treatment of cancer.
Affiliation(s)
- Max A Verbiest
- Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland.
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland.
- Swiss Institute of Bioinformatics, Lausanne, Switzerland.
| | - Oxana Lundström
- Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
| | - Feifei Xia
- Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Michael Baudis
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Tugce Bilgin Sonay
- Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Ecology, Evolution and Environmental Biology, Columbia University, New York, USA
| | - Maria Anisimova
- Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland.
- Swiss Institute of Bioinformatics, Lausanne, Switzerland.
|
18
|
Hong EP, Ramos EM, Aziz NA, Massey TH, McAllister B, Lobanov S, Jones L, Holmans P, Kwak S, Orth M, Ciosi M, Lomeikaite V, Monckton DG, Long JD, Lucente D, Wheeler VC, Gillis T, MacDonald ME, Sequeiros J, Gusella JF, Lee JM. Modification of Huntington's disease by short tandem repeats. Brain Commun 2024; 6:fcae016. [PMID: 38449714 PMCID: PMC10917446 DOI: 10.1093/braincomms/fcae016] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/26/2023] [Revised: 12/20/2023] [Accepted: 01/22/2024] [Indexed: 03/08/2024] Open
Abstract
Expansions of glutamine-coding CAG trinucleotide repeats cause a number of neurodegenerative diseases, including Huntington's disease and several spinocerebellar ataxias. In general, age-at-onset of the polyglutamine diseases is inversely correlated with the size of the respective inherited expanded CAG repeat. Expanded CAG repeats are also somatically unstable in certain tissues, and age-at-onset of Huntington's disease corrected for individual HTT CAG repeat length (i.e. residual age-at-onset) is modified by repeat instability-related DNA maintenance/repair genes, as demonstrated by recent genome-wide association studies. Modification of one polyglutamine disease (e.g. Huntington's disease) by the repeat length of another (e.g. ATXN3, CAG expansions in which cause spinocerebellar ataxia 3) has also been hypothesized. Consequently, we determined whether age-at-onset in Huntington's disease is modified by the CAG repeats of other polyglutamine disease genes. We found that the measured CAG repeat sizes of other polyglutamine disease genes were polymorphic in Huntington's disease participants but did not influence Huntington's disease age-at-onset. Additional analysis focusing specifically on ATXN3 in a larger sample set (n = 1388) confirmed the lack of association between Huntington's disease residual age-at-onset and ATXN3 CAG repeat length. Additionally, neither our Huntington's disease onset modifier genome-wide association studies single nucleotide polymorphism data nor imputed short tandem repeat data supported the involvement of other polyglutamine disease genes in modifying Huntington's disease. By contrast, our genome-wide association studies based on imputed short tandem repeats revealed significant modification signals for other genomic regions. Together, our short tandem repeat genome-wide association studies show that modification of Huntington's disease is associated with short tandem repeats that do not involve other polyglutamine disease-causing genes, refining the landscape of Huntington's disease modification and highlighting the importance of rigorous data analysis, especially in genetic studies testing candidate modifiers.
Affiliation(s)
- Eun Pyo Hong
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
| | - Eliana Marisa Ramos
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
| | - N Ahmad Aziz
- Population & Clinical Neuroepidemiology, German Center for Neurodegenerative Diseases, 53127 Bonn, Germany
- Department of Neurology, Faculty of Medicine, University of Bonn, Bonn D-53113, Germany
| | - Thomas H Massey
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Branduff McAllister
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Sergey Lobanov
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Lesley Jones
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Peter Holmans
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
| | - Seung Kwak
- Molecular System Biology, CHDI Foundation, Princeton, NJ 08540, USA
| | - Michael Orth
- University Hospital of Old Age Psychiatry and Psychotherapy, Bern University, CH-3000 Bern 60, Switzerland
| | - Marc Ciosi
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Vilija Lomeikaite
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Darren G Monckton
- School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
| | - Jeffrey D Long
- Department of Psychiatry, Carver College of Medicine and Department of Biostatistics, College of Public Health, University of Iowa, Iowa City, IA 52242, USA
| | - Diane Lucente
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Vanessa C Wheeler
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
| | - Tammy Gillis
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Marcy E MacDonald
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
| | - Jorge Sequeiros
- UnIGENe, IBMC—Institute for Molecular and Cell Biology, i3S—Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto 420-135, Portugal
- ICBAS School of Medicine and Biomedical Sciences, University of Porto, Porto 420-135, Portugal
| | - James F Gusella
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Jong-Min Lee
- Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Medical and Population Genetics Program, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
|
19
|
Liu X, Chen W, Huang B, Wang X, Peng Y, Zhang X, Chai W, Khan MZ, Wang C. Advancements in copy number variation screening in herbivorous livestock genomes and their association with phenotypic traits. Front Vet Sci 2024; 10:1334434. [PMID: 38274664 PMCID: PMC10808162 DOI: 10.3389/fvets.2023.1334434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 12/27/2023] [Indexed: 01/27/2024] Open
Abstract
Copy number variations (CNVs) have garnered increasing attention within the realm of genetics due to their prevalence in human, animal, and plant genomes. These structural genetic variations have demonstrated associations with a broad spectrum of phenotypic diversity, economic traits, environmental adaptations, epidemics, and other essential aspects of both plants and animals. Furthermore, CNVs exhibit extensive sequence variability and encompass a wide array of genomes. The advancement and maturity of microarray and sequencing technologies have catalyzed a surge in research endeavors pertaining to CNVs. This is particularly prominent in the context of livestock breeding, where molecular markers have gained prominence as a valuable tool in comparison to traditional breeding methods. In light of these developments, a contemporary and comprehensive review of existing studies on CNVs becomes imperative. This review serves the purpose of providing a brief elucidation of the fundamental concepts underlying CNVs, their mutational mechanisms, and the diverse array of detection methods employed to identify these structural variations within genomes. Furthermore, it seeks to systematically analyze the recent advancements and findings within the field of CNV research, specifically within the genomes of herbivorous livestock species, including cattle, sheep, horses, and donkeys. The review also highlights the role of CNVs in shaping various phenotypic traits in herbivorous livestock, including growth traits, reproductive traits, pigmentation, and disease resistance. The main goal of this review is to furnish readers with an up-to-date compilation of knowledge regarding CNVs in herbivorous livestock genomes. By integrating the latest research findings and insights, it is anticipated that this review will not only offer pertinent information but also stimulate future investigations into the realm of CNVs in livestock. In doing so, it endeavors to contribute to the enhancement of breeding strategies, genomic selection, and the overall improvement of herbivorous livestock production and resistance to diseases.
Affiliation(s)
- Muhammad Zahoor Khan
- Liaocheng Research Institute of Donkey High-Efficiency Breeding, Liaocheng University, Liaocheng, China
- Changfa Wang
- Liaocheng Research Institute of Donkey High-Efficiency Breeding, Liaocheng University, Liaocheng, China
20
Hannan AJ. Expanding horizons of tandem repeats in biology and medicine: Why 'genomic dark matter' matters. Emerg Top Life Sci 2023; 7:ETLS20230075. [PMID: 38088823 PMCID: PMC10754335 DOI: 10.1042/etls20230075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 12/30/2023]
Abstract
Approximately half of the human genome includes repetitive sequences, and these DNA sequences (as well as their transcribed repetitive RNA and translated amino-acid repeat sequences) are known as the repeatome. Within this repeatome there are a couple of million tandem repeats, dispersed throughout the genome. These tandem repeats have been estimated to constitute ∼8% of the entire human genome. These tandem repeats can be located throughout exons, introns and intergenic regions, thus potentially affecting the structure and function of tandemly repetitive DNA, RNA and protein sequences. Over more than three decades, more than 60 monogenic human disorders have been found to be caused by tandem-repeat mutations. These monogenic tandem-repeat disorders include Huntington's disease, a variety of ataxias, amyotrophic lateral sclerosis and frontotemporal dementia, as well as many other neurodegenerative diseases. Furthermore, tandem-repeat disorders can include fragile X syndrome, related fragile X disorders, as well as other neurological and psychiatric disorders. However, these monogenic tandem-repeat disorders, which were discovered via their dominant or recessive modes of inheritance, may represent the 'tip of the iceberg' with respect to tandem-repeat contributions to human disorders. A previous proposal that tandem repeats may contribute to the 'missing heritability' of various common polygenic human disorders has recently been supported by a variety of new evidence. This includes genome-wide studies that associate tandem-repeat mutations with autism, schizophrenia, Parkinson's disease and various types of cancers. In this article, I will discuss how tandem-repeat mutations and polymorphisms could contribute to a wide range of common disorders, along with some of the many major challenges of tandem-repeat biology and medicine. 
Finally, I will discuss the potential of tandem repeats to be therapeutically targeted, so as to prevent and treat an expanding range of human disorders.
Affiliation(s)
- Anthony J Hannan
- Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, Victoria 3010, Australia
- Department of Anatomy and Physiology, University of Melbourne, Parkville, Victoria 3010, Australia
21
Lundström OS, Adriaan Verbiest M, Xia F, Jam HZ, Zlobec I, Anisimova M, Gymrek M. WebSTR: A Population-wide Database of Short Tandem Repeat Variation in Humans. J Mol Biol 2023; 435:168260. [PMID: 37678708 DOI: 10.1016/j.jmb.2023.168260] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/27/2023] [Revised: 08/29/2023] [Accepted: 08/29/2023] [Indexed: 09/09/2023]
Abstract
Short tandem repeats (STRs) are consecutive repetitions of one to six nucleotide motifs. They are hypervariable due to the high prevalence of repeat unit insertions or deletions primarily caused by polymerase slippage during replication. Genetic variation at STRs has been shown to influence a range of traits in humans, including gene expression, cancer risk, and autism. Until recently, STRs have been poorly studied since they pose significant challenges to bioinformatics analyses. Moreover, genome-wide analysis of STR variation in population-scale cohorts requires large amounts of data and computational resources. However, the recent advent of genome-wide analysis tools has resulted in multiple large genome-wide datasets of STR variation spanning nearly two million genomic loci in thousands of individuals from diverse populations. Here we present WebSTR, a database of genetic variation and other characteristics of genome-wide STRs across human populations. WebSTR is based on reference panels of more than 1.7 million human STRs created with state-of-the-art repeat annotation methods and can easily be extended to include additional cohorts or species. It currently contains data based on STR genotypes for individuals from the 1000 Genomes Project, H3Africa, the Genotype-Tissue Expression (GTEx) Project and colorectal cancer patients from the TCGA dataset. WebSTR is implemented as a relational database with programmatic access available through an API and a web portal for browsing data. The web portal is publicly available at https://webstr.ucsd.edu.
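As a rough illustration of the STR definition in this abstract (consecutive repetitions of a one- to six-nucleotide motif), the sketch below scans a DNA string for perfect tandem repeats with a regular expression. It is not the annotation method used for WebSTR; the function name `find_strs`, the `min_repeats` threshold, and the example sequence are all assumptions made for illustration, and real repeat annotation tools also handle imperfect repeats, which this does not.

```python
import re


def find_strs(seq, min_repeats=3, max_motif=6):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Hypothetical helper: motif lengths run from 1 to max_motif bases, and a
    run is reported only if the motif occurs at least min_repeats times in a row.
    """
    hits = []
    for k in range(1, max_motif + 1):
        # A k-base motif followed by at least (min_repeats - 1) more copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (e.g. "CACA" is just "CA" twice), so each run is reported once.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Assumed toy sequence: five copies of "CA" and three copies of "GAT".
example = "GG" + "CA" * 5 + "CC" + "GAT" * 3 + "C"
print(find_strs(example))
```

Each reported tuple gives the 0-based start position, the repeat motif, and the number of copies; the copy number is the quantity that varies between individuals at an STR locus.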
Affiliation(s)
- Oxana Sachenkova Lundström
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden; Vildly AB, Kalmar, Sweden; Institute of Computational Life Sciences, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Waedenswil, Switzerland. https://twitter.com/merenlin
- Max Adriaan Verbiest
- Institute of Computational Life Sciences, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Waedenswil, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland; Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Feifei Xia
- Institute of Computational Life Sciences, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Waedenswil, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland; Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland. https://twitter.com/Feifeix97
- Helyaneh Ziaei Jam
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Inti Zlobec
- Institute of Tissue Medicine and Pathology, University of Bern, Switzerland
- Maria Anisimova
- Institute of Computational Life Sciences, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Waedenswil, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
- Melissa Gymrek
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA; Department of Medicine, University of California San Diego, La Jolla, CA, USA.
22
Kosugi S, Kamatani Y, Harada K, Tomizuka K, Momozawa Y, Morisaki T, Terao C. Detection of trait-associated structural variations using short-read sequencing. Cell Genomics 2023; 3:100328. [PMID: 37388916 PMCID: PMC10300613 DOI: 10.1016/j.xgen.2023.100328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Revised: 02/17/2023] [Accepted: 04/25/2023] [Indexed: 07/01/2023]
Abstract
Genomic structural variation (SV) affects genetic and phenotypic characteristics in diverse organisms, but the lack of reliable methods to detect SV has hindered genetic analysis. We developed a computational algorithm (MOPline) that includes missing call recovery combined with high-confidence SV call selection and genotyping using short-read whole-genome sequencing (WGS) data. Using 3,672 high-coverage WGS datasets, MOPline stably detected ∼16,000 SVs per individual, which is ∼1.7- to 3.3-fold more than in previous large-scale projects, while exhibiting a comparable level of statistical quality metrics. We imputed SVs from 181,622 Japanese individuals for 42 diseases and 60 quantitative traits. A genome-wide association study with the imputed SVs revealed 41 top-ranked or nearly top-ranked genome-wide significant SVs, including 8 exonic SVs with 5 novel associations and enriched mobile element insertions. This study demonstrates that short-read WGS data can be used to identify rare and common SVs associated with a variety of traits.
Affiliation(s)
- Shunichi Kosugi
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Yoichiro Kamatani
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa-shi, Chiba 277-8562, Japan
- Katsutoshi Harada
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Kohei Tomizuka
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Yukihide Momozawa
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan
- Takayuki Morisaki
- Division of Molecular Pathology, Institute of Medical Science, The University of Tokyo, 4-6-1, Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan
- Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan