1
Loh CA, Shields DA, Schwing A, Evrony GD. High-fidelity, large-scale targeted profiling of microsatellites. Genome Res 2024; 34:1008-1026. [PMID: 39013593 PMCID: PMC11368184 DOI: 10.1101/gr.278785.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 07/11/2024] [Indexed: 07/18/2024]
Abstract
Microsatellites are highly mutable sequences that can serve as markers for relationships among individuals or cells within a population. The accuracy and resolution of reconstructing these relationships depend on the fidelity of microsatellite profiling and the number of microsatellites profiled. However, current methods for targeted profiling of microsatellites incur significant "stutter" artifacts that interfere with accurate genotyping, and sequencing costs preclude whole-genome microsatellite profiling of a large number of samples. We developed a novel method for accurate and cost-effective targeted profiling of a panel of more than 150,000 microsatellites per sample, along with a computational tool for designing large-scale microsatellite panels. Our method addresses the greatest challenge for microsatellite profiling, "stutter" artifacts, with a low-temperature hybridization capture that significantly reduces these artifacts. We also developed a computational tool for accurate genotyping of the resulting microsatellite sequencing data that uses an ensemble approach integrating three microsatellite genotyping tools, which we optimize by analysis of de novo microsatellite mutations in human trios. Altogether, our suite of experimental and computational tools enables high-fidelity, large-scale profiling of microsatellites, which may find utility in diverse applications such as lineage tracing, population genetics, ecology, and forensics.
Affiliation(s)
- Caitlin A Loh
  - Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
  - Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Danielle A Shields
  - Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
  - Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Adam Schwing
  - Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
  - Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Gilad D Evrony
  - Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
  - Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
2
Tanudisastro HA, Deveson IW, Dashnow H, MacArthur DG. Sequencing and characterizing short tandem repeats in the human genome. Nat Rev Genet 2024; 25:460-475. [PMID: 38366034 DOI: 10.1038/s41576-024-00692-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2023] [Indexed: 02/18/2024]
Abstract
Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6-bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, in particular mapping short reads to repetitive regions and inferring expanded repeat lengths. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.
Affiliation(s)
- Hope A Tanudisastro
  - Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  - Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  - Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
  - Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Ira W Deveson
  - Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
  - Genomics and Inherited Disease Program, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Harriet Dashnow
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Daniel G MacArthur
  - Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  - Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  - Faculty of Medicine and Health, University of New South Wales, Sydney, New South Wales, Australia
3
Fazzari V, Moo-Choy A, Panoyan MA, Abbatangelo CL, Polimanti R, Novroski NM, Wendt FR. Multi-ancestry tandem repeat association study of hair colour using exome-wide sequencing. bioRxiv 2024:2024.02.24.581865. [PMID: 38464141 PMCID: PMC10925195 DOI: 10.1101/2024.02.24.581865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Hair colour variation is influenced by hundreds of positions across the human genome but this genetic contribution has only been narrowly explored. Genome-wide association studies identified single nucleotide polymorphisms (SNPs) influencing hair colour but the biology underlying these associations is challenging to interpret. We report 16 tandem repeats (TRs) with effects on different models of hair colour plus two TRs associated with hair colour in diverse ancestry groups. Several of these TRs expand or contract amino acid coding regions of their localized protein such that structure, and by extension function, may be altered. We also demonstrate that independent of SNP variation, these TRs can be used to create an additive polygenic score that predicts darker hair colour. This work adds to the growing body of evidence regarding TR influence on human traits with relatively large and independent effects relative to surrounding SNP variation.
4
Birnbaum R. Rediscovering tandem repeat variation in schizophrenia: challenges and opportunities. Transl Psychiatry 2023; 13:402. [PMID: 38123544 PMCID: PMC10733427 DOI: 10.1038/s41398-023-02689-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 11/23/2023] [Accepted: 11/27/2023] [Indexed: 12/23/2023]
Abstract
Tandem repeats (TRs) are prevalent throughout the genome, constituting at least 3% of the genome, and often highly polymorphic. The high mutation rate of TRs, which can be orders of magnitude higher than single-nucleotide polymorphisms and indels, indicates that they are likely to make significant contributions to phenotypic variation, yet their contribution to schizophrenia has been largely ignored by recent genome-wide association studies (GWAS). Tandem repeat expansions are already known causative factors for over 50 disorders, while common tandem repeat variation is increasingly being identified as significantly associated with complex disease and gene regulation. The current review summarizes key background concepts of tandem repeat variation as pertains to disease risk, elucidating their potential for schizophrenia association. An overview of next-generation sequencing-based methods that may be applied for TR genome-wide identification is provided, and some key methodological challenges in TR analyses are delineated.
Affiliation(s)
- Rebecca Birnbaum
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
5
Grether A, Ivanovski I, Russo M, Begemann A, Steindl K, Abela L, Papik M, Zweier M, Oneda B, Joset P, Rauch A. The current benefit of genome sequencing compared to exome sequencing in patients with developmental or epileptic encephalopathies. Mol Genet Genomic Med 2023; 11:e2148. [PMID: 36785910 PMCID: PMC10178799 DOI: 10.1002/mgg3.2148] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/26/2022] [Revised: 01/16/2023] [Accepted: 01/25/2023] [Indexed: 02/15/2023]
Abstract
BACKGROUND As the technology of next generation sequencing rapidly develops and costs are constantly reduced, the clinical availability of whole genome sequencing (WGS) increases. It remains unclear, however, exactly what advantage WGS offers in comparison to whole exome sequencing (WES) for the diagnosis of genetic diseases using current technologies. METHODS Trio-WGS was conducted for 20 patients with developmental or epileptic encephalopathies who remained undiagnosed after WES and chromosomal microarray analysis. RESULTS A diagnosis was reached for four patients (20%). However, retrospectively all pathogenic variants could have been detected in a WES analysis conducted with today's methods and knowledge. CONCLUSION The additional diagnostic yield of WGS versus WES is currently largely explained by new scientific insights and the general technological progress. Nevertheless, it is noteworthy that whole genome sequencing has greater potential for the analysis of small copy number and copy number neutral variants not seen with WES as well as variants in noncoding regions, especially as potentially more knowledge of the function of noncoding regions arises. We, therefore, conclude that even though today the added value of WGS versus WES seems to be limited, it may increase substantially in the future.
Affiliation(s)
- Anna Grether
  - Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
- Ivan Ivanovski
  - Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
- Martina Russo
  - Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
- Anaïs Begemann
  - Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
- Lucia Abela
  - Division of Child Neurology, University Children's Hospital Zurich, Zurich, Switzerland
- Michael Papik
  - Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
- Markus Zweier
  - Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
- Beatrice Oneda
  - Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
- Pascal Joset
  - Medical Genetics, Institute of Medical Genetics and Pathology, University Hospital Basel, Basel, Switzerland
- Anita Rauch
  - Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
  - University Children's Hospital Zurich, Zurich, Switzerland
  - University of Zurich Clinical Research Priority Program (CRPP) Praeclare – Personalized prenatal and reproductive medicine, Zurich, Switzerland
  - University of Zurich Research Priority Program (URPP) AdaBD: Adaptive Brain Circuits in Development and Learning, Zurich, Switzerland
  - University of Zurich Research Priority Program (URPP) ITINERARE: Innovative Therapies in Rare Diseases, Zurich, Switzerland
6
Microsatellite Genome-Wide Database Development for the Commercial Blackhead Seabream (Acanthopagrus schlegelii). Genes (Basel) 2023; 14:genes14030620. [PMID: 36980892 PMCID: PMC10048070 DOI: 10.3390/genes14030620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 02/26/2023] [Accepted: 02/27/2023] [Indexed: 03/05/2023]
Abstract
Simple sequence repeats (SSRs), the markers with the highest polymorphism and co-dominance degrees, offer a crucial genetic research resource. Limited SSR markers in blackhead seabream have been reported. The availability of the blackhead seabream genome assembly provided the opportunity to carry out genome-wide identification for all microsatellite markers, and bioinformatic analyses open the way for developing a microsatellite genome-wide database in blackhead seabream. In this study, a total of 412,381 SSRs were identified in the 688.08 Mb genome by Krait software. Whole-genome sequences (10×) of 42 samples were aligned against the reference genome and genotyped using the HipSTR tools by comparing and counting repeat number variation across the SSR loci. A total of 156,086 SSRs with a 2–4 bp repeat were genotyped by HipSTR tools, which accounted for 55.78% of the 2–4 bp SSRs in the reference genome. High accuracy of genotyping was observed by comparing HipSTR tools and PCR amplification. A set of 109,131 loci with a number of alleles ≥ 3 and with a number of genotyped individuals ≥ 6 were reserved to constitute the polymorphic SSR database. Fifty-one polymorphic SSR loci were identified through PCR amplification. This strategy to develop polymorphic SSR markers not only obtained a large set of polymorphic SSRs but also eliminated the need for laborious experimental screening. SSR markers developed in this study may facilitate blackhead seabream research, which lays a certain foundation for further gene tagging and genetic linkage analysis, such as marker-assisted selection, genetic mapping, as well as comparative genomic analysis.
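The polymorphism filter quoted in this abstract (at least 3 alleles and at least 6 genotyped individuals per locus) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the genotype representation, the function names `is_polymorphic` and `filter_loci`, and the toy data are all assumptions, and the code does not read HipSTR output.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: one diploid genotype per individual, given as a
# pair of repeat counts, or None when the locus could not be genotyped.
Genotype = Optional[Tuple[int, int]]


def is_polymorphic(genotypes: List[Genotype],
                   min_alleles: int = 3,
                   min_genotyped: int = 6) -> bool:
    """Return True if a locus meets both thresholds quoted in the abstract."""
    called = [g for g in genotypes if g is not None]
    # Distinct repeat counts observed across all called genotypes.
    alleles = {allele for g in called for allele in g}
    return len(alleles) >= min_alleles and len(called) >= min_genotyped


def filter_loci(loci: Dict[str, List[Genotype]]) -> List[str]:
    """Return the sorted names of loci that pass the filter."""
    return sorted(name for name, gts in loci.items() if is_polymorphic(gts))


# Toy data (invented for illustration only).
example_loci = {
    "locus_a": [(10, 11), (10, 12), (11, 11), (12, 12), (10, 10), (11, 12)],
    "locus_b": [(10, 10)] * 8,                 # only one allele
    "locus_c": [(10, 11), (12, 13), None, None],  # too few genotyped samples
}
print(filter_loci(example_loci))
```

Only `locus_a` passes in this toy set: `locus_b` is monomorphic and `locus_c` has just two genotyped individuals.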
7
Wang H, Wang LS, Schellenberg G, Lee WP. The role of structural variations in Alzheimer's disease and other neurodegenerative diseases. Front Aging Neurosci 2023; 14:1073905. [PMID: 36846102 PMCID: PMC9944073 DOI: 10.3389/fnagi.2022.1073905] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/19/2022] [Accepted: 12/31/2022] [Indexed: 02/10/2023]
Abstract
Dozens of single nucleotide polymorphisms (SNPs) related to Alzheimer's disease (AD) have been discovered by large-scale genome-wide association studies (GWASs). However, only a small portion of the genetic component of AD can be explained by SNPs observed from GWAS. Structural variation (SV) can be a major contributor to the missing heritability of AD; however, SV in AD remains largely unexplored because accurate detection of SVs with the widely used array-based and short-read technologies is still far from perfect. Here, we briefly summarize the strengths and weaknesses of available SV detection methods. We review the current landscape of SV analysis in AD and the SVs that have been found to be associated with AD. In particular, we highlight the importance of currently less explored SVs, including insertions, inversions, short tandem repeats, and transposable elements, in neurodegenerative diseases.
Affiliation(s)
- Hui Wang
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Li-San Wang
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Gerard Schellenberg
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Wan-Ping Lee
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
8
Wendt FR, Pathak GA, Polimanti R. Phenome-wide association study of loci harboring de novo tandem repeat mutations in UK Biobank exomes. Nat Commun 2022; 13:7682. [PMID: 36509785 PMCID: PMC9744822 DOI: 10.1038/s41467-022-35423-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/04/2022] [Accepted: 12/02/2022] [Indexed: 12/15/2022]
Abstract
When present in coding regions, tandem repeats (TRs) may have large effects on protein structure and function contributing to health and disease. We use a family-based design to identify de novo TRs and assess their impact at the population level in 148,607 European ancestry participants from the UK Biobank. The 427 loci with de novo TR mutations are enriched for targets of microRNA-184 (21.1-fold, P = 4.30 × 10⁻⁵, FDR = 9.50 × 10⁻³). There are 123 TR-phenotype associations with posterior probabilities > 0.95. These relate to body structure, cognition, and cardiovascular, metabolic, psychiatric, and respiratory outcomes. We report several loci with large likely causal effects on tissue microstructure, including the FAN1-[TG]N and carotid intima-media thickness (mean thickness: beta = 5.22, P = 1.22 × 10⁻⁶, FDR = 0.004; maximum thickness: beta = 6.44, P = 1.12 × 10⁻⁶, FDR = 0.004). Two exonic repeats FNBP4-[GGT]N and BTN2A1-[CCT]N alter protein structure. In this work, we contribute clear and testable hypotheses of dose-dependent TR implications linking genetic variation and protein structure with health and disease outcomes.
Affiliation(s)
- Frank R Wendt
  - Department of Anthropology, University of Toronto, Mississauga, ON, Canada
  - Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
  - Forensic Science Program, University of Toronto, Mississauga, ON, Canada
  - Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
  - VA CT Healthcare System, West Haven, CT, USA
- Gita A Pathak
  - Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
  - VA CT Healthcare System, West Haven, CT, USA
- Renato Polimanti
  - Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
  - VA CT Healthcare System, West Haven, CT, USA
9
Frontanilla TS, Valle-Silva G, Ayala J, Mendes-Junior CT. Open-Access Worldwide Population STR Database Constructed Using High-Coverage Massively Parallel Sequencing Data Obtained from the 1000 Genomes Project. Genes (Basel) 2022; 13:genes13122205. [PMID: 36553472 PMCID: PMC9778533 DOI: 10.3390/genes13122205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2022] [Revised: 11/13/2022] [Accepted: 11/21/2022] [Indexed: 11/27/2022]
Abstract
Achieving accurate STR genotyping by using next-generation sequencing data has been challenging. To provide the forensic genetics community with a reliable open-access STR database, we conducted a comprehensive genotyping analysis of a set of STRs of broad forensic interest obtained from 1000 Genome populations. We analyzed 22 STR markers using files of the high-coverage dataset of Phase 3 of the 1000 Genomes Project. We used HipSTR to call genotypes from 2504 samples obtained from 26 populations. We were not able to detect the D21S11 marker. The Hardy-Weinberg equilibrium analysis coupled with a comprehensive analysis of allele frequencies revealed that HipSTR was not able to identify longer alleles, which resulted in heterozygote deficiency. Nevertheless, AMOVA, a clustering analysis that uses STRUCTURE, and a Principal Coordinates Analysis showed a clear-cut separation between the four major ancestries sampled by the 1000 Genomes Consortium. Except for larger Penta D and Penta E alleles, and two very small Penta D alleles (2.2 and 3.2) usually observed in African populations, our analyses revealed that allele frequencies and genotypes offered as an open-access database are consistent and reliable.
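The link this abstract draws between missed long alleles and heterozygote deficiency can be illustrated with a small Hardy-Weinberg calculation. The sketch below is an assumption-laden toy, not the authors' analysis: it compares observed heterozygosity with the expectation 1 − Σpᵢ² and reports F = 1 − Ho/He, and the genotype lists are invented.

```python
from collections import Counter
from typing import List, Tuple

# Alleles are floats so that partial-repeat alleles such as 2.2 can be stored.
Genotype = Tuple[float, float]


def heterozygosity(genotypes: List[Genotype]) -> Tuple[float, float]:
    """Return (observed, expected) heterozygosity for one locus."""
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    # Hardy-Weinberg expectation: 1 minus the sum of squared allele frequencies.
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


def inbreeding_coefficient(genotypes: List[Genotype]) -> float:
    """F = 1 - Ho/He; positive values indicate a heterozygote deficiency."""
    observed, expected = heterozygosity(genotypes)
    if expected == 0:
        return 0.0  # monomorphic locus: F is undefined, reported here as 0
    return 1.0 - observed / expected


# Invented example: the same four samples, before and after the long allele
# (15) is missed in heterozygotes, which then look like 9/9 homozygotes.
true_calls = [(9, 9), (9, 15), (9, 15), (15, 15)]
dropout_calls = [(9, 9), (9, 9), (9, 9), (15, 15)]
print(inbreeding_coefficient(true_calls), inbreeding_coefficient(dropout_calls))
```

In the toy data, dropping the long allele from heterozygotes moves F from 0 to 1, which is the kind of signal a Hardy-Weinberg test would flag.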
Affiliation(s)
- Tamara Soledad Frontanilla
  - Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto 14049-900, SP, Brazil
- Guilherme Valle-Silva
  - Departamento de Química, Laboratório de Pesquisas Forenses e Genômicas, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto 14040-901, SP, Brazil
- Jesus Ayala
  - Facultad de Ingeniería Informática, Universidad de la Integración de las Americas, Asunción 00120-6, Paraguay
- Celso Teixeira Mendes-Junior
  - Departamento de Química, Laboratório de Pesquisas Forenses e Genômicas, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto 14040-901, SP, Brazil
10
Annear DJ, Vandeweyer G, Sanchis-Juan A, Raymond FL, Kooy RF. Non-Mendelian inheritance patterns and extreme deviation rates of CGG repeats in autism. Genome Res 2022; 32:1967-1980. [PMID: 36351771 PMCID: PMC9808627 DOI: 10.1101/gr.277011.122] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/09/2022] [Accepted: 10/14/2022] [Indexed: 11/10/2022]
Abstract
As expansions of CGG short tandem repeats (STRs) are established as the genetic etiology of many neurodevelopmental disorders, we aimed to elucidate the inheritance patterns and role of CGG STRs in autism-spectrum disorder (ASD). By genotyping 6063 CGG STR loci in a large cohort of trios and quads with an ASD-affected proband, we determined an unprecedented rate of CGG repeat length deviation across a single generation. Although the concept of repeat length being linked to deviation rate was solidified, we show how shorter STRs display greater degrees of size variation. We observed that CGG STRs did not segregate by Mendelian principles but with a bias against longer repeats, which appeared to magnify as repeat length increased. Through logistic regression, we identified 19 genes that displayed significantly higher rates and degrees of CGG STR expansion within the ASD-affected probands (P < 1 × 10⁻⁵). This study not only highlights novel repeat expansions that may play a role in ASD but also reinforces the hypothesis that CGG STRs are specifically linked to human cognition.
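A repeat-length deviation across one generation is, at its simplest, a child genotype that cannot be assembled from one maternal and one paternal allele. The sketch below shows that basic trio check under stated assumptions: autosomal loci, genotypes given as pairs of repeat counts, and a hypothetical function name. It is not the method used in the study, and a flagged locus may reflect a genotyping error rather than a true mutation.

```python
from typing import Tuple

# Hypothetical representation: a diploid genotype as a pair of repeat counts.
Genotype = Tuple[int, int]


def is_mendelian_consistent(child: Genotype,
                            mother: Genotype,
                            father: Genotype) -> bool:
    """True if the child can carry one maternal and one paternal allele."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


# Invented trios for illustration.
print(is_mendelian_consistent((10, 12), (10, 11), (12, 12)))  # consistent
print(is_mendelian_consistent((10, 13), (10, 11), (12, 12)))  # 13 is in neither parent
```

Loci that fail this check are candidates for de novo length changes and would normally be followed up with read-level evidence.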
Affiliation(s)
- Dale J. Annear
  - Department of Medical Genetics, University of Antwerp, 2600 Antwerp, Belgium
- Geert Vandeweyer
  - Department of Medical Genetics, University of Antwerp, 2600 Antwerp, Belgium
- Alba Sanchis-Juan
  - NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, United Kingdom
  - Department of Haematology, University of Cambridge, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, United Kingdom
- F. Lucy Raymond
  - NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, United Kingdom
  - Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, United Kingdom
- R. Frank Kooy
  - Department of Medical Genetics, University of Antwerp, 2600 Antwerp, Belgium
11
An Introductory Overview of Open-Source and Commercial Software Options for the Analysis of Forensic Sequencing Data. Genes (Basel) 2021; 12:genes12111739. [PMID: 34828345 PMCID: PMC8618049 DOI: 10.3390/genes12111739] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/15/2021] [Revised: 10/27/2021] [Accepted: 10/27/2021] [Indexed: 12/30/2022]
Abstract
The top challenges of adopting new methods for forensic DNA analysis in routine laboratories are often the capital investment and the expertise required to implement and validate such methods locally. In the case of next-generation sequencing, several specifically forensic commercial options became available in the last decade, offering reliable and validated solutions. Despite this, the readily available expertise to analyze, interpret and understand such data is still perceived to be lagging behind. This review gives an introductory overview for forensic scientists who are at the beginning of their journey with implementing next-generation sequencing locally and who, because most in the field do not have a bioinformatics background, may find it difficult to navigate the new terms and analysis options available. The currently available open-source and commercial software for forensic sequencing data analysis are summarized here to provide an accessible starting point for those fairly new to the forensic application of massively parallel sequencing.
12
Rajan-Babu IS, Peng JJ, Chiu R, Li C, Mohajeri A, Dolzhenko E, Eberle MA, Birol I, Friedman JM. Genome-wide sequencing as a first-tier screening test for short tandem repeat expansions. Genome Med 2021; 13:126. [PMID: 34372915 PMCID: PMC8351082 DOI: 10.1186/s13073-021-00932-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 09/03/2020] [Accepted: 07/05/2021] [Indexed: 02/01/2023]
Abstract
Background: Screening for short tandem repeat (STR) expansions in next-generation sequencing data can enable diagnosis, optimal clinical management/treatment, and accurate genetic counseling of patients with repeat expansion disorders. We aimed to develop an efficient computational workflow for reliable detection of STR expansions in next-generation sequencing data and demonstrate its clinical utility. Methods: We characterized the performance of eight STR analysis methods (lobSTR, HipSTR, RepeatSeq, ExpansionHunter, TREDPARSE, GangSTR, STRetch, and exSTRa) on next-generation sequencing datasets of samples with known disease-causing full-mutation STR expansions and genomes simulated to harbor repeat expansions at selected loci and optimized their sensitivity. We then used a machine learning decision tree classifier to identify an optimal combination of methods for full-mutation detection. In Burrows-Wheeler Aligner (BWA)-aligned genomes, the ensemble approach of using ExpansionHunter, STRetch, and exSTRa performed the best (precision = 82%, recall = 100%, F1-score = 90%). We applied this pipeline to screen 301 families of children with suspected genetic disorders. Results: We identified 10 individuals with full-mutations in the AR, ATXN1, ATXN8, DMPK, FXN, or HTT disease STR locus in the analyzed families. Additional candidates identified in our analysis include two probands with borderline ATXN2 expansions between the established repeat size range for reduced-penetrance and full-penetrance full-mutation and seven individuals with FMR1 CGG repeats in the intermediate/premutation repeat size range. In 67 probands with a prior negative clinical PCR test for the FMR1, FXN, or DMPK disease STR locus, or the spinocerebellar ataxia disease STR panel, our pipeline did not falsely identify aberrant expansion. We performed clinical PCR tests on seven (out of 10) full-mutation samples identified by our pipeline and confirmed the expansion status in all, showing absolute concordance between our bioinformatics and molecular findings. Conclusions: We have successfully demonstrated the application of a well-optimized bioinformatics pipeline that promotes the utility of genome-wide sequencing as a first-tier screening test to detect expansions of known disease STRs. Interrogating clinical next-generation sequencing data for pathogenic STR expansions using our ensemble pipeline can improve diagnostic yield and enhance clinical outcomes for patients with repeat expansion disorders. Supplementary Information: The online version contains supplementary material available at 10.1186/s13073-021-00932-9.
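The reported F1-score follows from the quoted precision and recall through the standard harmonic-mean formula, F1 = 2PR / (P + R). A minimal check, with the function name `f1_score` chosen here for illustration:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract: precision = 82%, recall = 100%.
print(round(f1_score(0.82, 1.00), 2))  # prints 0.9
```

With P = 0.82 and R = 1.00 this gives 1.64 / 1.82 ≈ 0.901, matching the stated F1-score of 90%.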
Affiliation(s)
- Indhu-Shree Rajan-Babu
  - Department of Medical Genetics, University of British Columbia and Children's & Women's Hospital, Vancouver, BC, V6H3N1, Canada
  - Department of Medical and Molecular Genetics, King's College London, Strand, London, WC2R 2LS, UK
- Junran J Peng
  - Department of Medical Genetics, University of British Columbia and Children's & Women's Hospital, Vancouver, BC, V6H3N1, Canada
- Readman Chiu
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z4S6, Canada
- Chenkai Li
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z4S6, Canada
  - Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, V6T1Z4, Canada
- Arezoo Mohajeri
  - Department of Medical Genetics, University of British Columbia and Children's & Women's Hospital, Vancouver, BC, V6H3N1, Canada
- Inanc Birol
  - Department of Medical Genetics, University of British Columbia and Children's & Women's Hospital, Vancouver, BC, V6H3N1, Canada
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z4S6, Canada
- Jan M Friedman
  - Department of Medical Genetics, University of British Columbia and Children's & Women's Hospital, Vancouver, BC, V6H3N1, Canada
13
Novel KCNQ4 variants in different functional domains confer genotype- and mechanism-based therapeutics in patients with nonsyndromic hearing loss. Exp Mol Med 2021; 53:1192-1204. [PMID: 34316018 PMCID: PMC8333092 DOI: 10.1038/s12276-021-00653-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 02/21/2021] [Revised: 05/13/2021] [Accepted: 06/09/2021] [Indexed: 02/07/2023]
Abstract
Loss-of-function variants in the gene encoding the KCNQ4 potassium channel cause autosomal dominant nonsyndromic hearing loss (DFNA2), and no effective pharmacotherapeutics have been developed to reverse channel activity impairment. Phosphatidylinositol 4,5-bisphosphate (PIP2), an obligatory phospholipid for maintaining KCNQ channel activity, confers differential pharmacological sensitivity of channels to KCNQ openers. Through whole-exome sequencing of DFNA2 families, we identified three novel KCNQ4 variants related to diverse auditory phenotypes in the proximal C-terminus (p.Arg331Gln), the C-terminus of the S6 segment (p.Gly319Asp), and the pore region (p.Ala271_Asp272del). Potassium currents in HEK293T cells expressing each KCNQ4 variant were recorded by patch-clamp, and functional recovery by PIP2 expression or KCNQ openers was examined. In the homomeric expression setting, the three novel KCNQ4 mutant proteins lost conductance and were unresponsive to KCNQ openers or PIP2 expression. Loss of p.Arg331Gln conductance was slightly restored by a tandem concatemer channel (WT-p.R331Q), and increased PIP2 expression further increased the concatemer current to the level of the WT channel. Strikingly, an impaired homomeric p.Gly319Asp channel exhibited hyperactivity when expressed as a concatemer (WT-p.G319D), with a negative shift in the voltage dependence of activation. Correspondingly, a KCNQ inhibitor and chelation of PIP2 effectively downregulated the hyperactive WT-p.G319D concatemer channel. Conversely, the pore-region variant (p.Ala271_Asp272del) was nonrescuable under any condition. Collectively, these novel KCNQ4 variants may constitute therapeutic targets that can be manipulated by the PIP2 level and KCNQ-regulating drugs under the physiological context of heterozygous expression. Our research contributes to the establishment of a genotype/mechanism-based therapeutic portfolio for DFNA2.
|
14
|
Chintalaphani SR, Pineda SS, Deveson IW, Kumar KR. An update on the neurological short tandem repeat expansion disorders and the emergence of long-read sequencing diagnostics. Acta Neuropathol Commun 2021; 9:98. [PMID: 34034831 PMCID: PMC8145836 DOI: 10.1186/s40478-021-01201-x] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 03/31/2021] [Accepted: 05/17/2021] [Indexed: 12/14/2022]
Abstract
BACKGROUND: Short tandem repeat (STR) expansion disorders are an important cause of human neurological disease. They have an established role in more than 40 different phenotypes including the myotonic dystrophies, Fragile X syndrome, Huntington's disease, the hereditary cerebellar ataxias, amyotrophic lateral sclerosis and frontotemporal dementia. MAIN BODY: STR expansions are difficult to detect and may explain unsolved diseases, as highlighted by recent findings including the discovery of a biallelic intronic 'AAGGG' repeat in RFC1 as the cause of cerebellar ataxia, neuropathy, and vestibular areflexia syndrome (CANVAS), and the finding of 'CGG' repeat expansions in NOTCH2NLC as the cause of neuronal intranuclear inclusion disease and a range of clinical phenotypes. However, established laboratory techniques for diagnosis of repeat expansions (repeat-primed PCR and Southern blot) are cumbersome, low-throughput and poorly suited to parallel analysis of multiple gene regions. While next generation sequencing (NGS) has been increasingly used, established short-read NGS platforms (e.g., Illumina) are unable to genotype large and/or complex repeat expansions. Long-read sequencing platforms recently developed by Oxford Nanopore Technologies and Pacific Biosciences promise to overcome these limitations to deliver enhanced diagnosis of repeat expansion disorders in a rapid and cost-effective fashion. CONCLUSION: We anticipate that long-read sequencing will rapidly transform the detection of short tandem repeat expansion disorders for both clinical diagnosis and gene discovery.
Affiliation(s)
- Sanjog R. Chintalaphani
- School of Medicine, University of New South Wales, Sydney, 2052 Australia
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010 Australia
- Sandy S. Pineda
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010 Australia
- Brain and Mind Centre, University of Sydney, Camperdown, NSW 2050 Australia
- Ira W. Deveson
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010 Australia
- Faculty of Medicine, St Vincent’s Clinical School, University of New South Wales, Sydney, NSW 2010 Australia
- Kishore R. Kumar
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010 Australia
- Molecular Medicine Laboratory and Neurology Department, Central Clinical School, Concord Repatriation General Hospital, University of Sydney, Concord, NSW 2137 Australia
|
15
|
van der Sanden BPGH, Corominas J, de Groot M, Pennings M, Meijer RPP, Verbeek N, van de Warrenburg B, Schouten M, Yntema HG, Vissers LELM, Kamsteeg EJ, Gilissen C. Systematic analysis of short tandem repeats in 38,095 exomes provides an additional diagnostic yield. Genet Med 2021; 23:1569-1573. [PMID: 33846582 DOI: 10.1038/s41436-021-01174-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 10/23/2020] [Revised: 03/29/2021] [Accepted: 03/30/2021] [Indexed: 12/31/2022]
Abstract
PURPOSE: Expansions of a subset of short tandem repeats (STRs) have been implicated in approximately 30 different human genetic disorders. Despite extensive application of exome sequencing (ES) in routine diagnostic genetic testing, STRs are not routinely identified from these data. METHODS: We assessed the diagnostic utility of STR analysis in exome sequencing by applying ExpansionHunter to 2,867 exomes from movement disorder patients and 35,228 other clinical exomes. RESULTS: We identified 38 movement disorder patients with a possible aberrant STR length. Validation by polymerase chain reaction (PCR) and/or repeat-primed PCR technologies confirmed the presence of aberrant expansion alleles for 13 (34%). For seven of these patients the genotype was compatible with the phenotypic description, resulting in a molecular diagnosis. We subsequently tested the remainder of our diagnostic ES cohort, including over 30 clinically and genetically heterogeneous disorders. Optimized manual curation yielded 167 samples with a likely aberrant STR length. Validations confirmed 93/167 (56%) aberrant expansion alleles, of which 48 were in the pathogenic range and 45 in the premutation range. CONCLUSION: Our work provides guidance for the implementation of STR analysis in clinical ES. Our results show that systematic STR evaluation may increase diagnostic ES yield by 0.2%, and we recommend making STR evaluation a routine part of ES interpretation in genetic testing laboratories.
Affiliation(s)
- Bart P G H van der Sanden
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Jordi Corominas
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Michelle de Groot
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Maartje Pennings
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Rowdy P P Meijer
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Nienke Verbeek
- Department of Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
- Bart van de Warrenburg
- Department of Neurology, Donders Institute for Brain, Cognition and Behaviour, Radboud university medical center, Nijmegen, The Netherlands
- Meyke Schouten
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Helger G Yntema
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Lisenka E L M Vissers
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Erik-Jan Kamsteeg
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Christian Gilissen
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands
|
16
|
Lee SY, Han JH, Carandang M, Kim MY, Kim B, Yi N, Kim J, Kim BJ, Oh DY, Koo JW, Lee JH, Oh SH, Choi BY. Novel genotype-phenotype correlation of functionally characterized LMX1A variants linked to sensorineural hearing loss. Hum Mutat 2020; 41:1877-1883. [PMID: 32840933 DOI: 10.1002/humu.24095] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 06/16/2020] [Revised: 08/10/2020] [Accepted: 08/19/2020] [Indexed: 12/15/2022]
Abstract
LMX1A, encoding the LIM homeobox transcription factor, is essential for inner ear development. Despite previous reports of three human LMX1A variants with nonsyndromic hearing loss (NSHL) in the literature, functional characterization of these variants has never been performed. Encouraged by the identification, through exome sequencing, of a de novo, heterozygous, missense variant (c.595A>G; p.Arg199Gly) located in the homeodomain of LMX1A in a subject with congenital severe-to-profound deafness, we performed a luciferase assay to evaluate the transcriptional activity of all LMX1A variants reported in the literature, including p.Arg199Gly. As a result, p.Arg199Gly, which manifests the most severe NSHL, showed the greatest reduction in transcriptional activity, in contrast with the moderately reduced activity of p.Cys97Ser and p.Val241Leu, which are associated with less severe progressive NSHL, suggesting a genotype-phenotype correlation. Further, our dominant LMX1A variant exerted pathogenic effects via haploinsufficiency rather than a dominant-negative effect. Collectively, we provide a potential genotype-phenotype correlation of LMX1A variants as well as the pathogenic mechanism of LMX1A-related NSHL.
Affiliation(s)
- Sang-Yeon Lee
- Department of Otorhinolaryngology-Head and Neck Surgery, Bundang Hospital, College of Medicine, Seoul National University, Seongnam, Korea
- Jin Hee Han
- Department of Otorhinolaryngology-Head and Neck Surgery, Bundang Hospital, College of Medicine, Seoul National University, Seongnam, Korea
- Marge Carandang
- Department of Otorhinolaryngology-Head and Neck Surgery, East Avenue Medical Center, Metro Manila, Philippines
- Min Young Kim
- Department of Otorhinolaryngology-Head and Neck Surgery, Bundang Hospital, College of Medicine, Seoul National University, Seongnam, Korea
- Bonggi Kim
- Department of Otorhinolaryngology-Head and Neck Surgery, Bundang Hospital, College of Medicine, Seoul National University, Seongnam, Korea
- Nayoung Yi
- Department of Otorhinolaryngology-Head and Neck Surgery, Bundang Hospital, College of Medicine, Seoul National University, Seongnam, Korea
- Jinho Kim
- Clinical Precision Medicine Center, Future Innovation Research Division, Seoul National University Bundang Hospital, Seongnam, Korea
- Bong Jik Kim
- Department of Otolaryngology-Head and Neck Surgery, College of Medicine, Chungnam National University, Daejeon, Korea
- Doo-Yi Oh
- Department of Otorhinolaryngology-Head and Neck Surgery, Bundang Hospital, College of Medicine, Seoul National University, Seongnam, Korea
- Ja-Won Koo
- Department of Otorhinolaryngology-Head and Neck Surgery, Bundang Hospital, College of Medicine, Seoul National University, Seongnam, Korea
- Jun Ho Lee
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Seung-Ha Oh
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Byung Yoon Choi
- Department of Otorhinolaryngology-Head and Neck Surgery, Bundang Hospital, College of Medicine, Seoul National University, Seongnam, Korea
|