1
Xu IR, Danzi MC, Raposo J, Züchner S. The continued promise of genomic technologies and software in neurogenetics. J Neuromuscul Dis 2025:22143602251325345. [PMID: 40208247 DOI: 10.1177/22143602251325345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2025]
Abstract
The continued evolution of genomic technologies over the past few decades has revolutionized the field of neurogenetics, offering profound insights into the genetic underpinnings of neurological disorders. Identification of causal genes for numerous monogenic neurological conditions has informed key aspects of disease mechanisms and facilitated research into critical proteins and molecular pathways, laying the groundwork for therapeutic interventions. However, the question remains: has this transformative trend reached its zenith? In this review, we suggest that despite significant strides in genome sequencing and advanced computational analyses, there is still ample room for methodological refinement. We anticipate further major genetic breakthroughs corresponding with the increased use of long-read genomes, variant calling software, AI tools, and data aggregation databases. Genetic progress has historically been driven by technological advancements from the commercial sector, which are developed in response to academic research needs, creating a continuous cycle of innovation and discovery. This review explores the potential of genomic technologies to address the challenges of neurogenetic disorders. By outlining both established and modern resources, we aim to emphasize the importance of genetic technologies as we enter an era poised for discoveries.
Affiliation(s)
- Isaac Rl Xu
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Matt C Danzi
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Jacquelyn Raposo
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Stephan Züchner
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
2
Tao R, Ma J, Qian J, Liu Y, Zhang W, Lavelle D, Wang X, Yan W, Michelmore RW, Chen J, Kuang H. Differential methylation of a retrotransposon upstream of a MYB gene causes variegation of lettuce leaves, which is abolished by the presence of an (AT)5 repeat in the promoter. The Plant Journal: For Cell and Molecular Biology 2025; 122:e70123. [PMID: 40162932 DOI: 10.1111/tpj.70123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2025] [Revised: 03/12/2025] [Accepted: 03/17/2025] [Indexed: 04/02/2025]
Abstract
Variegation, a common phenomenon in plants, can be the result of several genetic, developmental, and physiological factors. Leaves of some lettuce cultivars exhibit dramatic red variegation; however, the genetic mechanisms underlying this variegation remain unknown. In this study, we cloned the causal gene for variegation on lettuce leaves and elucidated the underlying molecular mechanisms. Genetic analysis revealed that the polymorphism of variegated versus uniformly red leaves is caused by an "AT" repeat in the promoter of the RLL2A gene encoding a MYB transcription factor. Complementation tests demonstrated that the RLL2A allele (RLL2AV) with (AT)n repeat numbers other than five led to variegated leaves. RLL2AV was expressed in the red spots but not in neighboring green regions. This expression pattern was in concert with a relatively low level of methylation in a retrotransposon inserted in -761 bp of the gene in the red spots compared to high methylation of the retrotransposon in the green region. The presence of (AT)5 in the promoter region, however, stabilized the expression of RLL2A, resulting in uniformly red leaves. In summary, we identified a novel promoter mechanism controlling variegation through inconsistent levels of methylation and showed that the presence of a simple sequence repeat of specific size could stabilize gene expression.
Affiliation(s)
- Rong Tao
- National Key Laboratory for Germplasm Innovation and Utilization of Horticultural Crops, Hubei Hongshan Laboratory, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Jiaojiao Ma
- National Key Laboratory for Germplasm Innovation and Utilization of Horticultural Crops, Hubei Hongshan Laboratory, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Jinlong Qian
- National Key Laboratory for Germplasm Innovation and Utilization of Horticultural Crops, Hubei Hongshan Laboratory, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Yali Liu
- National Key Laboratory for Germplasm Innovation and Utilization of Horticultural Crops, Hubei Hongshan Laboratory, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Weiyi Zhang
- National Key Laboratory for Germplasm Innovation and Utilization of Horticultural Crops, Hubei Hongshan Laboratory, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Dean Lavelle
- Genome Center and Department of Plant Sciences, University of California, Davis, Davis, California, 95616, USA
- Xin Wang
- National Key Laboratory for Germplasm Innovation and Utilization of Horticultural Crops, Hubei Hongshan Laboratory, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Wenhao Yan
- College of Plant Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Richard W Michelmore
- Genome Center and Department of Plant Sciences, University of California, Davis, Davis, California, 95616, USA
- Jiongjiong Chen
- National Key Laboratory for Germplasm Innovation and Utilization of Horticultural Crops, Hubei Hongshan Laboratory, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Hanhui Kuang
- National Key Laboratory for Germplasm Innovation and Utilization of Horticultural Crops, Hubei Hongshan Laboratory, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
3
Kawahara R, Morishita S. Approximating edit distances between complex tandem repeats efficiently. Bioinformatics 2025; 41:btaf155. [PMID: 40203069 PMCID: PMC12014093 DOI: 10.1093/bioinformatics/btaf155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 01/24/2025] [Accepted: 04/07/2025] [Indexed: 04/11/2025] Open
Abstract
MOTIVATION: Extended tandem repeats (TRs) have been associated with 60 or more diseases over the past 30 years. Although most TRs have single repeat units (or motifs), complex TRs with different units have recently been correlated with some brain disorders. Of note, a population-scale analysis shows that complex TRs at one locus can be divergent, and different units are often expanded between individuals. To understand the evolution of high TR diversity, it is informative to visualize a phylogenetic tree. To do this, we need to measure the edit distance between pairs of complex TRs by considering duplication and contraction of units created by replication slippage. However, traditional rigorous algorithms for this purpose are computationally expensive. RESULTS: We here propose an efficient heuristic algorithm to estimate the edit distance with duplication and contraction of units (EDDC, for short). We select a set of frequent units that occur in given complex TRs, encode each unit as a single symbol, compress a TR into an optimal series of unit symbols that partially matches the original TR with the minimum Levenshtein distance, and estimate the EDDC between a pair of complex TRs from their compressed forms. Using substantial synthetic benchmark datasets, we demonstrate that the estimated EDDC is highly correlated with the accurate EDDC, with a Pearson correlation coefficient of >0.983, while the heuristic algorithm achieves orders of magnitude performance speedup. AVAILABILITY AND IMPLEMENTATION: The software program hEDDC that implements the proposed algorithm is available at https://github.com/Ricky-pon/hEDDC (DOI: 10.5281/zenodo.14732958).
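To make the idea of comparing TRs in a compressed, unit-symbol form more concrete, here is a minimal Python sketch. It is not the hEDDC algorithm: it uses a greedy left-to-right encoding (an assumption; the abstract describes an optimal compression) and the plain Levenshtein distance, with no duplication or contraction operations. The function names, the unit set, and the symbols `U1` and `U2` are all hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def encode_units(tr, units):
    """Greedily replace known repeat units with single symbols.

    `units` maps a unit string (e.g. "CAG") to a hypothetical symbol.
    Bases that match no unit are kept as single-character symbols.
    """
    ordered = sorted(units, key=len, reverse=True)  # try longer units first
    symbols = []
    i = 0
    while i < len(tr):
        for u in ordered:
            if tr.startswith(u, i):
                symbols.append(units[u])
                i += len(u)
                break
        else:
            symbols.append(tr[i])
            i += 1
    return symbols


# Illustrative unit set and sequences (not taken from the paper).
units = {"CAG": "U1", "CAA": "U2"}
tr_a = encode_units("CAGCAGCAGCAA", units)
tr_b = encode_units("CAGCAGCAACAA", units)
distance = levenshtein(tr_a, tr_b)
```

Working on unit symbols rather than bases shortens the sequences that the quadratic-time dynamic program has to compare, which is one intuition for why a compressed representation can be faster.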
Affiliation(s)
- Riki Kawahara
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
- Shinichi Morishita
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
4
Hou Q, Ji W, An K, Tan Y, Liu P, Su J. Genomic microsatellite characterization and development of polymorphic microsatellites in Eospalax baileyi. Sci Rep 2025; 15:524. [PMID: 39747356 PMCID: PMC11696105 DOI: 10.1038/s41598-024-84631-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2024] [Accepted: 12/25/2024] [Indexed: 01/04/2025] Open
Abstract
Microsatellite markers are cost-effective, rapid, and efficient, and show great advantages in large-sample kinship analysis and population structure studies. However, microsatellite loci are seriously underdeveloped in non-model organisms. The plateau zokor (Eospalax baileyi) is a key species living underground in the Tibetan Plateau, the effective management of which has long been challenging. In this study, we analyzed the distribution characteristics and functions of microsatellites in the genome of plateau zokors, and their polymorphic sites. The mononucleotide and dinucleotide types were the most abundant in the genome. The number and abundance of microsatellites were highest in the intergenic region and lowest in the coding region. The coding sequences containing microsatellites were annotated to 52 major functional genes and assigned 19,358 Gene Ontology entries. In the Kyoto Encyclopedia of Genes and Genomes analysis, the signal transduction pathway was the most enriched. Thirteen pairs of polymorphic loci were successfully amplified, with the number of alleles ranging from 3 to 8, observed heterozygosity ranging from 0.059 to 0.810, and expected heterozygosity ranging from 0.469 to 0.854. These microsatellite markers provide a cornerstone for studies on the identification of parentage and population genetics of plateau zokors.
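For readers unfamiliar with the heterozygosity statistics quoted above, the following Python sketch shows how they are commonly computed at a single locus: observed heterozygosity as the fraction of heterozygous individuals, and expected heterozygosity as one minus the sum of squared allele frequencies. The abstract does not state which estimator or software the authors used, so this is an assumption, and the genotypes are made up for illustration.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    `genotypes` is a list of (allele_1, allele_2) pairs, one per individual.
    Expected heterozygosity uses the simple 1 - sum(p_i^2) form, without
    the small-sample correction some programs apply.
    """
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n  # two allele copies per diploid individual
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical genotypes (allele sizes in repeat units) for four individuals.
sample = [(10, 10), (10, 12), (12, 12), (10, 12)]
ho, he = heterozygosity(sample)
```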
Affiliation(s)
- Qiqi Hou
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
- Weihong Ji
- Faculty of Science, University of Auckland, Auckland, New Zealand
- Kang An
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
- Yuchen Tan
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
- Penghui Liu
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
- Junhu Su
- Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Agricultural University-Massey University Research Centre for Grassland Biodiversity, Gansu Agricultural University, Lanzhou, 730070, China
- Gansu Qilianshan Grassland Ecosystem Observation and Research Station, Wuwei, 733200, China
5
Watson JL, Cho K, Grisedale K, Ward J, McNevin D. Characterisation of identity-informative genetic markers in the Australian population with European ancestry. Forensic Sci Int Genet 2025; 74:103169. [PMID: 39476449 DOI: 10.1016/j.fsigen.2024.103169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2024] [Revised: 10/10/2024] [Accepted: 10/27/2024] [Indexed: 12/29/2024]
Abstract
Identity-informative single nucleotide polymorphisms (iiSNPs) are valuable genetic markers for human identification and kinship testing in forensic casework, especially when the quality and quantity of DNA evidence is not suitable for routine short tandem repeat (STR) profiling. This study analysed 105 buccal samples representing the Australian population with European ancestry in order to assign allele frequencies and conduct population genetic analyses for 94 iiSNPs and 20 STRs. The markers were assessed by calculating relevant forensic statistics and testing for deviations from Hardy-Weinberg and linkage equilibrium. No linkage of statistical significance was observed between any of the pair-wise combinations of the combined 114 identity-informative markers and only one STR exhibited deviation from Hardy-Weinberg equilibrium (D8S1179). The probability of matching genotypes being observed within this population was of the order of 10^-23 for STRs, 10^-38 for iiSNPs and 10^-60 for the combined identity-informative marker panel, improving the ability to discriminate between individuals when calculating likelihood ratios in direct or indirect matching scenarios. Further, the addition of iiSNPs will facilitate identifications when suboptimal STR profiles are recovered from compromised or challenging samples and aid comparisons to genetic relatives for familial or kinship testing.
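The way a combined match probability shrinks as markers are added can be illustrated with a short Python sketch. It assumes Hardy-Weinberg proportions at each locus and independence between loci (the abstract reports no significant linkage), computes the per-locus probability that two unrelated individuals share a genotype as the sum of squared genotype frequencies, and multiplies across loci. The allele frequencies below are hypothetical and are not taken from the study.

```python
import math


def locus_match_probability(allele_freqs):
    """Probability that two unrelated individuals share a genotype at one locus.

    Assumes Hardy-Weinberg proportions: homozygote frequency p^2 and
    heterozygote frequency 2pq; the match probability is the sum of the
    squared genotype frequencies.
    """
    freqs = list(allele_freqs)
    pm = 0.0
    for i, p in enumerate(freqs):
        pm += (p * p) ** 2            # both individuals homozygous for allele i
        for q in freqs[i + 1:]:
            pm += (2 * p * q) ** 2    # both individuals share heterozygote i/j
    return pm


def combined_match_probability(per_locus):
    """Multiply per-locus match probabilities, assuming independent loci."""
    return math.prod(per_locus)


# Hypothetical panel: one biallelic SNP and one four-allele STR.
snp = locus_match_probability([0.5, 0.5])
str_locus = locus_match_probability([0.25, 0.25, 0.25, 0.25])
combined = combined_match_probability([snp, str_locus])
```

Because the per-locus values multiply, each additional independent marker lowers the combined probability, which is why a joint STR and iiSNP panel discriminates more strongly than either marker set alone.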
Affiliation(s)
- Jessica L Watson
- National DNA Program for Unidentified and Missing Persons, Australian Federal Police, Australia; Centre for Forensic Science, School of Mathematical & Physical Science, Faculty of Science, University of Technology Sydney, Australia; Biology, AFP Forensics, Australian Federal Police, Australia
- Kaymann Cho
- Biology, AFP Forensics, Australian Federal Police, Australia
- Kelly Grisedale
- National DNA Program for Unidentified and Missing Persons, Australian Federal Police, Australia; Biology, AFP Forensics, Australian Federal Police, Australia
- Jodie Ward
- National DNA Program for Unidentified and Missing Persons, Australian Federal Police, Australia; Centre for Forensic Science, School of Mathematical & Physical Science, Faculty of Science, University of Technology Sydney, Australia
- Dennis McNevin
- National DNA Program for Unidentified and Missing Persons, Australian Federal Police, Australia; Centre for Forensic Science, School of Mathematical & Physical Science, Faculty of Science, University of Technology Sydney, Australia
6
Rozenova KA, Buza N, Hui P. Gestational trophoblastic disease: STR genotyping for precision diagnosis. Expert Rev Mol Diagn 2025; 25:1-19. [PMID: 39801212 DOI: 10.1080/14737159.2025.2453506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2024] [Accepted: 12/28/2024] [Indexed: 01/21/2025]
Abstract
INTRODUCTION: Gestational trophoblastic disease (GTD) encompasses a constellation of rare to common gynecologic conditions stemming from aberrant gestations with distinct genetic backgrounds and variable degrees of trophoblast proliferation of either neoplastic or non-neoplastic nature. GTD is categorized into hydatidiform moles and gestational trophoblastic neoplasms, and their clinical outcomes vary widely across different subtypes. Prompt and accurate diagnosis plays a pivotal role in the effective management and prognostication of patients. Short tandem repeats (STRs) are repetitive DNA sequences dispersed throughout the human genome that exhibit tremendous genetic polymorphism among individuals. Widely recognized for its applications in forensic identity and paternity testing, STR genotyping has emerged as an essential ancillary test in the classification and management of GTD, covering both non-neoplastic hydatidiform moles and gestational trophoblastic tumors. AREA COVERED: This review discusses fundamental principles, laboratory operation, and diagnostic interpretations of STR genotyping in the context of diagnosis and differential diagnosis of GTD. PubMed was searched for all references up to 2024. EXPERT OPINION: STR genotyping is the gold standard in the diagnosis and subclassification of hydatidiform moles and has an important application in diagnostic workup and risk stratifications of gestational trophoblastic tumors as well.
Affiliation(s)
- Natalia Buza
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- Pei Hui
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
7
Haasl RJ, Payseur BA. Fitness landscapes of human microsatellites. PLoS Genet 2024; 20:e1011524. [PMID: 39775235 PMCID: PMC11734926 DOI: 10.1371/journal.pgen.1011524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Revised: 01/15/2025] [Accepted: 12/03/2024] [Indexed: 01/11/2025] Open
Abstract
Advances in DNA sequencing technology and computation now enable genome-wide scans for natural selection to be conducted on unprecedented scales. By examining patterns of sequence variation among individuals, biologists are identifying genes and variants that affect fitness. Despite this progress, most population genetic methods for characterizing selection assume that variants mutate in a simple manner and at a low rate. Because these assumptions are violated by repetitive sequences, selection remains uncharacterized for an appreciable percentage of the genome. To meet this challenge, we focus on microsatellites, repetitive variants that mutate orders of magnitude faster than single nucleotide variants, can harbor substantial variation, and are known to influence biological function in some cases. We introduce four general models of natural selection that are each characterized by just two parameters, are easily simulated, and are specifically designed for microsatellites. Using a random forests approach to approximate Bayesian computation, we fit these models to carefully chosen microsatellites genotyped in 200 humans from a diverse collection of eight populations. Altogether, we reconstruct detailed fitness landscapes for 43 microsatellites we classify as targets of selection. Microsatellite fitness surfaces are diverse, including a range of selection strengths, contributions from dominance, and variation in the number and size of optimal alleles. Microsatellites that are subject to selection include loci known to cause trinucleotide expansion disorders and modulate gene expression, as well as intergenic loci with no obvious function. The heterogeneity in fitness landscapes we report suggests that genome-scale analyses like those used to assess selection targeting single nucleotide variants run the risk of oversimplifying the evolutionary dynamics of microsatellites. 
Moreover, our fitness landscapes provide a valuable visualization of the selective dynamics navigated by microsatellites.
Affiliation(s)
- Ryan J. Haasl
- Department of Biology, University of Wisconsin-Platteville, Platteville, Wisconsin, United States of America
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Bret A. Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
8
Loh CA, Shields DA, Schwing A, Evrony GD. High-fidelity, large-scale targeted profiling of microsatellites. Genome Res 2024; 34:1008-1026. [PMID: 39013593 PMCID: PMC11368184 DOI: 10.1101/gr.278785.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 07/11/2024] [Indexed: 07/18/2024]
Abstract
Microsatellites are highly mutable sequences that can serve as markers for relationships among individuals or cells within a population. The accuracy and resolution of reconstructing these relationships depend on the fidelity of microsatellite profiling and the number of microsatellites profiled. However, current methods for targeted profiling of microsatellites incur significant "stutter" artifacts that interfere with accurate genotyping, and sequencing costs preclude whole-genome microsatellite profiling of a large number of samples. We developed a novel method for accurate and cost-effective targeted profiling of a panel of more than 150,000 microsatellites per sample, along with a computational tool for designing large-scale microsatellite panels. Our method addresses the greatest challenge for microsatellite profiling, "stutter" artifacts, with a low-temperature hybridization capture that significantly reduces these artifacts. We also developed a computational tool for accurate genotyping of the resulting microsatellite sequencing data that uses an ensemble approach integrating three microsatellite genotyping tools, which we optimize by analysis of de novo microsatellite mutations in human trios. Altogether, our suite of experimental and computational tools enables high-fidelity, large-scale profiling of microsatellites, which may find utility in diverse applications such as lineage tracing, population genetics, ecology, and forensics.
Affiliation(s)
- Caitlin A Loh
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Danielle A Shields
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Adam Schwing
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
- Gilad D Evrony
- Center for Human Genetics and Genomics, New York University Grossman School of Medicine, New York, New York 10016, USA
- Department of Pediatrics, Department of Neuroscience & Physiology, Institute for Systems Genetics, Perlmutter Cancer Center, and Neuroscience Institute, New York University Grossman School of Medicine, New York, New York 10016, USA
9
Goldberg ME, Noyes MD, Eichler EE, Quinlan AR, Harris K. Effects of parental age and polymer composition on short tandem repeat de novo mutation rates. Genetics 2024; 226:iyae013. [PMID: 38298127 PMCID: PMC10990422 DOI: 10.1093/genetics/iyae013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 08/11/2023] [Accepted: 01/05/2024] [Indexed: 02/02/2024] Open
Abstract
Short tandem repeats (STRs) are hotspots of genomic variability in the human germline because of their high mutation rates, which have long been attributed largely to polymerase slippage during DNA replication. This model suggests that STR mutation rates should scale linearly with a father's age, as progenitor cells continually divide after puberty. In contrast, it suggests that STR mutation rates should not scale with a mother's age at her child's conception, since oocytes spend a mother's reproductive years arrested in meiosis II and undergo a fixed number of cell divisions that are independent of the age at ovulation. Yet, mirroring recent findings, we find that STR mutation rates covary with paternal and maternal age, implying that some STR mutations are caused by DNA damage in quiescent cells rather than polymerase slippage in replicating progenitor cells. These results echo the recent finding that DNA damage in oocytes is a significant source of de novo single nucleotide variants and corroborate evidence of STR expansion in postmitotic cells. However, we find that the maternal age effect is not confined to known hotspots of oocyte mutagenesis, nor are postzygotic mutations likely to contribute significantly. STR nucleotide composition demonstrates divergent effects on de novo mutation (DNM) rates between sexes. Unlike the paternal lineage, maternally derived DNMs at A/T STRs display a significantly greater association with maternal age than DNMs at G/C-containing STRs. These observations may suggest the mechanism and developmental timing of certain STR mutations and contradict prior attribution of replication slippage as the primary mechanism of STR mutagenesis.
Affiliation(s)
- Michael E Goldberg
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Departments of Human Genetics and Biomedical Informatics, University of Utah, Salt Lake City, UT 84112, USA
- Michelle D Noyes
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Aaron R Quinlan
- Departments of Human Genetics and Biomedical Informatics, University of Utah, Salt Lake City, UT 84112, USA
- Kelley Harris
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Computational Biology Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
10
McComish BJ, Charleston MA, Parks M, Baroni C, Salvatore MC, Li R, Zhang G, Millar CD, Holland BR, Lambert DM. Ancient and Modern Genomes Reveal Microsatellites Maintain a Dynamic Equilibrium Through Deep Time. Genome Biol Evol 2024; 16:evae017. [PMID: 38412309 PMCID: PMC10972684 DOI: 10.1093/gbe/evae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 12/22/2023] [Accepted: 01/23/2024] [Indexed: 02/29/2024] Open
Abstract
Microsatellites are widely used in population genetics, but their evolutionary dynamics remain poorly understood. It is unclear whether microsatellite loci drift in length over time. This is important because the mutation processes that underlie these important genetic markers are central to the evolutionary models that employ microsatellites. We identify more than 27 million microsatellites using a novel and unique dataset of modern and ancient Adélie penguin genomes along with data from 63 published chordate genomes. We investigate microsatellite evolutionary dynamics over 2 timescales: one based on Adélie penguin samples dating to ∼46.5 ka and the other dating to the diversification of chordates aged more than 500 Ma. We show that the process of microsatellite allele length evolution is at dynamic equilibrium; while there is length polymorphism among individuals, the length distribution for a given locus remains stable. Many microsatellites persist over very long timescales, particularly in exons and regulatory sequences. These often retain length variability, suggesting that they may play a role in maintaining phenotypic variation within populations.
Affiliation(s)
- Bennet J McComish
- School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
- Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS 7001, Australia
- Matthew Parks
- Australian Research Centre for Human Evolution, Griffith University, Nathan, QLD 4111, Australia
- Department of Biology, University of Central Oklahoma, Edmond, OK 73034, USA
- Carlo Baroni
- Dipartimento di Scienze della Terra, University of Pisa, Pisa, Italy
- CNR-IGG, Institute of Geosciences and Earth Resources, Pisa, Italy
- Maria Cristina Salvatore
- Dipartimento di Scienze della Terra, University of Pisa, Pisa, Italy
- CNR-IGG, Institute of Geosciences and Earth Resources, Pisa, Italy
- Ruiqiang Li
- Novogene Bioinformatics Technology Co. Ltd., Beijing 100083, China
- Guojie Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
- Department of Biology, Centre for Social Evolution, University of Copenhagen, Copenhagen DK-2100, Denmark
- Craig D Millar
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Barbara R Holland
- School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
- David M Lambert
- Australian Research Centre for Human Evolution, Griffith University, Nathan, QLD 4111, Australia
11
Antão-Sousa S, Gusmão L, Modesti NM, Feliziani S, Faustino M, Marcucci V, Sarapura C, Ribeiro J, Carvalho E, Pereira V, Tomas C, de Pancorbo MM, Baeta M, Alghafri R, Almheiri R, Builes JJ, Gouveia N, Burgos G, Pontes MDL, Ibarra A, da Silva CV, Parveen R, Benitez M, Amorim A, Pinto N. Microsatellites' mutation modeling through the analysis of the Y-chromosomal transmission: Results of a GHEP-ISFG collaborative study. Forensic Sci Int Genet 2024; 69:102999. [PMID: 38181588 DOI: 10.1016/j.fsigen.2023.102999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 10/25/2023] [Accepted: 12/10/2023] [Indexed: 01/07/2024]
Abstract
The Spanish and Portuguese Speaking Working Group of the International Society for Forensic Genetics (GHEP-ISFG) organized a collaborative study on mutations of Y-chromosomal short tandem repeats (Y-STRs). New data from 2225 father-son duos and data from 44 previously published reports, corresponding to 25,729 duos, were collected and analyzed. Marker-specific mutation rates were estimated for 33 Y-STRs. Although highly dependent on the analyzed marker, mutations compatible with the gain or loss of a single repeat were 23.2 times more likely than those involving a greater number of repeats. Longer alleles (relative to the modal one) were nearly twice as mutable as shorter ones. Within the subset of longer alleles, the loss of repeats was nearly twice as likely as the gain. Conversely, shorter alleles showed a symmetrical trend, with repeat gains being twice as frequent as reductions. A positive correlation between paternal age and mutation rate was observed, strengthening previous findings. The results of a machine learning approach, via logistic regression analyses, allowed the establishment of algebraic formulas for estimating the probability of mutation depending on paternal age and allele length for DYS389I, DYS393 and DYS627. Algebraic formulas could also be established considering only allele length as a predictor for the DYS19, DYS389I, DYS389II-I, DYS390, DYS391, DYS393, DYS437, DYS439, DYS449, DYS456, DYS458, DYS460, DYS481, DYS518, DYS533, DYS576, DYS626 and DYS627 loci. For the remaining Y-STRs, a lack of statistical significance was observed, probably as a consequence of the small effective size of the subsets available, a common difficulty in the modeling of rare events such as mutations. The amount of data used in the different analyses varied widely, depending on how the data were reported in the publications analyzed.
This shows a regrettable waste of produced data, due to inadequate communication of the results, supporting an urgent need of publication guidelines for mutation studies.
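As an illustration of what an "algebraic formula" from a logistic regression looks like, the sketch below evaluates the standard logistic function of allele length and paternal age. The function name and all coefficient values are placeholders chosen for this example; they are not the values estimated in the study.

```python
import math


def mutation_probability(allele_length, paternal_age, b0, b_len, b_age):
    """Standard logistic model: P = 1 / (1 + exp(-(b0 + b_len*length + b_age*age)))."""
    z = b0 + b_len * allele_length + b_age * paternal_age
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder coefficients for illustration only (NOT taken from the study).
p_young = mutation_probability(13, 25, b0=-9.0, b_len=0.2, b_age=0.02)
p_old = mutation_probability(13, 45, b0=-9.0, b_len=0.2, b_age=0.02)
print(p_young, p_old)
```

With a positive age coefficient, the modeled probability rises with paternal age, which is the direction of the correlation the abstract reports.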
Affiliation(s)
- Sofia Antão-Sousa
- Instituto de Investigação e Inovação em Saúde (i3S), Porto, Portugal; Institute of Molecular Pathology and Immunology, University of Porto (IPATIMUP), Porto, Portugal; Faculty of Sciences of the University of Porto (FCUP), Porto, Portugal; DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Leonor Gusmão
- DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Nidia M Modesti
- Centro de Genética Forense, Poder Judicial de Córdoba, Argentina
- Sofía Feliziani
- Centro de Genética Forense, Poder Judicial de Córdoba, Argentina
- Marisa Faustino
- Instituto de Investigação e Inovação em Saúde (i3S), Porto, Portugal; Faculty of Sciences of the University of Porto (FCUP), Porto, Portugal
- Valeria Marcucci
- Laboratorio Regional de Investigación Forense, Tribunal Superior de Justicia de Santa Cruz, Argentina
- Claudia Sarapura
- Laboratorio Regional de Investigación Forense, Tribunal Superior de Justicia de Santa Cruz, Argentina
- Julyana Ribeiro
- DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Elizeu Carvalho
- DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Vania Pereira
- Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
- Carmen Tomas
- Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
- Marian M de Pancorbo
- BIOMICs Research Group, Lascaray Research Center, Department of Zoology and Animal Cell Biology, University of the Basque Country UPV/EHU, Vitoria-Gasteiz, Spain
- Miriam Baeta
- BIOMICs Research Group, Lascaray Research Center, Department of Zoology and Animal Cell Biology, University of the Basque Country UPV/EHU, Vitoria-Gasteiz, Spain
- Rashed Alghafri
- International Center for Forensic Sciences, Dubai Police G.H.Q., Dubai, United Arab Emirates
- Reem Almheiri
- International Center for Forensic Sciences, Dubai Police G.H.Q., Dubai, United Arab Emirates
- Juan José Builes
- GENES SAS Laboratory, Medellín, Colombia; Institute of Biology, University of Antioquia, Medellín, Colombia
- Nair Gouveia
- Instituto Nacional de Medicina Legal e Ciências Forenses, I.P. / Serviço de Genética e Biologia Forenses, Delegação do Centro, Portugal
- German Burgos
- One Health Global Research Group, Facultad de Medicina, Universidad de Las Américas (UDLA), Quito, Ecuador; Grupo de Medicina Xenómica, Universidad de Santiago de Compostela, Santiago de Compostela, Spain
- Maria de Lurdes Pontes
- Instituto Nacional de Medicina Legal e Ciências Forenses, I.P. / Serviço de Genética e Biologia Forenses, Delegação do Norte, Portugal
- Adriana Ibarra
- Laboratorio IDENTIGEN, Universidad de Antioquia, Colombia
- Claudia Vieira da Silva
- Instituto Nacional de Medicina Legal e Ciências Forenses, I.P. / Serviço de Genética e Biologia Forenses, Delegação do Sul, Portugal
- Rukhsana Parveen
- Forensic Services Laboratory, Centre for Applied Molecular Biology, University of the Punjab, Lahore, Pakistan
- Marc Benitez
- Policia de la Generalitat de Catalunya - Mossos d'Esquadra. Unitat Central del Laboratori Biològic, Barcelona, Spain
- António Amorim
- Instituto de Investigação e Inovação em Saúde (i3S), Porto, Portugal; Institute of Molecular Pathology and Immunology, University of Porto (IPATIMUP), Porto, Portugal; Faculty of Sciences of the University of Porto (FCUP), Porto, Portugal
- Nadia Pinto
- Instituto de Investigação e Inovação em Saúde (i3S), Porto, Portugal; Institute of Molecular Pathology and Immunology, University of Porto (IPATIMUP), Porto, Portugal; Centre of Mathematics of the University of Porto, Porto, Portugal.
12
Marchese D, Guislain F, Pringels T, Bridoux L, Rezsohazy R. A poly-histidine motif of HOXA1 is involved in regulatory interactions with cysteine-rich proteins. Biochimica et Biophysica Acta. Gene Regulatory Mechanisms 2024; 1867:194993. [PMID: 37952572 DOI: 10.1016/j.bbagrm.2023.194993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 10/05/2023] [Accepted: 11/02/2023] [Indexed: 11/14/2023]
Abstract
Homopolymeric amino acid repeats are found in about 24% of human proteins and are over-represented in transcription factors and kinases. Although relatively rare, homopolymeric histidine repeats (polyH) are more significantly found in proteins involved in the regulation of embryonic development. To gain a better understanding of the role of polyH in these proteins, we used a bioinformatic approach to search for shared features in the interactomes of polyH-containing proteins in humans. Our analysis revealed that polyH protein interactomes are enriched in cysteine-rich proteins and in proteins containing one or more cysteine repeats. Focusing on HOXA1, a HOX transcription factor displaying one long polyH motif, we found that the polyH motif is required for the HOXA1 interaction with such cysteine-rich proteins. We observed a correlation between the length of the polyH repeat and the strength of the HOXA1 interaction with one Cys-rich protein, MDFI. We also found that metal ion chelators disrupt the HOXA1-MDFI interaction, suggesting that metal ions are required for the interaction. Furthermore, we identified three polyH interactors which down-regulate the transcriptional activity of HOXA1. Taken together, our data point towards the involvement of polyH and cysteines in regulatory interactions between proteins, notably transcription factors like HOXA1.
Affiliation(s)
- Damien Marchese
- Louvain Institute of Biomolecular Science and Technology, UCLouvain, Place Croix du Sud 5 (L7.07.10), B-1348 Louvain-la-Neuve, Belgium
- Florent Guislain
- Louvain Institute of Biomolecular Science and Technology, UCLouvain, Place Croix du Sud 5 (L7.07.10), B-1348 Louvain-la-Neuve, Belgium
- Tamara Pringels
- Louvain Institute of Biomolecular Science and Technology, UCLouvain, Place Croix du Sud 5 (L7.07.10), B-1348 Louvain-la-Neuve, Belgium
- Laure Bridoux
- Louvain Institute of Biomolecular Science and Technology, UCLouvain, Place Croix du Sud 5 (L7.07.10), B-1348 Louvain-la-Neuve, Belgium
- René Rezsohazy
- Louvain Institute of Biomolecular Science and Technology, UCLouvain, Place Croix du Sud 5 (L7.07.10), B-1348 Louvain-la-Neuve, Belgium.
13
Lu J, Toro C, Adams DR, Moreno CAM, Lee WP, Leung YY, Harms MB, Vardarajan B, Heinzen EL. LUSTR: a new customizable tool for calling genome-wide germline and somatic short tandem repeat variants. BMC Genomics 2024; 25:115. [PMID: 38279154 PMCID: PMC10811831 DOI: 10.1186/s12864-023-09935-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 12/21/2023] [Indexed: 01/28/2024] Open
Abstract
BACKGROUND: Short tandem repeats (STRs) are widely distributed across the human genome and are associated with numerous neurological disorders. However, the extent to which STRs contribute to disease is likely underestimated because of the challenges of calling these variants in short-read next-generation sequencing data. Several computational tools have been developed for STR variant calling, but none fully addresses all of the complexities associated with this variant class. RESULTS: Here we introduce LUSTR, which is designed to address some of the challenges associated with STR variant calling by enabling more flexibility in defining STR loci, allowing for customizable modules to tailor analyses, and expanding the capability to call somatic and multiallelic STR variants. LUSTR is a user-friendly and easily customizable tool for targeted or unbiased genome-wide STR variant screening that can use either predefined or novel genome builds. Using both simulated and real data sets, we demonstrated that LUSTR accurately infers germline and somatic STR expansions in individuals with and without diseases. CONCLUSIONS: LUSTR offers a powerful and user-friendly approach that allows for the identification of STR variants and can facilitate more comprehensive studies evaluating the role of pathogenic STR variants across human diseases.
Affiliation(s)
- Jinfeng Lu
- Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- The Taub Institute for Research On Alzheimer's Disease and the Aging Brain, Gertrude H. Sergievsky Center, Department of Neurology, College of Physicians and Surgeons, Columbia University, The New York Presbyterian Hospital, New York, NY, 10032, USA.
- Camilo Toro
- NIH Undiagnosed Diseases Program, National Human Genome Research Institute (NHGRI), National Institutes of Health, Bethesda, MD, 20892, USA
- David R Adams
- NIH Undiagnosed Diseases Program, National Human Genome Research Institute (NHGRI), National Institutes of Health, Bethesda, MD, 20892, USA
- Wan-Ping Lee
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Yuk Yee Leung
- Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Mathew B Harms
- Department of Neurology, Division of Neuromuscular Medicine, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Badri Vardarajan
- The Taub Institute for Research On Alzheimer's Disease and the Aging Brain, Gertrude H. Sergievsky Center, Department of Neurology, College of Physicians and Surgeons, Columbia University, The New York Presbyterian Hospital, New York, NY, 10032, USA
- Erin L Heinzen
- Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
14
Goldberg ME, Noyes MD, Eichler EE, Quinlan AR, Harris K. Effects of parental age and polymer composition on short tandem repeat de novo mutation rates. bioRxiv 2023:2023.12.22.573131. [PMID: 38187618 PMCID: PMC10769404 DOI: 10.1101/2023.12.22.573131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Short tandem repeats (STRs) are hotspots of genomic variability in the human germline because of their high mutation rates, which have long been attributed largely to polymerase slippage during DNA replication. This model suggests that STR mutation rates should scale linearly with a father's age, as progenitor cells continually divide after puberty. In contrast, it suggests that STR mutation rates should not scale with a mother's age at her child's conception, since oocytes spend a mother's reproductive years arrested in meiosis II and undergo a fixed number of cell divisions that are independent of the age at ovulation. Yet, mirroring recent findings, we find that STR mutation rates covary with paternal and maternal age, implying that some STR mutations are caused by DNA damage in quiescent cells rather than the classical mechanism of polymerase slippage in replicating progenitor cells. These results also echo the recent finding that DNA damage in quiescent oocytes is a significant source of de novo SNVs and corroborate evidence of STR expansion in postmitotic cells. However, we find that the maternal age effect is not confined to previously discovered hotspots of oocyte mutagenesis, nor are post-zygotic mutations likely to contribute significantly. STR nucleotide composition demonstrates divergent effects on DNM rates between sexes. Unlike the paternal lineage, maternally derived DNMs at A/T STRs display a significantly greater association with maternal age than DNMs at GC-containing STRs. These observations may suggest the mechanism and developmental timing of certain STR mutations and are especially surprising considering the prior belief in replication slippage as the dominant mechanism of STR mutagenesis.
Affiliation(s)
- Michael E. Goldberg
- Department of Genome Sciences, University of Washington, 3720 15 Ave NE, Seattle, WA, 98195
- Departments of Human Genetics and Biomedical Informatics, University of Utah, 15 S 2030 E, Salt Lake City, UT, 84112
- Michelle D. Noyes
- Department of Genome Sciences, University of Washington, 3720 15 Ave NE, Seattle, WA, 98195
- Evan E. Eichler
- Department of Genome Sciences, University of Washington, 3720 15 Ave NE, Seattle, WA, 98195
- Howard Hughes Medical Institute, 3720 15 Ave NE, University of Washington, Seattle, WA, 98195
- Aaron R. Quinlan
- Departments of Human Genetics and Biomedical Informatics, University of Utah, 15 S 2030 E, Salt Lake City, UT, 84112
- These authors contributed equally to this work
- Kelley Harris
- Department of Genome Sciences, University of Washington, 3720 15 Ave NE, Seattle, WA, 98195
- Computational Biology Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, 98109
- These authors contributed equally to this work
15
Ziaei Jam H, Li Y, DeVito R, Mousavi N, Ma N, Lujumba I, Adam Y, Maksimov M, Huang B, Dolzhenko E, Qiu Y, Kakembo FE, Joseph H, Onyido B, Adeyemi J, Bakhtiari M, Park J, Javadzadeh S, Jjingo D, Adebiyi E, Bafna V, Gymrek M. A deep population reference panel of tandem repeat variation. Nat Commun 2023; 14:6711. [PMID: 37872149 PMCID: PMC10593948 DOI: 10.1038/s41467-023-42278-3] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 03/10/2023] [Accepted: 10/05/2023] [Indexed: 10/25/2023] Open
Abstract
Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high coverage whole genome sequencing from 3550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel which can be used to impute most TRs from nearby single nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.
Affiliation(s)
- Helyaneh Ziaei Jam
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Yang Li
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Ross DeVito
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Nima Mousavi
- Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA, USA
- Nichole Ma
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Ibra Lujumba
- The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala, Uganda
- Yagoub Adam
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun, 112233, Nigeria
- Mikhail Maksimov
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Bonnie Huang
- Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Yunjiang Qiu
- Illumina Incorporated, San Diego, CA, 92122, USA
- Fredrick Elishama Kakembo
- The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala, Uganda
- Habi Joseph
- The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala, Uganda
- Blessing Onyido
- Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
- Jumoke Adeyemi
- Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
- Mehrdad Bakhtiari
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Jonghun Park
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Sara Javadzadeh
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Daudi Jjingo
- The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala, Uganda
- Department of Computer Science, Makerere University, Kampala, Uganda
- Ezekiel Adebiyi
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun, 112233, Nigeria
- Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
- Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, Baden-Württemberg, 69120, Germany
- Vineet Bafna
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Melissa Gymrek
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
- Department of Medicine, University of California San Diego, La Jolla, CA, USA.
16
Ichikawa K, Kawahara R, Asano T, Morishita S. A landscape of complex tandem repeats within individual human genomes. Nat Commun 2023; 14:5530. [PMID: 37709751 PMCID: PMC10502081 DOI: 10.1038/s41467-023-41262-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/26/2022] [Accepted: 08/28/2023] [Indexed: 09/16/2023] Open
Abstract
Markedly expanded tandem repeats (TRs) have been correlated with ~60 diseases. TR diversity has been considered a clue toward understanding missing heritability. However, haplotype-resolved long TRs remain mostly hidden or blacked out because their complex structures (TRs composed of various units and minisatellites containing >10-bp units) make them difficult to determine accurately with existing methods. Here, using a high-precision algorithm to determine complex TR structures from long, accurate reads of PacBio HiFi, an investigation of 270 Japanese control samples yields several genome-wide findings. Approximately 322,000 TRs are difficult to impute from the surrounding single-nucleotide variants. Greater genetic divergence of TR loci is significantly correlated with more events of younger replication slippage. Complex TRs are more abundant than single-unit TRs, and a tendency for complex TRs to consist of <10-bp units and single-unit TRs to be minisatellites is statistically significant at loci with ≥500-bp TRs. Of note, 8909 loci with extended TRs (>100b longer than the mode) contain several known disease-associated TRs and are considered candidates for association with disorders. Overall, complex TRs and minisatellites are found to be abundant and diverse, even in genetically small Japanese populations, yielding insights into the landscape of long TRs.
Affiliation(s)
- Kazuki Ichikawa
- Department of Computational Biology and Medical Sciences, The University of Tokyo, 277-8561, Chiba, Japan
- Riki Kawahara
- Department of Computational Biology and Medical Sciences, The University of Tokyo, 277-8561, Chiba, Japan
- Takeshi Asano
- Department of Computational Biology and Medical Sciences, The University of Tokyo, 277-8561, Chiba, Japan
- Shinichi Morishita
- Department of Computational Biology and Medical Sciences, The University of Tokyo, 277-8561, Chiba, Japan.
17
Wang N, Khan S, Elo LL. VarSCAT: A computational tool for sequence context annotations of genomic variants. PLoS Comput Biol 2023; 19:e1010727. [PMID: 37566612 PMCID: PMC10446208 DOI: 10.1371/journal.pcbi.1010727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 08/23/2023] [Accepted: 07/20/2023] [Indexed: 08/13/2023] Open
Abstract
The sequence contexts of genomic variants play important roles in understanding biological significances of variants and potential sequencing related variant calling issues. However, methods for assessing the diverse sequence contexts of genomic variants such as tandem repeats and unambiguous annotations have been limited. Herein, we describe the Variant Sequence Context Annotation Tool (VarSCAT) for annotating the sequence contexts of genomic variants, including breakpoint ambiguities, flanking bases of variants, wildtype/mutated DNA sequences, variant nomenclatures, distances between adjacent variants, tandem repeat regions, and custom annotation with user customizable options. Our analyses demonstrate that VarSCAT is more versatile and customizable than the currently available methods or strategies for annotating variants in short tandem repeat (STR) regions or insertions and deletions (indels) with breakpoint ambiguity. Variant sequence context annotations of high-confidence human variant sets with VarSCAT revealed that more than 75% of all human individual germline and clinically relevant indels have breakpoint ambiguities. Moreover, we illustrate that more than 80% of human individual germline small variants in STR regions are indels and that the sizes of these indels correlated with STR motif sizes. VarSCAT is available from https://github.com/elolab/VarSCAT.
Affiliation(s)
- Ning Wang
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- InFLAMES Research Flagship Center, University of Turku, Turku, Finland
- Sofia Khan
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- Laura L. Elo
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- InFLAMES Research Flagship Center, University of Turku, Turku, Finland
- Institute of Biomedicine, University of Turku, Turku, Finland
18
Jam HZ, Li Y, DeVito R, Mousavi N, Ma N, Lujumba I, Adam Y, Maksimov M, Huang B, Dolzhenko E, Qiu Y, Kakembo FE, Joseph H, Onyido B, Adeyemi J, Bakhtiari M, Park J, Javadzadeh S, Jjingo D, Adebiyi E, Bafna V, Gymrek M. A deep population reference panel of tandem repeat variation. bioRxiv 2023:2023.03.09.531600. [PMID: 36945429 PMCID: PMC10028971 DOI: 10.1101/2023.03.09.531600] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 03/14/2023]
Abstract
Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high coverage whole genome sequencing from 3,550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel which can be used to impute most TRs from nearby single nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.
Affiliation(s)
- Helyaneh Ziaei Jam
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA
- Yang Li
- Department of Medicine, University of California San Diego, La Jolla, CA
- Ross DeVito
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA
- Nima Mousavi
- Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA
- Nichole Ma
- Department of Medicine, University of California San Diego, La Jolla, CA
- Ibra Lujumba
- The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala-Uganda
- Yagoub Adam
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun, 112233, Nigeria
- Mikhail Maksimov
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA
- Bonnie Huang
- Department of Bioengineering, University of California San Diego, La Jolla, CA
- Yunjiang Qiu
- Illumina Incorporated, San Diego, California 92122, USA
- Fredrick Elishama Kakembo
- The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala-Uganda
- Habi Joseph
- The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala-Uganda
- Blessing Onyido
- Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
- Jumoke Adeyemi
- Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
- Mehrdad Bakhtiari
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA
- Jonghun Park
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA
- Sara Javadzadeh
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA
- Daudi Jjingo
- The African Center of Excellence in Bioinformatics and Data Intensive Sciences, the Infectious Diseases Institute, Makerere University, Kampala-Uganda
- Department of Computer Science, Makerere University, Kampala, Uganda
- Ezekiel Adebiyi
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun, 112233, Nigeria
- Department of Computer & Information Sciences, Covenant University, Ota, Ogun, 112233, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun, 112233, Nigeria
- Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, Baden-Württemberg, 69120, Germany
- Vineet Bafna
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA
- Melissa Gymrek
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA
- Department of Medicine, University of California San Diego, La Jolla, CA
19
Using unique molecular identifiers to improve allele calling in low-template mixtures. Forensic Sci Int Genet 2023; 63:102807. [PMID: 36462297 DOI: 10.1016/j.fsigen.2022.102807] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/14/2022] [Revised: 10/20/2022] [Accepted: 11/18/2022] [Indexed: 11/27/2022]
Abstract
PCR artifacts are an ever-present challenge in sequencing applications. These artifacts can seriously limit the analysis and interpretation of low-template samples and mixtures, especially with respect to a minor contributor. In medicine, molecular barcoding techniques have been employed to decrease the impact of PCR error and to allow the examination of low-abundance somatic variation. In principle, it should be possible to apply the same techniques to the forensic analysis of mixtures. To that end, several short tandem repeat loci were selected for targeted sequencing, and a bioinformatic pipeline for analyzing the sequence data was developed. The pipeline notes the relevant unique molecular identifiers (UMIs) attached to each read and, using machine learning, filters the noise products out of the set of potential alleles. To evaluate this pipeline, DNA from pairs of individuals was mixed at different ratios (1:1, 1:9) and sequenced with different starting amounts of DNA (10, 1 and 0.1 ng). Naïvely using the information in the molecular barcodes led to increased performance, with the machine learning resulting in an additional benefit. In concrete terms, using the UMI data results in less noise for a given amount of dropout. For instance, if thresholds are selected that filter out a quarter of the true alleles, using read counts accepts 2381 noise alleles and using raw UMI counts accepts 1726 noise alleles, while the machine learning approach accepts only 307.
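The difference between read counts and raw UMI counts can be sketched as follows. This is a minimal illustration, not the pipeline described in the abstract: the function name, the input tuple layout, and the toy reads are assumptions made for this example, and the machine-learning filter and any UMI error correction are omitted.

```python
from collections import defaultdict


def count_alleles(reads):
    """Tally reads and distinct UMIs per (locus, allele).

    reads: iterable of (locus, allele, umi) tuples (hypothetical layout).
    Returns {(locus, allele): (read_count, umi_count)}.
    """
    read_counts = defaultdict(int)
    umis = defaultdict(set)
    for locus, allele, umi in reads:
        key = (locus, allele)
        read_counts[key] += 1
        umis[key].add(umi)  # PCR duplicates share a UMI and collapse here
    return {key: (read_counts[key], len(umis[key])) for key in read_counts}


# Toy data for illustration only.
reads = [
    ("D3S1358", "15", "AACG"),
    ("D3S1358", "15", "AACG"),  # PCR duplicate of the read above
    ("D3S1358", "15", "TTGA"),
    ("D3S1358", "14", "AACG"),  # artifact-like product from a single molecule
    ("D3S1358", "14", "AACG"),
    ("D3S1358", "14", "AACG"),
]
counts = count_alleles(reads)
print(counts)
```

In this toy case both candidate alleles have three reads, but they are backed by two and one distinct UMIs respectively, so a threshold on UMI count separates them where a threshold on read count cannot.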
20
Tseng SP, Darras H, Hsu PW, Yoshimura T, Lee CY, Wetterer JK, Keller L, Yang CCS. Genetic analysis reveals the putative native range and widespread double-clonal reproduction in the invasive longhorn crazy ant. Mol Ecol 2023; 32:1020-1033. [PMID: 36527320 DOI: 10.1111/mec.16827] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/04/2022] [Revised: 12/12/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022]
Abstract
Clonal reproduction can provide an advantage for invasive species to establish as it can circumvent inbreeding depression which often plagues introduced populations. The world's most widespread invasive ant, Paratrechina longicornis, was previously found to display a double-clonal reproduction system, whereby both males and queens are produced clonally, resulting in separate male and queen lineages, while workers are produced sexually. Under this unusual reproduction mode, inbreeding is avoided in workers as they carry hybrid interlineage genomes. Despite the ubiquitous distribution of P. longicornis, the significance of this reproductive system for the ant's remarkable success remains unclear, as its prevalence is still unknown. Further investigation into the controversial native origin of P. longicornis is also required to reconstruct the evolutionary histories of double-clonal lineages. Here, we examine genetic variation and characterize the reproduction mode of P. longicornis populations sampled worldwide using microsatellites and mitochondrial DNA sequences to infer the ant's putative native range and the distribution of the double-clonal reproductive system. Analyses of global genetic variations indicate that the Indian subcontinent is a genetic diversity hotspot of this species, suggesting that P. longicornis probably originates from this geographical area. Our analyses revealed that both the inferred native and introduced populations exhibit double-clonal reproduction, with queens and males around the globe belonging to two separate, nonrecombining clonal lineages. By contrast, workers are highly heterozygous because they are first-generation interlineage hybrids. Overall, these data indicate a worldwide prevalence of double clonality in P. longicornis and support the prediction that the unusual genetic system may have pre-adapted this ant for global colonization by maintaining heterozygosity in the worker force and alleviating genetic bottlenecks.
Affiliation(s)
- Shu-Ping Tseng
  - Department of Entomology, National Taiwan University, Taipei, Taiwan
  - Department of Entomology, University of California, Riverside, California, USA
  - Research Institute for Sustainable Humanosphere, Kyoto University, Kyoto, Japan
- Hugo Darras
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Po-Wei Hsu
  - Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- Tsuyoshi Yoshimura
  - Research Institute for Sustainable Humanosphere, Kyoto University, Kyoto, Japan
- Chow-Yang Lee
  - Department of Entomology, University of California, Riverside, California, USA
- James K Wetterer
  - Wilkes Honors College, Florida Atlantic University, Jupiter, Florida, USA
- Laurent Keller
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Chin-Cheng Scotty Yang
  - Department of Entomology, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
21
Verbiest M, Maksimov M, Jin Y, Anisimova M, Gymrek M, Bilgin Sonay T. Mutation and selection processes regulating short tandem repeats give rise to genetic and phenotypic diversity across species. J Evol Biol 2023; 36:321-336. [PMID: 36289560 PMCID: PMC9990875 DOI: 10.1111/jeb.14106] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 02/08/2022] [Revised: 06/29/2022] [Accepted: 08/01/2022] [Indexed: 02/03/2023]
Abstract
Short tandem repeats (STRs) are units of 1-6 bp that repeat in a tandem fashion in DNA. Along with single nucleotide polymorphisms and large structural variations, they are among the major genomic variants underlying genetic, and likely phenotypic, divergence. STRs experience mutation rates that are orders of magnitude higher than other well-studied genotypic variants. Frequent copy number changes result in a wide range of alleles, and provide unique opportunities for modulating complex phenotypes through variation in repeat length. While classical studies have identified key roles of individual STR loci, the advent of improved sequencing technology, high-quality genome assemblies for diverse species, and bioinformatics methods for genome-wide STR analysis now enable more systematic study of STR variation across wide evolutionary ranges. In this review, we explore mutation and selection processes that affect STR copy number evolution, and how these processes give rise to varying STR patterns both within and across species. Finally, we review recent examples of functional and adaptive changes linked to STRs.
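As a rough illustration of the definition above (units of 1-6 bp repeated in tandem), the following Python sketch scans a DNA string for perfect repeats with a regular expression. The function name `find_strs` and the threshold of at least three copies are assumptions made for this example and do not come from the review; genome-wide STR tools also handle imperfect repeats and sequencing error, which this sketch does not.

```python
import re

# A motif of 1-6 bases (shorter motifs are tried first), followed by at least
# two more copies of the same motif. The minimum of three copies in total is
# an illustrative assumption, not a threshold taken from the review.
STR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{2,}")

def find_strs(sequence):
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    hits = []
    for match in STR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits

print(find_strs("TTCAGCAGCAGCAGTT"))
```

Because variation in repeat length is what distinguishes STR alleles, the copy count returned for each hit is the quantity that would differ between individuals at the same locus.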
Affiliation(s)
- Max Verbiest
  - Institute of Computational Life Sciences, School of Life Sciences and Facility Management, Zürich University of Applied Sciences, Wädenswil, Switzerland
  - Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Mikhail Maksimov
  - Department of Computer Science & Engineering, University of California San Diego, La Jolla, California, USA
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
- Ye Jin
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Maria Anisimova
  - Institute of Computational Life Sciences, School of Life Sciences and Facility Management, Zürich University of Applied Sciences, Wädenswil, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Melissa Gymrek
  - Department of Computer Science & Engineering, University of California San Diego, La Jolla, California, USA
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
- Tugce Bilgin Sonay
  - Institute of Ecology, Evolution and Environmental Biology, Columbia University, New York, New York, USA
22
Pinzari CA, Bellinger MR, Price D, Bonaccorso FJ. Genetic diversity, structure, and effective population size of an endangered, endemic hoary bat, 'ōpe'ape'a, across the Hawaiian Islands. PeerJ 2023; 11:e14365. [PMID: 36718450 PMCID: PMC9884036 DOI: 10.7717/peerj.14365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Accepted: 10/19/2022] [Indexed: 01/26/2023] Open
Abstract
Island bat species are disproportionately at risk of extinction, and Hawai'i's only native terrestrial land mammal, the Hawaiian hoary bat (Lasiurus semotus) locally known as 'ōpe'ape'a, is no exception. To effectively manage this bat species with an archipelago-wide distribution, it is important to determine the population size on each island and connectivity between islands. We used 18 nuclear microsatellite loci and one mitochondrial gene from 339 individuals collected from 1988-2020 to evaluate genetic diversity, population structure and estimate effective population size on the Islands of Hawai'i, Maui, O'ahu, and Kaua'i. Genetic differentiation occurred between Hawai'i and Maui, both of which were differentiated from O'ahu and Kaua'i. The population on Maui presents the greatest per-island genetic diversity, consistent with their hypothesized status as the original founding population. A signature of isolation by distance was detected between islands, with contemporary migration analyses indicating limited gene flow in recent generations, and male-biased sex dispersal within Maui. Historical and long-term estimates of genetic effective population sizes were generally larger than contemporary estimates, although estimates of contemporary genetic effective population size lacked upper bounds in confidence intervals for Hawai'i and Kaua'i. Contemporary genetic effective population sizes were smaller on O'ahu and Maui. We also detected evidence of past bottlenecks on all islands with the exception of Hawai'i. Our study provides population-level estimates for the genetic diversity and geographic structure of 'ōpe'ape'a, that could be used by agencies tasked with wildlife conservation in Hawai'i.
Affiliation(s)
- Corinna A. Pinzari
  - Tropical Conservation Biology and Environmental Science Graduate Program, University of Hawaiʻi at Hilo, Hilo, Hawaiʻi, United States of America
  - Hawaiʻi Cooperative Studies Unit, University of Hawaiʻi at Hilo, Hawaiʻi National Park, Hawaiʻi, United States of America
- M. Renee Bellinger
  - Tropical Conservation Biology and Environmental Science Graduate Program, University of Hawaiʻi at Hilo, Hilo, Hawaiʻi, United States of America
  - Hawaiʻi Cooperative Studies Unit, University of Hawaiʻi at Hilo, Hawaiʻi National Park, Hawaiʻi, United States of America
  - Pacific Island Ecosystems Research Center, U.S. Geological Survey, Hawaiʻi National Park, Hawaiʻi, United States of America
- Donald Price
  - Tropical Conservation Biology and Environmental Science Graduate Program, University of Hawaiʻi at Hilo, Hilo, Hawaiʻi, United States of America
  - School of Life Sciences, University of Nevada - Las Vegas, Las Vegas, NV, United States of America
- Frank J. Bonaccorso
  - Pacific Island Ecosystems Research Center, U.S. Geological Survey, Hawaiʻi National Park, Hawaiʻi, United States of America
23
Huang L, Feng G, Li D, Shang W, Zhang L, Yan R, Jiang Y, Li S. Genetic variation of endangered Jankowski’s Bunting (Emberiza jankowskii): High connectivity and a moderate history of demographic decline. Front Ecol Evol 2023. [DOI: 10.3389/fevo.2022.996617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023] Open
Abstract
Introduction: Continued discovery of “mismatch” patterns between population size and genetic diversity, involving wild species such as insects, amphibians, birds, mammals, and others, has raised issues about how population history, especially recent dynamics under human disturbance, affects currently standing genetic variation. Previous studies have revealed high genetic diversity in endangered Jankowski’s Bunting. However, it is unclear how the demographic history and recent habitat changes shape the genetic variation of Jankowski’s Bunting. Methods: To explore the formation and maintenance of high genetic diversity in endangered Jankowski’s Bunting, we used a mitochondrial control region (partial mtDNA CR) and 15 nuclear microsatellite markers to explore the recent demographic history of Jankowski’s Bunting, and we compared the historical and contemporary gene flows between populations to reveal the impact of habitat change on population connectivity. Specifically, we aimed to test the following hypotheses: (1) Jankowski’s Bunting has a large historical Ne and a moderate demographic history; and (2) recent habitat change might have no significant impact on the species’ population connectivity. Results: The results suggested that large historical effective population size, as well as severe but slow population decline, may partially explain the high observable genetic diversity. Comparison of historical (over the past 4Ne generations) and contemporary (1–3 generations) gene flow indicated that the connectivity between five local populations was only marginally affected by landscape changes. Discussion: Our results suggest that high population connectivity and a moderate history of demographic decline are powerful explanations for the rich genetic variation in Jankowski’s Bunting.
Although there is no evidence that the genetic health of Jankowski’s Bunting is threatened, the time-lag effects on the genetic response to recent environmental changes is a reminder to be cautious about the current genetic characteristics of this species. Where possible, factors influencing genetic variation should be integrated into a systematic framework for conducting robust population health assessments. Given the small contemporary population size, inbreeding, and ecological specialization, we recommend that habitat protection be maintained to maximize the genetic diversity and population connectivity of Jankowski’s Bunting.
24
Steely CJ, Watkins WS, Baird L, Jorde LB. The mutational dynamics of short tandem repeats in large, multigenerational families. Genome Biol 2022; 23:253. [PMID: 36510265 PMCID: PMC9743774 DOI: 10.1186/s13059-022-02818-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/30/2022] [Accepted: 11/17/2022] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Short tandem repeats (STRs) compose approximately 3% of the genome, and mutations at STR loci have been linked to dozens of human diseases including amyotrophic lateral sclerosis, Friedreich ataxia, Huntington disease, and fragile X syndrome. Improving our understanding of these mutations would increase our knowledge of the mutational dynamics of the genome and may uncover additional loci that contribute to disease. To estimate the genome-wide pattern of mutations at STR loci, we analyze blood-derived whole-genome sequencing data for 544 individuals from 29 three-generation CEPH pedigrees. These pedigrees contain both sets of grandparents, the parents, and an average of 9 grandchildren per family. RESULTS: We use HipSTR to identify de novo STR mutations in the 2nd generation of these pedigrees and require transmission to the third generation for validation. Analyzing approximately 1.6 million STR loci, we estimate the empirical de novo STR mutation rate to be 5.24 × 10⁻⁵ mutations per locus per generation. Perfect repeats mutate about 2× more often than imperfect repeats. De novo STRs are significantly enriched in Alu elements. CONCLUSIONS: Approximately 30% of new STR mutations occur within Alu elements, which compose only 11% of the genome, but only 10% are found in LINE-1 insertions, which compose 17% of the genome. Phasing these mutations to the parent of origin shows that parental transmission biases vary among families. We estimate the average number of de novo genome-wide STR mutations per individual to be approximately 85, which is similar to the average number of observed de novo single nucleotide variants.
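The per-locus rate and the per-individual count quoted in this abstract can be related by a back-of-envelope product. The Python sketch below is only a consistency check under the assumption that the expected count is simply the number of loci times the per-locus rate; the function name is hypothetical, both inputs are the rounded figures from the abstract, and the authors' own estimate may account for factors this ignores.

```python
def expected_de_novo_per_genome(num_loci, rate_per_locus):
    """Expected de novo mutations per generation, assuming independent loci
    that all share the same average mutation rate (a simplifying assumption)."""
    return num_loci * rate_per_locus

# Rounded figures quoted in the abstract above.
estimate = expected_de_novo_per_genome(1.6e6, 5.24e-5)
print(f"{estimate:.1f}")
```

The product comes to roughly 84, which is in line with the figure of approximately 85 given in the abstract.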
Affiliation(s)
- Cody J. Steely
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
- W. Scott Watkins
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
- Lisa Baird
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
- Lynn B. Jorde
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
25
Swain SK, Sahu BP, Das SP, Sahoo L, Das PC, Das P. Population genetic structure of fringe-lipped carp, Labeo fimbriatus from the peninsular rivers of India. 3 Biotech 2022; 12:300. [PMID: 36276442 PMCID: PMC9525529 DOI: 10.1007/s13205-022-03369-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/07/2022] [Accepted: 09/17/2022] [Indexed: 11/01/2022] Open
Abstract
Labeo fimbriatus is a medium carp species found throughout India's peninsular river basins and is regarded as a valuable aquaculture resource alongside Indian major carps due to its taste and nutritional value. This species has recently declined dramatically due to habitat degradation and overfishing. Because of its enormous economic importance, a selective breeding programme is likely to be in place to improve performance traits. Knowledge of genetic variation among the base population from which the broodstock will be selected is an important step in this process. A diverse genetic base of broodstock is required to achieve the best response to selection for long-term aquaculture management practices. Consequently, using mitochondrial DNA (ATPase 6 and Control region) and microsatellite markers, we have made the first step toward estimating the level of genetic variation and how it is distributed among the four populations of L. fimbriatus found in peninsular rivers in India. The ATPase 6 gene analysis in four populations revealed 15 haplotypes and 51 variable sites, in contrast to the Control region, which had 60 haplotypes together with 73 variable sites and a haplotype diversity of 0.941. Twelve microsatellite loci displayed estimated allele numbers (N_A) ranging from 3 to 19, with observed heterozygosity (H_O) and expected heterozygosity (H_E) of 0.705 to 0.753 and 0.657 to 0.914, respectively. Each marker type showed a significant F_ST value, indicating the presence of low to moderate genetic differentiation across entire wild populations. The Godavari, Kaveri, and Mahanadi populations formed one cluster according to the UPGMA, which was based on genetic distance matrix, while the Krishna population formed a separate cluster. The comparative genetic analysis of data from different markers utilized in the current study would enable the identification of the genetic stocks of L. fimbriatus and facilitate conservation measures and selective breeding.
Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-022-03369-y.
Affiliation(s)
- Subrat Kumar Swain
  - Medical Research Laboratory, IMS and SUM Hospital, SOA University, K8, Kalinga Nagar, Bhubaneswar, India
  - Fish Genetics and Biotechnology Division, ICAR-Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar, Odisha 751002 India
- Basanta Pravas Sahu
  - Fish Genetics and Biotechnology Division, ICAR-Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar, Odisha 751002 India
  - School of Biological Science, The University of Hong Kong, Pokfulam, Hong Kong
- Sofia Priyadarsani Das
  - Fish Genetics and Biotechnology Division, ICAR-Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar, Odisha 751002 India
  - Amity Institute of Marine Science and Technology, Amity University Uttar Pradesh, Sector-125, Noida, India
- Lakshman Sahoo
  - Fish Genetics and Biotechnology Division, ICAR-Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar, Odisha 751002 India
- Pratap Chandra Das
  - Fish Genetics and Biotechnology Division, ICAR-Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar, Odisha 751002 India
- Paramananda Das
  - Fish Genetics and Biotechnology Division, ICAR-Central Institute of Freshwater Aquaculture, Kausalyaganga, Bhubaneswar, Odisha 751002 India
26
Antão-Sousa S, Conde-Sousa E, Gusmão L, Amorim A, Pinto N. Estimations of Mutation Rates Depend on Population Allele Frequency Distribution: The Case of Autosomal Microsatellites. Genes (Basel) 2022; 13:genes13071248. [PMID: 35886031 PMCID: PMC9323320 DOI: 10.3390/genes13071248] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/01/2022] [Revised: 06/28/2022] [Accepted: 07/11/2022] [Indexed: 01/27/2023] Open
Abstract
Microsatellites (or short-tandem repeats (STRs)) are widely used in anthropology and evolutionary studies. Their extensive polymorphism and rapid evolution make them the ideal genetic marker for dating events, such as the age of a gene or a population. This usage requires the estimation of mutation rates, which are usually estimated by counting the observed Mendelian incompatibilities in one-generation familial configurations (typically parent(s)–child duos or trios). Underestimations are inevitable when using this approach, due to the occurrence of mutational events that do not lead to incompatibilities with the parental genotypes (‘hidden’ or ‘covert’ mutations). It is known that the likelihood that one mutation event leads to a Mendelian incompatibility depends on the mode of genetic transmission considered, the type of familial configuration (duos or trios) considered, and the genotype(s) of the progenitor(s). In this work, we show how the magnitude of the underestimation of autosomal microsatellite mutation rates varies with the populations’ allele frequency distribution spectrum. The Mendelian incompatibilities approach (MIA) was applied to simulated parent(s)/offspring duos and trios in different populational scenarios. The results showed that the magnitude and type of biases depend on the population allele frequency distribution, whatever the type of familial data considered, and are greater when duos, instead of trios, are used to obtain the estimates. The implications for molecular anthropology are discussed and a simple framework is presented to correct the naïf estimates, along with an informatics tool for the correction of incompatibility rates obtained through the MIA.
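The notion of a 'hidden' mutation described above can be made concrete with a small trio example. The Python sketch below is a minimal illustration, not the authors' tool: alleles are represented as repeat counts, the function name `is_mendelian_compatible` and the example genotypes are hypothetical, and a single-step change in repeat number is assumed purely for the sake of the example.

```python
from itertools import product

def is_mendelian_compatible(father, mother, child):
    """True if the child's genotype can be formed by taking one allele from
    each parent, i.e. no mutation needs to be invoked to explain it."""
    return any(sorted((p, m)) == sorted(child) for p, m in product(father, mother))

# Hypothetical parental genotypes, written as repeat counts.
father, mother = (10, 11), (11, 12)

# Paternal allele 10 gains one repeat (10 -> 11); the mother transmits 12.
hidden_case = (11, 12)
# Paternal allele 10 loses one repeat (10 -> 9); the mother transmits 12.
visible_case = (9, 12)

print(is_mendelian_compatible(father, mother, hidden_case))
print(is_mendelian_compatible(father, mother, visible_case))
```

In the first case the mutated allele happens to match an allele the father already carries, so counting incompatibilities would miss the event; in the second case the mutation produces an allele neither parent has and is detected. How often the first situation arises depends on which alleles are common in the population, which is the dependence the abstract describes.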
Affiliation(s)
- Sofia Antão-Sousa
  - Instituto de Investigação e Inovação em Saúde (i3S), 4200-135 Porto, Portugal
  - Institute of Molecular Pathology and Immunology, University of Porto (IPATIMUP), 4200-465 Porto, Portugal
  - Faculty of Sciences, University of Porto (FCUP), 4169-007 Porto, Portugal
  - DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro 20550-013, Brazil
- Eduardo Conde-Sousa
  - Instituto de Investigação e Inovação em Saúde (i3S), 4200-135 Porto, Portugal
  - Instituto de Engenharia Biomédica (INEB), 4200-135 Porto, Portugal
- Leonor Gusmão
  - DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro 20550-013, Brazil
- António Amorim
  - Instituto de Investigação e Inovação em Saúde (i3S), 4200-135 Porto, Portugal
  - Institute of Molecular Pathology and Immunology, University of Porto (IPATIMUP), 4200-465 Porto, Portugal
  - Faculty of Sciences, University of Porto (FCUP), 4169-007 Porto, Portugal
- Nádia Pinto
  - Instituto de Investigação e Inovação em Saúde (i3S), 4200-135 Porto, Portugal
  - Institute of Molecular Pathology and Immunology, University of Porto (IPATIMUP), 4200-465 Porto, Portugal
  - Center of Mathematics, University of Porto (CMUP), 4169-007 Porto, Portugal
27
Oh A, Oh B. Genetic differentiation that is exceptionally high and unexpectedly sensitive to geographic distance in the absence of gene flow: Insights from the genus Eranthis in East Asian regions. Ecol Evol 2022; 12:e9007. [PMID: 35784042 PMCID: PMC9173865 DOI: 10.1002/ece3.9007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2020] [Revised: 02/24/2022] [Accepted: 05/20/2022] [Indexed: 11/10/2022] Open
Abstract
Genetic differentiation between populations is determined by various factors, including gene flow, selection, mutation, and genetic drift. Among these, gene flow is known to counter genetic differentiation. The genus Eranthis, an early flowering perennial herb, can serve as a good model to study genetic differentiation and gene flow due to its easily detectable population characteristics and known reproductive strategies, which can be associated with gene flow patterns. Eranthis populations are typically small and geographically separated from the others. Moreover, previous studies and our own observations suggest that seed and pollen dispersal between Eranthis populations is highly unlikely and therefore, currently, gene flow may not be probable in this genus. Based on these premises, we hypothesized that the genetic differentiation between the Eranthis populations would be significant, and that the genetic differentiation would not sensitively reflect geographic distance in the absence of gene flow. To test these hypotheses, genetic differentiation, genetic distance, isolation by distance, historical gene flow, and bottlenecks were analyzed in four species of this genus. Genetic differentiation was significantly high, and in many cases, extremely high. Moreover, genetic differentiation and geographic distance were positively correlated in most cases. We provide possible explanations for these observations. First, we suggest that the combination of the marker type used in our study (chloroplast microsatellites), genetic drift, and possibly selection might have resulted in the extremely high genetic differentiation observed herein. Additionally, we provide the possibility that genetic distance reflects geographic distance through historical gene flow, or adaptation in the absence of historical gene flow. Nevertheless, our explanations can be more rigorously examined and further refined through additional observations and various population genetic analyses. 
In particular, we suggest that other accessible populations of the genus Eranthis should be included in future studies to better characterize the intriguing population dynamics of this genus.
Affiliation(s)
- Ami Oh
  - Department of Biology, Chungbuk National University, Cheongju, Chungbuk, Republic of Korea
- Byoung‐Un Oh
  - Department of Biology, Chungbuk National University, Cheongju, Chungbuk, Republic of Korea
28
Kanyesigye D, Alibu VP, Tay WT, Nalela P, Paparu P, Olaboro S, Nkalubo ST, Kayondo IS, Silva G, Seal SE, Otim MH. Population Genetic Structure of the Bean Leaf Beetle Ootheca mutabilis (Coleoptera: Chrysomelidae) in Uganda. Insects 2022; 13:543. [PMID: 35735880 PMCID: PMC9225125 DOI: 10.3390/insects13060543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2022] [Revised: 05/26/2022] [Accepted: 06/04/2022] [Indexed: 11/16/2022]
Abstract
Bean leaf beetle (BLB) (Ootheca mutabilis) has emerged as an important bean pest in Uganda, leading to devastating crop losses. There is limited information on the population genetic structure of BLB despite its importance. In this study, novel microsatellite DNA markers and the partial mitochondrial cytochrome oxidase subunit I (mtCOI) gene sequences were used to analyze the spatial population genetic structure, genetic differentiation and haplotype diversity of 86 O. mutabilis samples from 16 (districts) populations. We identified 19,356 simple sequence repeats (SSRs) (mono, di-, tri-, tetra-, penta-, and hexa-nucleotides) of which 81 di, tri and tetra-nucleotides were selected for primer synthesis. Five highly polymorphic SSR markers (4-21 alleles, heterozygosity 0.59-0.84, polymorphic information content (PIC) 50.13-83.14%) were used for this study. Analyses of the 16 O. mutabilis populations with these five novel SSRs found nearly all the genetic variation occurring within populations and there was no evidence of genetic differentiation detected for both types of markers. Also, there was no evidence of isolation by distance between geographical and genetic distances for SSR data and mtCOI data except in one agro-ecological zone for mtCOI data. Bayesian clustering identified a signature of admixture that suggests genetic contributions from two hypothetical ancestral genetic lineages for both types of markers, and the minimum-spanning haplotype network showed low differentiation in minor haplotypes from the most common haplotype with the most common haplotype occurring in all the 16 districts. A lack of genetic differentiation indicates unrestricted migrations between populations. This information will contribute to the design of BLB control strategies.
Affiliation(s)
- Dalton Kanyesigye
  - National Agricultural Research Organization (NARO), National Crops Resources Research Institute (NaCRRI), Kampala P.O. Box 7084, Uganda
  - College of Veterinary Medicine, Animal Resources and Biosecurity (CoVAB), Makerere University, Kampala P.O. Box 7062, Uganda
- Vincent Pius Alibu
  - College of Natural Sciences (CoNAS), Makerere University, Kampala P.O. Box 7062, Uganda
- Wee Tek Tay
  - Commonwealth Scientific and Industrial Research Organisation, Canberra, ACT 2601, Australia
- Polycarp Nalela
  - National Agricultural Research Organization (NARO), National Crops Resources Research Institute (NaCRRI), Kampala P.O. Box 7084, Uganda
- Pamela Paparu
  - National Agricultural Research Organization (NARO), National Crops Resources Research Institute (NaCRRI), Kampala P.O. Box 7084, Uganda
- Samuel Olaboro
  - National Agricultural Research Organization (NARO), National Crops Resources Research Institute (NaCRRI), Kampala P.O. Box 7084, Uganda
- Stanley Tamusange Nkalubo
  - National Agricultural Research Organization (NARO), National Crops Resources Research Institute (NaCRRI), Kampala P.O. Box 7084, Uganda
- Ismail Siraj Kayondo
  - International Institute of Tropical Agriculture, PMB 5320, Oyo Rd., Ibadan 20001, Nigeria
- Gonçalo Silva
  - Natural Resources Institute, University of Greenwich, Central Avenue, Chatham Maritime, Kent ME4 4TB, UK
- Susan E. Seal
  - Natural Resources Institute, University of Greenwich, Central Avenue, Chatham Maritime, Kent ME4 4TB, UK
- Michael Hilary Otim
  - National Agricultural Research Organization (NARO), National Crops Resources Research Institute (NaCRRI), Kampala P.O. Box 7084, Uganda
29
Boldyreva LV, Andreyeva EN, Pindyurin AV. Position Effect Variegation: Role of the Local Chromatin Context in Gene Expression Regulation. Mol Biol 2022. [DOI: 10.1134/s0026893322030049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
30
Yadav Y, Sharma SN, Shakya DK. Detection of Tandem Repeats in DNA Sequences Using Short-Time Ramanujan Fourier Transform. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1583-1591. [PMID: 33493119 DOI: 10.1109/tcbb.2021.3053656] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Tandem repeats in genomic sequences are characterized by two or more contiguous copies of a pattern of nucleotides. The role of these repeats as molecular markers is well established in various genetic disorders, human evolution studies, DNA forensics and intron retention. In this work a computational method has been developed for the extraction of both exact and approximate tandem repeats. The proposed algorithm uses Ramanujan Fourier Transform (RFT) to identify periodicities in the DNA sequences. Since RFT estimates the period directly, rather than inferring it from the signal's spectrum, it provides a more sensitive and rapid detection of tandem repeats as compared to other available popular computational methods.
31
Huang T, Li J, Wang SM. Etiological roles of core promoter variation in triple-negative breast cancer. Genes Dis 2022; 10:228-238. [PMID: 37013029 PMCID: PMC10066267 DOI: 10.1016/j.gendis.2022.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Revised: 12/26/2021] [Accepted: 01/12/2022] [Indexed: 10/19/2022] Open
Abstract
Abnormal gene expression plays a key role in cancer development. A core promoter is located around the transcriptional start site. Through interaction between core promoter sequences and transcriptional factors, the core promoter controls transcriptional initiation. We hypothesized that in cancer, core promoter sequences could be mutated to interfere with the interaction with transcriptional factors, resulting in altered transcriptional initiation, abnormal gene expression, and cancer development. We used triple-negative breast cancer (TNBC) as a model to test our hypothesis. We collected genome-wide core promoter variants from 279 TNBC genomes. After extensive filtering of normal genomic polymorphism, we identified 19,427 recurrent somatic variants in 1,238 core promoters of 1,274 genes and 1,694 recurrent germline variants in 272 core promoters of 294 genes. Many of the affected genes were oncogenes and tumor suppressors. Analysis of RNA-seq data from the same patient cohort identified increased or decreased gene expression in 439 somatic and 85 germline variant-affected genes, and the results were validated by luciferase reporter assay. By comparing with the core promoter variation data from 610 unclassified breast cancers, we observed that core promoter variants in TNBC were highly TNBC-specific. We further identified the drugs targeting the genes with core promoter variation. Our study demonstrates that the core promoter is highly mutable in cancer, and can play etiological roles in TNBC and other types of cancer through influencing transcriptional initiation.
32
Annear DJ, Vandeweyer G, Sanchis-Juan A, Raymond FL, Kooy RF. Non-Mendelian inheritance patterns and extreme deviation rates of CGG repeats in autism. Genome Res 2022; 32:1967-1980. [PMID: 36351771 PMCID: PMC9808627 DOI: 10.1101/gr.277011.122] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/09/2022] [Accepted: 10/14/2022] [Indexed: 11/10/2022]
Abstract
As expansions of CGG short tandem repeats (STRs) are established as the genetic etiology of many neurodevelopmental disorders, we aimed to elucidate the inheritance patterns and role of CGG STRs in autism-spectrum disorder (ASD). By genotyping 6063 CGG STR loci in a large cohort of trios and quads with an ASD-affected proband, we determined an unprecedented rate of CGG repeat length deviation across a single generation. Although the concept of repeat length being linked to deviation rate was solidified, we show how shorter STRs display greater degrees of size variation. We observed that CGG STRs did not segregate by Mendelian principles but with a bias against longer repeats, which appeared to magnify as repeat length increased. Through logistic regression, we identified 19 genes that displayed significantly higher rates and degrees of CGG STR expansion within the ASD-affected probands (P < 1 × 10⁻⁵). This study not only highlights novel repeat expansions that may play a role in ASD but also reinforces the hypothesis that CGG STRs are specifically linked to human cognition.
Affiliation(s)
- Dale J. Annear
- Department of Medical Genetics, University of Antwerp, 2600 Antwerp, Belgium
| | - Geert Vandeweyer
- Department of Medical Genetics, University of Antwerp, 2600 Antwerp, Belgium
| | - Alba Sanchis-Juan
- NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, United Kingdom; Department of Haematology, University of Cambridge, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, United Kingdom
| | - F. Lucy Raymond
- NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, United Kingdom; Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, United Kingdom
| | - R. Frank Kooy
- Department of Medical Genetics, University of Antwerp, 2600 Antwerp, Belgium
| |
|
33
|
Comparative Analysis of Plasmodium falciparum Genotyping via SNP Detection, Microsatellite Profiling, and Whole-Genome Sequencing. Antimicrob Agents Chemother 2021; 66:e0116321. [PMID: 34694871 PMCID: PMC8765236 DOI: 10.1128/aac.01163-21] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Research efforts to combat antimalarial drug resistance rely on quick, robust, and sensitive methods to genetically characterize Plasmodium falciparum parasites. We developed a single-nucleotide polymorphism (SNP)-based genotyping method that can assess 33 drug resistance-conferring SNPs in dhfr, dhps, pfmdr1, pfcrt, and k13 in nine PCRs, performed directly from P. falciparum cultures or infected blood. We also optimized multiplexed fragment analysis and gel electrophoresis-based microsatellite typing methods using a set of five markers that can distinguish 12 laboratory strains of diverse geographical and temporal origin. We demonstrate how these methods can be applied to screen for the multidrug-resistant KEL1/PLA1/PfPailin (KelPP) lineage that has been sweeping across the Greater Mekong Subregion, verify parasite in vitro SNP-editing, identify novel recombinant genetic cross progeny, or cluster strains to infer their geographical origins. Results were compared with Illumina-based whole-genome sequence analysis that provides the most detailed sequence information but is cost-prohibitive. These adaptable, simple, and inexpensive methods can be easily implemented into routine genotyping of P. falciparum parasites in both laboratory and field settings.
|
34
|
The CDH1 c.1901C>T Variant: A Founder Variant in the Portuguese Population with Severe Impact in mRNA Splicing. Cancers (Basel) 2021; 13:cancers13174464. [PMID: 34503274 PMCID: PMC8430675 DOI: 10.3390/cancers13174464] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2021] [Revised: 08/26/2021] [Accepted: 08/30/2021] [Indexed: 12/25/2022] Open
Abstract
Simple Summary: An unexpectedly high number of early-onset diffuse gastric and lobular breast cancers in apparently unrelated families carrying the same CDH1 c.1901C>T variant (formerly known as missense p.A634V) in Northern Portugal suggested a founder effect in this region. We demonstrated that c.1901C>T is a truncating variant triggered by cryptic splicing, calculated its mutational age, and characterized the tumour spectrum and age of onset in affected families.
Abstract: Hereditary diffuse gastric cancer (HDGC) caused by CDH1 variants predisposes to early-onset diffuse gastric (DGC) and lobular breast cancer (LBC). In Northern Portugal, the unusually high number of HDGC cases in unrelated families carrying the c.1901C>T variant (formerly known as p.A634V) suggested this as a CDH1-founder variant. We aimed to demonstrate that c.1901C>T is a bona fide truncating variant inducing cryptic splicing, to calculate the timing of a potential founder effect, and to characterize tumour spectrum and age of onset in carrying families. The impact on splicing was proven by using carriers' RNA for PCR-cloning sequencing and allelic expression imbalance analysis with SNaPshot. Carriers and noncarriers were haplotyped for 12 polymorphic markers, and the decay of haplotype sharing (DHS) method was used to estimate the time to the most common ancestor of c.1901C>T. Clinical information from 58 carriers was collected and analysed. We validated the cryptic splice site within CDH1-exon 12, which was preferred over the canonical one in 100% of sequenced clones. Cryptic splicing induced an out-of-frame 37-bp deletion in exon 12, premature truncation (p.Ala634ProfsTer7), and consequently RNA mediated decay. The haplotypes carrying the c.1901C>T variant were found to share a common ancestor estimated at 490 years (95% Confidence Interval 445–10,900). Among 58 carriers (27 males (M)–31 females (F); 13–83 years), DGC occurred in 11 (18.9%; 4M–7F; average age 33 ± 12) and LBC in 6 females (19.4%; average age 50 ± 8). Herein, we demonstrated that the c.1901C>T variant is a loss-of-function splice-site variant that underlies the first CDH1-founder effect in Portugal. Knowledge of this founder effect will drive genetic testing of this specific variant in HDGC families in this geographical region and allow intrafamilial penetrance analysis and better estimation of variant-associated tumour risks, disease age of onset, and spectrum.
|
35
|
PolyG-DS: An ultrasensitive polyguanine tract-profiling method to detect clonal expansions and trace cell lineage. Proc Natl Acad Sci U S A 2021; 118:2023373118. [PMID: 34330826 DOI: 10.1073/pnas.2023373118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Polyguanine tracts (PolyGs) are short guanine homopolymer repeats that are prone to accumulating mutations when cells divide. This feature makes them especially suitable for cell lineage tracing, which has been exploited to detect and characterize precancerous and cancerous somatic evolution. PolyG genotyping, however, is challenging because of the inherent biochemical difficulties in amplifying and sequencing repetitive regions. To overcome this limitation, we developed PolyG-DS, a next-generation sequencing (NGS) method that combines the error-correction capabilities of duplex sequencing (DS) with enrichment of PolyG loci using CRISPR-Cas9-targeted genomic fragmentation. PolyG-DS markedly reduces technical artifacts by comparing the sequences derived from the complementary strands of each original DNA molecule. We demonstrate that PolyG-DS genotyping is accurate, reproducible, and highly sensitive, enabling the detection of low-frequency alleles (<0.01) in spike-in samples using a panel of only 19 PolyG markers. PolyG-DS replicated prior results based on PolyG fragment length analysis by capillary electrophoresis, and exhibited higher sensitivity for identifying clonal expansions in the nondysplastic colon of patients with ulcerative colitis. We illustrate the utility of this method for resolving the phylogenetic relationship among precancerous lesions in ulcerative colitis and for tracing the metastatic dissemination of ovarian cancer. PolyG-DS enables the study of tumor evolution without prior knowledge of tumor driver mutations and provides a tool to perform cost-effective and easily scalable ultra-accurate NGS-based PolyG genotyping for multiple applications in biology, genetics, and cancer research.
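The strand-comparison idea in this abstract can be illustrated with a minimal sketch. It is not the PolyG-DS pipeline: it assumes reads from one original molecule are already grouped by strand, aligned to equal length, and oriented the same way, and it handles substitutions only. The function names and the `min_fraction` threshold are hypothetical.

```python
from collections import Counter


def strand_consensus(reads, min_fraction=0.6):
    """Majority base per position for reads from one strand family.

    Positions where no base reaches min_fraction are reported as 'N'.
    """
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def duplex_consensus(top_reads, bottom_reads, min_fraction=0.6):
    """Keep a base only when both strand consensuses agree on it.

    Assumption: bottom-strand reads were already reverse-complemented
    into the top-strand orientation.
    """
    top = strand_consensus(top_reads, min_fraction)
    bottom = strand_consensus(bottom_reads, min_fraction)
    return "".join(
        a if a == b and a != "N" else "N" for a, b in zip(top, bottom)
    )


top_family = ["ACGGGGGT", "ACGGGGGT", "ACGGGGAT"]  # one read carries an error
print(duplex_consensus(top_family, ["ACGGGGGT", "ACGGGGGT"]))
print(duplex_consensus(top_family, ["ACGGGGAT", "ACGGGGAT"]))
```

In the second call the two strands disagree at one position, so that position is masked rather than called, which is the property that suppresses single-strand artifacts.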
|
36
|
Trost B, Loureiro LO, Scherer SW. Discovery of genomic variation across a generation. Hum Mol Genet 2021; 30:R174-R186. [PMID: 34296264 PMCID: PMC8490016 DOI: 10.1093/hmg/ddab209] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2021] [Revised: 07/09/2021] [Accepted: 07/19/2021] [Indexed: 11/12/2022] Open
Abstract
Over the past 30 years (the timespan of a generation), advances in genomics technologies have revealed tremendous and unexpected variation in the human genome and have provided increasingly accurate answers to long-standing questions of how much genetic variation exists in human populations and to what degree the DNA complement changes between parents and offspring. Tracking the characteristics of these inherited and spontaneous (or de novo) variations has been the basis of the study of human genetic disease. From genome-wide microarray and next-generation sequencing scans, we now know that each human genome contains over 3 million single nucleotide variants when compared with the ~ 3 billion base pairs in the human reference genome, along with roughly an order of magnitude more DNA—approximately 30 megabase pairs (Mb)—being ‘structurally variable’, mostly in the form of indels and copy number changes. Additional large-scale variations include balanced inversions (average of 18 Mb) and complex, difficult-to-resolve alterations. Collectively, ~1% of an individual’s genome will differ from the human reference sequence. When comparing across a generation, fewer than 100 new genetic variants are typically detected in the euchromatic portion of a child’s genome. Driven by increasingly higher-resolution and higher-throughput sequencing technologies, newer and more accurate databases of genetic variation (for instance, more comprehensive structural variation data and phasing of combinations of variants along chromosomes) of worldwide populations will emerge to underpin the next era of discovery in human molecular genetics.
Affiliation(s)
- Brett Trost
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Livia O Loureiro
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Stephen W Scherer
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada; McLaughlin Centre and Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| |
|
37
|
Huang Y, Liu C, Xiao C, Chen X, Han X, Yi S, Huang D. Mutation analysis of 28 autosomal short tandem repeats in the Chinese Han population. Mol Biol Rep 2021; 48:5363-5369. [PMID: 34213710 DOI: 10.1007/s11033-021-06522-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2021] [Accepted: 06/25/2021] [Indexed: 11/26/2022]
Abstract
Short tandem repeats (STRs) have been extensively used in forensic genetics. However, according to previous studies, the mutation rates of STRs are relatively high and are affected by many factors. Therefore, it is important to analyze STR mutations and determine the influence of underlying factors on STR mutation rates. Mutation rates of 28 autosomal STRs were determined from 8708 paternity testing cases in the Chinese Han population, and the relationships between STR mutation rates and population, sex, age, allele length and heterozygosity were investigated. A total of 279 mutations were observed at 27 loci in a total of 233,530 meiosis cases, including 273 (97.8%) one-step, 5 (1.8%) two-step and 1 (0.4%) three-step mutations. The overall average mutation rate was 1.19 × 10⁻³ (95% CI 1.06 × 10⁻³ to 1.34 × 10⁻³), ranging from 0 (TPOX) to 2.79 × 10⁻³ (D13S325). Mutation rate comparisons revealed statistically significant differences at several STRs among populations. Paternal mutations occurred more frequently than maternal mutations, at a ratio of 6.04:1, and the mutation rate tended to increase with paternal age. Moreover, our study revealed a bias towards contraction mutations for long alleles and expansion mutations for short alleles. No obvious bias was observed in the overall mutation direction. In addition, STR loci with higher expected heterozygosity (Hexp) tended to have higher mutation rates. This work revealed the relationships between STR mutation rates and several influencing factors, providing useful data and information for further research on STR mutations in forensic genetics.
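The overall rate quoted above is simply mutations divided by meioses, and a binomial confidence interval can be attached to it. The sketch below uses the Wilson score interval as one reasonable choice; the abstract does not state which interval method the authors used, so this is an assumption, and the helper name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(events, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = events / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half, centre + half


# Counts taken from the abstract: 279 mutations in 233,530 meioses.
mutations, meioses = 279, 233_530
rate = mutations / meioses
low, high = wilson_interval(mutations, meioses)
print(f"rate = {rate:.2e}, 95% CI = ({low:.2e}, {high:.2e})")
```

With these counts the result should land close to the reported rate and interval; small differences would be expected if a different interval method were used.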
Affiliation(s)
- Yujie Huang
- Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Cong Liu
- Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Chao Xiao
- Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Xiaoying Chen
- Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Xueli Han
- Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Shaohua Yi
- Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Daixin Huang
- Department of Forensic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
| |
|
38
|
Lei Y, Zhou Y, Price M, Song Z. Genome-wide characterization of microsatellite DNA in fishes: survey and analysis of their abundance and frequency in genome-specific regions. BMC Genomics 2021; 22:421. [PMID: 34098869 PMCID: PMC8186053 DOI: 10.1186/s12864-021-07752-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2021] [Accepted: 05/24/2021] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Microsatellite repeats are ubiquitous in organism genomes and play an important role in chromatin organization, regulation of gene activity, recombination and DNA replication. Although microsatellite distribution patterns have been studied in most phylogenetic lineages, they are unclear in fish species. RESULTS: Here, we present the first systematic examination of microsatellite distribution in coding and non-coding regions of 14 fish genomes. Our study showed that the number and type of microsatellites displayed nonrandom distribution for both intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation and that DNA replication slippage theories alone were insufficient to explain the distribution patterns. Our results showed that microsatellites are dominant in non-coding regions. The total number of microsatellites ranged from 78,378 to 1,012,084, and the relative density varied from 4925.76 bp/Mb to 25,401.97 bp/Mb. Overall, (A + T)-rich repeats were dominant. The dependence of repeat abundance on the length of the repeated unit (1-6 nt) showed a broadly similar decrease across species, whereas more tri-nucleotide repeats than tetra-nucleotide repeats were found in the exonic regions of most species. Moreover, the incidence of different repeat types appeared species- and genome-specific. These results highlight potential mechanisms for maintaining microsatellite distribution, such as selective forces and mismatch repair systems. CONCLUSIONS: Our data could be beneficial for studies of genome evolution and microsatellite DNA evolutionary dynamics, and could facilitate the exploration of microsatellite structure, function, composition mode and molecular marker development in these species.
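A survey of this kind rests on two operations: locating tandem repeats with a 1-6 nt unit, and expressing their total length as bp per Mb of sequence. The sketch below shows one simple way to do both with a regular expression. The single `min_repeats` threshold is an assumption for illustration; the abstract does not give the study's detection criteria, and dedicated tools usually apply motif-length-specific thresholds.

```python
import re


def find_microsatellites(seq, min_repeats=4):
    """Return (start, motif, copies) for perfect tandem repeats of 1-6 nt units.

    The lazy quantifier makes the regex try the shortest motif first, so
    'ATATATAT' is reported as (AT)4 rather than (ATAT)2.
    """
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


def relative_density(seq, hits):
    """Total repeat length in bp per Mb of sequence analysed."""
    covered = sum(len(motif) * copies for _, motif, copies in hits)
    return covered / len(seq) * 1_000_000


toy = "GGCATATATATCCAGCAGCAGCAGTT"  # toy sequence, not real genome data
hits = find_microsatellites(toy)
print(hits)
print(round(relative_density(toy, hits)))
```

The toy sequence is far too short for the density value to be meaningful; on a real assembly the same function would be applied per chromosome or per annotated region (exonic, intronic, intergenic).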
Affiliation(s)
- Yi Lei
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China
| | - Yu Zhou
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China
| | - Megan Price
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China
| | - Zhaobin Song
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China.
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China.
| |
|
39
|
Bonito M, D'Atanasio E, Trombetta B, Cannone F, Berti A, Cruciani F. Identification and molecular characterisation of an unusually short allele at the SE33 (ACTBP2) locus resulting in a putative tri-allelic pattern at a flanking marker. Forensic Sci Int Genet 2021; 54:102523. [PMID: 34006479 DOI: 10.1016/j.fsigen.2021.102523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2021] [Revised: 04/16/2021] [Accepted: 04/25/2021] [Indexed: 11/18/2022]
Affiliation(s)
- Maria Bonito
- Dipartimento di Biologia e Biotecnologie "C. Darwin", Sapienza Università di Roma, Rome, Italy
| | | | - Beniamino Trombetta
- Dipartimento di Biologia e Biotecnologie "C. Darwin", Sapienza Università di Roma, Rome, Italy
| | - Francesco Cannone
- Reparto CC Investigazioni Scientifiche di Roma, Sezione di Biologia, Rome, Italy
| | - Andrea Berti
- Reparto CC Investigazioni Scientifiche di Roma, Sezione di Biologia, Rome, Italy
| | - Fulvio Cruciani
- Dipartimento di Biologia e Biotecnologie "C. Darwin", Sapienza Università di Roma, Rome, Italy; Istituto di Biologia e Patologia Molecolari, CNR, Rome, Italy.
| |
|
40
|
Mesas A, Baldi R, González BA, Burgi V, Chávez A, Johnson WE, Marín JC. Past and Recent Effects of Livestock Activity on the Genetic Diversity and Population Structure of Native Guanaco Populations of Arid Patagonia. Animals (Basel) 2021; 11:ani11051218. [PMID: 33922526 PMCID: PMC8146674 DOI: 10.3390/ani11051218] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2021] [Revised: 03/19/2021] [Accepted: 03/23/2021] [Indexed: 11/25/2022] Open
Abstract
Simple Summary: Determining the impacts of human activities on natural populations is important for biodiversity conservation. In this paper, we study the past and more recent effects of urbanization and livestock activity on the genetic diversity and population structure of endemic guanaco populations of the arid Monte and Patagonian Steppe of central Argentina. Our results reveal that urbanization, the installation of fences, and the competition from sheep grazing coincided with the isolation of several guanaco populations, especially in areas with the highest intensity of livestock activity. However, our genetic analyses suggest that a more recent increase in connectivity among groups is occurring. Our results highlight the importance of implementing conservation management plans for natural populations in arid and human-intervened environments.
Abstract: Extensive livestock production and urbanization entail modifications of natural landscapes, including installation of fences, development of agriculture, urbanization of natural areas, and construction of roads and infrastructure that, together, impact native fauna. Here, we evaluate the diversity and genetic structure of endemic guanacos (Lama guanicoe) of the Monte and Patagonian Steppe of central Argentina, which have been reduced and displaced by sheep ranching and other impacts of human activities. Analyses of genetic variation of microsatellite loci and d-loop revealed high levels of genetic variation and latitudinal segregation of mitochondrial haplotypes. There were indications of at least two historical populations in the Monte and the Patagonian Steppe based on shared haplotypes and shared demographic history among localities. Currently, guanacos are structured into three groups that were probably reconnected relatively recently, possibly facilitated by a reduction of sheep and livestock in recent decades and a recovery of the guanaco populations. These results provide evidence of the genetic effects of livestock activity and urbanization on wild herbivore populations, which were possibly exacerbated by an arid environment with limited productive areas. The results highlight the importance of enacting conservation management plans to ensure the persistence of ancestral and ecologically functional populations of guanacos.
Affiliation(s)
- Andrés Mesas
- Laboratorio de Genómica y Biodiversidad, Departamento de Ciencias Básicas, Universidad del Bio-Bío, Chillán 3780000, Chile; (A.M.); (A.C.)
| | - Ricardo Baldi
- Instituto Patagónico para el Estudio de los Ecosistemas Continentales, Centro Nacional Patagónico, CONICET, Puerto Madryn U9120 ACD, Argentina; (R.B.); (V.B.)
- Wildlife Conservation Society, Buenos Aires C1426 AKC, Argentina
- South American Camelids Specialist Group, SSC, IUCN, Santiago 8330015, Chile;
| | - Benito A. González
- South American Camelids Specialist Group, SSC, IUCN, Santiago 8330015, Chile;
- Laboratorio de Ecología de Vida Silvestre, Facultad de Ciencias Forestales y de la Conservación de la Naturaleza, Universidad de Chile, Santiago 8330015, Chile
| | - Virginia Burgi
- Instituto Patagónico para el Estudio de los Ecosistemas Continentales, Centro Nacional Patagónico, CONICET, Puerto Madryn U9120 ACD, Argentina; (R.B.); (V.B.)
- Wildlife Conservation Society, Buenos Aires C1426 AKC, Argentina
- South American Camelids Specialist Group, SSC, IUCN, Santiago 8330015, Chile;
| | - Alexandra Chávez
- Laboratorio de Genómica y Biodiversidad, Departamento de Ciencias Básicas, Universidad del Bio-Bío, Chillán 3780000, Chile; (A.M.); (A.C.)
| | - Warren E. Johnson
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, 1500 Remount Road, Front Royal, VA 22630, USA;
| | - Juan C. Marín
- Laboratorio de Genómica y Biodiversidad, Departamento de Ciencias Básicas, Universidad del Bio-Bío, Chillán 3780000, Chile; (A.M.); (A.C.)
| |
|
41
|
Characterization of microsatellites in the endangered snow leopard based on the chromosome-level genome. MAMMAL RES 2021. [DOI: 10.1007/s13364-021-00563-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
42
|
Soh PXY, Hsu WT, Khatkar MS, Williamson P. Evaluation of genetic diversity and management of disease in Border Collie dogs. Sci Rep 2021; 11:6243. [PMID: 33737533 PMCID: PMC7973533 DOI: 10.1038/s41598-021-85262-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2020] [Accepted: 02/28/2021] [Indexed: 01/31/2023] Open
Abstract
Maintaining genetic diversity in dog breeds is an important consideration for the management of inherited diseases. We evaluated genetic diversity in Border Collies using molecular and genealogical methods, and examined changes to genetic diversity when carriers for Trapped Neutrophil Syndrome (TNS) and Neuronal Ceroid Lipofuscinosis (NCL) are removed from the genotyped population. Genotype data for 255 Border Collies and a pedigree database of 83,996 Border Collies were used for analysis. Molecular estimates revealed a mean multi-locus heterozygosity (MLH) of 0.311 (SD 0.027), 20.79% of the genome consisted of runs of homozygosity (ROH) > 1 Mb, effective population size (Ne) was 84.7, and mean inbreeding (F) was 0.052 (SD 0.083). For 227 genotyped Border Collies that had available pedigree information (GenoPed), molecular and pedigree estimates of diversity were compared. A reference population (dogs born between 2005 and 2015, inclusive; N = 13,523; RefPop) and their ancestors (N = 12,478) were used to evaluate the diversity of the population that is contributing to the current generation. The reference population had a Ne of 123.5, a mean F of 0.095 (SD 0.082), 2276 founders (f), 205.5 effective founders (fe), 28 effective ancestors (fa) and 10.65 (SD 2.82) founder genomes (Ng). Removing TNS and NCL carriers from the genotyped population had a small impact on diversity measures (ROH > 1 Mb, MLH, heterozygosity); however, there was a loss of > 10% minor allele frequency for 89 SNPs around the TNS mutation (maximum loss of 12.7%), and a loss of > 5% for 5 SNPs around the NCL mutation (maximum 5.18%). A common ancestor was identified for 38 TNS-affected dogs and 64 TNS carriers, and a different common ancestor was identified for 33 NCL-affected dogs and 28 carriers, with some overlap of prominent individuals between both pedigrees. Overall, Border Collies have a high level of genetic diversity compared to other breeds.
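Two of the diversity measures named in this abstract have compact standard definitions: the effective number of founders is the reciprocal of the summed squared founder contributions, fe = 1 / Σ pᵢ², and multi-locus heterozygosity is the fraction of an individual's called loci that are heterozygous. The sketch below implements those textbook definitions on toy inputs; it is not the authors' software, and the data layout (lists of contributions, allele pairs with `None` for missing calls) is an assumption.

```python
def effective_founders(contributions):
    """fe = 1 / sum(p_i^2), with p_i the proportional genetic contribution
    of founder i to the reference population."""
    total = sum(contributions)
    return 1.0 / sum((c / total) ** 2 for c in contributions)


def multilocus_heterozygosity(genotypes):
    """Fraction of called loci at which the two alleles differ.

    genotypes: list of (allele1, allele2) tuples, or None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    return sum(1 for a, b in called if a != b) / len(called)


# Four equally contributing founders behave like four effective founders.
print(effective_founders([1, 1, 1, 1]))
# Unequal contributions reduce fe below the raw founder count of three.
print(effective_founders([0.5, 0.25, 0.25]))
# Toy individual: two heterozygous loci out of four called.
print(multilocus_heterozygosity([("A", "G"), ("C", "C"), None, ("T", "T"), ("A", "C")]))
```

This is why the reference population above can have 2276 founders but only 205.5 effective founders: a small share of founders accounts for most of the genetic contribution.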
Affiliation(s)
- Pamela Xing Yi Soh
- School of Life and Environmental Sciences, Faculty of Science, The University of Sydney, Sydney, NSW 2006, Australia
| | - Wei Tse Hsu
- School of Life and Environmental Sciences, Faculty of Science, The University of Sydney, Sydney, NSW 2006, Australia
| | - Mehar Singh Khatkar
- Sydney School of Veterinary Science, Faculty of Science, The University of Sydney, Sydney, NSW 2006, Australia
| | - Peter Williamson
- School of Life and Environmental Sciences, Faculty of Science, The University of Sydney, Sydney, NSW 2006, Australia
| |
|
43
|
Characterizing Repeats in Two Whole-Genome Amplification Methods in the Reniform Nematode Genome. Int J Genomics 2021; 2021:5532885. [PMID: 33748264 PMCID: PMC7960049 DOI: 10.1155/2021/5532885] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2021] [Accepted: 02/26/2021] [Indexed: 11/17/2022] Open
Abstract
One of the major problems in U.S. and global cotton production is the damage caused by the reniform nematode, Rotylenchulus reniformis. Amplification of DNA from single nematodes for further molecular analysis can be challenging. In this research, two whole-genome amplification (WGA) methods were evaluated for their efficiencies in DNA amplification from a single reniform nematode. The WGA was carried out using both REPLI-g Mini and Midi kits, and the GenomePlex single cell whole-genome amplification kit. Sequence analysis produced 4 Mb and 12 Mb of genomic sequences for the reniform nematode using REPLI-g and SIGMA libraries. These sequences were assembled into 28,784 and 24,508 contigs, respectively, for REPLI-g and SIGMA libraries. The most abundant repeats in both libraries were low-complexity repeats; the least abundant were satellites in the REPLI-g library and RTE/BOV-B in the SIGMA library. The same kinds of repeats were observed in both libraries; however, the SIGMA library had four other repeat elements (Penelope (long interspersed nucleotide element (LINE)), RTE/BOV-B (LINE), PiggyBac, and Mirage/P-element/Transib), which were not seen in the REPLI-g library. DNA transposons were also found in both libraries. Both reniform nematode 18S rRNA variants (RN_VAR1 and RN_VAR2) could easily be identified in both libraries. This research has therefore demonstrated that both WGA methods can be used to amplify gDNA isolated from single reniform nematodes.
|
44
|
Bereczky Z, Gindele R, Fiatal S, Speker M, Miklós T, Balogh L, Mezei Z, Szabó Z, Ádány R. Age and Origin of the Founder Antithrombin Budapest 3 (p.Leu131Phe) Mutation; Its High Prevalence in the Roma Population and Its Association With Cardiovascular Diseases. Front Cardiovasc Med 2021; 7:617711. [PMID: 33614741 PMCID: PMC7892435 DOI: 10.3389/fcvm.2020.617711] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2020] [Accepted: 12/22/2020] [Indexed: 11/13/2022] Open
Abstract
Background: Antithrombin (AT) is one of the most important regulators of hemostasis. AT Budapest 3 (ATBp3) is a prevalent type II heparin-binding site (IIHBS) deficiency due to a founder effect. Thrombosis is a complex disease including arterial (ATE) and venous thrombotic events (VTE), and the Roma population, the largest ethnic minority in Europe, has increased susceptibility to these diseases, partly due to their unfavorable genetic load. We aimed to calculate the age and origin of ATBp3 and to explore whether its frequency is higher in the Roma population than in the general population from the corresponding geographical area. We investigated the association of ATBp3 with thrombotic events in well-defined patient populations in order to refine the recommendation on when testing for ATBp3 is useful. Methods and Results: Prevalence of ATBp3, investigated in large samples (n = 1,000 and 1,185 for general Hungarian and Roma populations, respectively), was considerably high, almost 3%, among Roma, and the founder effect was confirmed in their samples, while it was absent in the Hungarian general population. The age of ATBp3, as calculated by analysis of 8 short tandem repeat sequences surrounding SERPINC1, was dated back to the 17th century, when Roma migration in Central and Eastern Europe occurred. In our IIHBS cohort (n = 230), VTE was registered in almost all ATBp3 homozygotes (93%) and in 44% of heterozygotes. ATE occurred with lower frequency in ATBp3 (around 6%); it was rather associated with AT Basel (44%). All patients with ATE were young at the time of diagnosis. Upon investigating consecutive young (<40 years) patients with ATE (n = 92) and VTE (n = 110), the presence of ATBp3 was remarkable. Conclusions: ATBp3, a 400-year-old founder mutation, is prevalent in the Roma population and its Roma origin can reasonably be assumed. By demonstrating the presence of ATBp3 in ATE patients, we draw attention to type IIHBS AT deficiency as a possible background of not only VTE but also ATE, especially in selected populations such as young patients without advanced atherosclerosis. We recommend including the investigation of ATBp3 as part of thrombosis risk assessment and stratification in Roma individuals.
Affiliation(s)
- Zsuzsanna Bereczky
- Division of Clinical Laboratory Science, Department of Laboratory Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Réka Gindele
- Division of Clinical Laboratory Science, Department of Laboratory Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Szilvia Fiatal
- Department of Public Health and Epidemiology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Marianna Speker
- Division of Clinical Laboratory Science, Department of Laboratory Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Tünde Miklós
- Division of Clinical Laboratory Science, Department of Laboratory Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- László Balogh
- Department of Cardiology and Cardiovascular Surgery, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Zoltán Mezei
- Department of Laboratory Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Zsuzsanna Szabó
- Division of Clinical Laboratory Science, Department of Laboratory Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Róza Ádány
- Department of Public Health and Epidemiology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary; Magyar Tudományos Akadémia - Debrecen Public Health Research Group, University of Debrecen, Debrecen, Hungary
45
Ghazi MG, Sharma SP, Tuboi C, Angom S, Gurumayum T, Nigam P, Hussain SA. Population genetics and evolutionary history of the endangered Eld's deer (Rucervus eldii) with implications for planning species recovery. Sci Rep 2021; 11:2564. [PMID: 33510319 PMCID: PMC7844053 DOI: 10.1038/s41598-021-82183-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/17/2020] [Accepted: 01/18/2021] [Indexed: 01/30/2023] Open
Abstract
Eld's deer (Rucervus eldii), with three recognised subspecies (R. e. eldii, R. e. thamin, and R. e. siamensis), represents one of the most threatened cervids found in Southeast Asia. The species has experienced considerable range contractions and local extinctions owing to habitat loss and fragmentation, hunting, and illegal trade across its distribution range over the last century. Understanding the patterns of genetic variation is crucial for planning effective conservation strategies. This study investigated the phylogeography, divergence events and systematics of Eld's deer subspecies using the largest mtDNA dataset compiled to date. We also analysed the genetic structure and demographic history of R. e. eldii using 19 microsatellite markers. Our results showed that R. e. siamensis exhibits two divergent mtDNA lineages (mainland and Hainan Island), which diverged around 0.2 Mya (95% HPD 0.1-0.2), possibly driven by the fluctuating sea levels of the Early Holocene period. The divergence between R. e. eldii and R. e. siamensis occurred around 0.4 Mya (95% HPD 0.3-0.5), potentially associated with adaptation to a warm and humid climate with the open grassland vegetation that predominated in the region. Furthermore, R. e. eldii exhibits low levels of genetic diversity and a small contemporary effective population size (median = 7, 95% CI 4.7-10.8) with widespread historical genetic bottlenecks, which accentuates its vulnerability to inbreeding and extinction. Based on the observed significant evolutionary and systematic distance between Eld's deer and other species of the genus Rucervus, we propose to classify Eld's deer (Cervus eldii) in the genus Cervus, which is congruent with previous phylogenetic studies. This study provides important conservation implications for directing the ongoing population recovery programs and planning future conservation strategies.
Affiliation(s)
- Surya Prasad Sharma
- Wildlife Institute of India, Chandrabani, Post Box #18, Dehra Dun, Uttarakhand, 248002, India
- Chongpi Tuboi
- Wildlife Institute of India, Chandrabani, Post Box #18, Dehra Dun, Uttarakhand, 248002, India
- Sangeeta Angom
- Wildlife Institute of India, Chandrabani, Post Box #18, Dehra Dun, Uttarakhand, 248002, India
- Tennison Gurumayum
- Wildlife Institute of India, Chandrabani, Post Box #18, Dehra Dun, Uttarakhand, 248002, India
- Parag Nigam
- Wildlife Institute of India, Chandrabani, Post Box #18, Dehra Dun, Uttarakhand, 248002, India
- Syed Ainul Hussain
- Wildlife Institute of India, Chandrabani, Post Box #18, Dehra Dun, Uttarakhand, 248002, India
46
Mitra I, Huang B, Mousavi N, Ma N, Lamkin M, Yanicky R, Shleizer-Burko S, Lohmueller KE, Gymrek M. Patterns of de novo tandem repeat mutations and their role in autism. Nature 2021; 589:246-250. [PMID: 33442040 PMCID: PMC7810352 DOI: 10.1038/s41586-020-03078-7] [Citation(s) in RCA: 109] [Impact Index Per Article: 27.3] [Received: 02/14/2020] [Accepted: 11/23/2020] [Indexed: 12/19/2022]
Abstract
Autism Spectrum Disorder (ASD) is an early onset developmental disorder characterized by deficits in communication and social interaction and restrictive or repetitive behaviors [1,2]. Family studies demonstrate that ASD has a significant genetic basis with contributions both from inherited and de novo variants [3,4]. It has been estimated that de novo mutations may contribute to 30% of all simplex cases, in which only a single child is affected per family [5]. Tandem repeats (TRs), defined here as 1-20 bp sequences repeated consecutively, comprise one of the largest sources of de novo mutations in humans [6]. TR expansions are implicated in dozens of neurological and psychiatric disorders [7]. Yet, de novo TR mutations have not been characterized on a genome-wide scale, and their contribution to ASD remains unexplored. We develop novel bioinformatics methods for identifying and prioritizing de novo TR mutations from sequencing data and then perform a genome-wide characterization of de novo TR mutations in ASD-affected probands and unaffected siblings. Compared to recent work on TRs in ASD [8], we explicitly infer mutation events and their precise changes in repeat number, and primarily focus on more prevalent stepwise copy number changes rather than large expansions. Our results demonstrate a significant genome-wide excess of TR mutations in ASD probands. Mutations in probands tend to be larger, enriched in fetal brain regulatory regions, and predicted to be more evolutionarily deleterious. Overall, our results highlight the importance of considering repeat variants in future studies of de novo mutations.
Affiliation(s)
- Ileena Mitra
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
- Bonnie Huang
- Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Nima Mousavi
- Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA, USA
- Nichole Ma
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Michael Lamkin
- Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Richard Yanicky
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Melissa Gymrek
- Department of Medicine, University of California San Diego, La Jolla, CA, USA; Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
47
Gupta H, Chandratre K, Sinha S, Huang T, Wu X, Cui J, Zhang MQ, Wang SM. Highly diversified core promoters in the human genome and their effects on gene expression and disease predisposition. BMC Genomics 2020; 21:842. [PMID: 33256598 PMCID: PMC7706239 DOI: 10.1186/s12864-020-07222-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/19/2020] [Accepted: 11/09/2020] [Indexed: 12/14/2022] Open
Abstract
Background: The core promoter controls transcription initiation. However, little is known about core promoter diversity in the human genome and its relationship with diseases. We hypothesized that, as a functionally important component of the genome, the core promoter could be under evolutionary selection, as reflected by its high diversification in order to adjust gene expression for better adaptation to different environments. Results: Applying the “Exome-based Variant Detection in Core-promoters” method, we analyzed human core-promoter diversity using 2682 exome data sets from 25 worldwide human populations sequenced by the 1000 Genomes Project. Collectively, we identified 31,996 variants in the core promoter region (−100 to +100) of 12,509 human genes (https://dbhcpd.fhs.um.edu.mo). Analysis of this rich variation data identified highly ethnic-specific patterns of core promoter variation between different ethnic populations, the genes with highly variable core promoters, the motifs affected by the variants, and the functional pathways involved. eQTL testing revealed that 12% of core promoter variants can significantly alter gene expression levels. By comparison with GWAS data, we located 163 variants among the GWAS-identified trait associations with multiple diseases; half of these variants can alter gene expression. Conclusion: Data from our study reveal the highly diversified nature of the core promoter in the human genome and highlight that core promoter variation could play important roles not only in gene expression regulation but also in disease predisposition. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-020-07222-5.
Affiliation(s)
- Hemant Gupta
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Khyati Chandratre
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Siddharth Sinha
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Teng Huang
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Xiaobing Wu
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Jian Cui
- Eppley Institute for Cancer Research, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Michael Q Zhang
- Department of Biological Sciences, Center for Systems Biology, University of Texas at Dallas, Richardson, TX, 75080, USA
- San Ming Wang
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
48
Balzano E, Pelliccia F, Giunta S. Genome (in)stability at tandem repeats. Semin Cell Dev Biol 2020; 113:97-112. [PMID: 33109442 DOI: 10.1016/j.semcdb.2020.10.003] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 04/06/2020] [Revised: 09/26/2020] [Accepted: 10/10/2020] [Indexed: 12/12/2022]
Abstract
Repeat sequences account for over half of the human genome and represent a significant source of variation that underlies physiological and pathological states. Yet, their study has been hindered by limitations in short-read sequencing technology and difficulties in assembly. An important category of repetitive DNA in the human genome comprises tandem repeats (TRs), where repetitive units are arranged in a head-to-tail pattern. Compared to other regions of the genome, TRs carry a 10- to 10,000-fold higher mutation rate. There are several mutagenic mechanisms that can give rise to this propensity toward instability, but their precise contribution remains speculative. Given the high degree of homology between these sequences and their arrangement in tandem, once damaged, TRs have an intrinsic propensity to undergo aberrant recombination with non-allelic exchange and generate harmful rearrangements that may undermine the stability of the entire genome. The dynamic mutagenesis at TRs has been found to underlie individual polymorphism associated with neurodegenerative and neuromuscular disorders, as well as complex genetic diseases like cancer and diabetes. Here, we review our current understanding of the surveillance and repair mechanisms operating within these regions, and we describe how alterations in these protective processes can readily trigger mutational signatures found at TRs, ultimately resulting in the pathological correlation between TR instability and human diseases. Finally, we provide a viewpoint to counter the detrimental effects that TRs pose in light of their selection and conservation, as important drivers of human evolution.
Affiliation(s)
- Elisa Balzano
- Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy
- Franca Pelliccia
- Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy
- Simona Giunta
- The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA; Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy
49
Camacho-Sanchez M, Velo-Antón G, Hanson JO, Veríssimo A, Martínez-Solano Í, Marques A, Moritz C, Carvalho SB. Comparative assessment of range-wide patterns of genetic diversity and structure with SNPs and microsatellites: A case study with Iberian amphibians. Ecol Evol 2020; 10:10353-10363. [PMID: 33072264 PMCID: PMC7548196 DOI: 10.1002/ece3.6670] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/21/2020] [Accepted: 07/22/2020] [Indexed: 11/11/2022] Open
Abstract
Reduced representation genome sequencing has popularized the application of single nucleotide polymorphisms (SNPs) to address evolutionary and conservation questions in nonmodel organisms. Patterns of genetic structure and diversity based on SNPs often diverge from those obtained with microsatellites to different degrees, but few studies have explicitly compared their performance under similar sampling regimes in a shared analytical framework. We compared range‐wide patterns of genetic structure and diversity in two amphibians endemic to the Iberian Peninsula, Hyla molleri and Pelobates cultripes, based on microsatellite (18 and 14 loci) and SNP (15,412 and 33,140 loci) datasets of comparable sample size and spatial extent. Model‐based clustering analyses with STRUCTURE revealed minor differences in genetic structure between marker types, but the inferred optimal number of populations (K) was inconsistent. SNPs yielded more repeatable and less admixed ancestries with increasing K compared to microsatellites. Genetic diversity was weakly correlated between marker types, with SNPs providing a better representation of southern refugia and of gradients of genetic diversity congruent with the demographic history of both species. Our results suggest that the larger number of loci in a SNP dataset can provide more reliable inferences of patterns of genetic structure and diversity than a typical microsatellite dataset, at least at the spatial and temporal scales investigated.
Affiliation(s)
- Miguel Camacho-Sanchez
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Guillermo Velo-Antón
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Jeffrey O Hanson
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Ana Veríssimo
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Adam Marques
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Craig Moritz
- Centre for Biodiversity Analysis and Research School of Biology The Australian National University Canberra ACT Australia
- Sílvia B Carvalho
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
50
Mathema VB, Nakeesathit S, White NJ, Dondorp AM, Imwong M. Genome-wide microsatellite characteristics of five human Plasmodium species, focusing on Plasmodium malariae and P. ovale curtisi. Parasite 2020; 27:34. [PMID: 32410726 PMCID: PMC7227371 DOI: 10.1051/parasite/2020034] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/09/2019] [Accepted: 04/30/2020] [Indexed: 12/16/2022] Open
Abstract
Microsatellites can be utilized to explore genotypes, population structure, and other genomic features of eukaryotes. Systematic characterization of microsatellites has not been a focus for several species of Plasmodium, including P. malariae and P. ovale, as the majority of malaria elimination programs are focused on P. falciparum and, to a lesser extent, P. vivax. Here, five human malaria species (P. falciparum, P. vivax, P. malariae, P. ovale curtisi, and P. knowlesi) were investigated with the aim of conducting in-depth categorization of microsatellites for P. malariae and P. ovale curtisi. Investigation of reference genomes for microsatellites with unit motifs of 1-10 base pairs indicates high diversity among the five Plasmodium species. Plasmodium malariae, with the largest genome size, displays the second highest microsatellite density (1421 No./Mbp; 5% coverage) next to P. falciparum (3634 No./Mbp; 12% coverage). The lowest microsatellite density was observed in P. vivax (773 No./Mbp; 2% coverage). A, AT, and AAT are the most commonly repeated motifs in the Plasmodium species. For P. malariae and P. ovale curtisi, microsatellite-related sequences are observed in approximately 18-29% of coding sequences (CDS). Lysine, asparagine, and glutamic acid are most frequently coded by microsatellite-related CDS. The majority of these CDS could be related to the gene ontology terms "cell parts," "binding," "developmental processes," and "metabolic processes." The present study provides a comprehensive overview of microsatellite distribution and can assist in the planning and development of potentially useful genetic tools for further investigation of P. malariae and P. ovale curtisi epidemiology.
Affiliation(s)
- Vivek Bhakta Mathema
- Department of Molecular Tropical Medicine and Genetics, Faculty of Tropical Medicine, Mahidol University 10400 Bangkok Thailand
- Supatchara Nakeesathit
- Mahidol–Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University 10400 Bangkok Thailand
- Nicholas J. White
- Mahidol–Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University 10400 Bangkok Thailand
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford OX1 2JD Oxford United Kingdom
- Arjen M. Dondorp
- Mahidol–Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University 10400 Bangkok Thailand
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford OX1 2JD Oxford United Kingdom
- Mallika Imwong
- Department of Molecular Tropical Medicine and Genetics, Faculty of Tropical Medicine, Mahidol University 10400 Bangkok Thailand