1
Prentout D, Bykova D, Hoge C, Hooper DM, McDiarmid CS, Wu F, Griffith SC, de Manuel M, Przeworski M. Conservation of mutation and recombination parameters between mammals and zebra finch. bioRxiv 2024:2024.09.05.611523. [PMID: 39282267 PMCID: PMC11398497 DOI: 10.1101/2024.09.05.611523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2024]
Abstract
Most of our understanding of the fundamental processes of mutation and recombination stems from a handful of disparate model organisms and pedigree studies of mammals, with little known about other vertebrates. To gain a broader comparative perspective, we focused on the zebra finch (Taeniopygia castanotis), which, like other birds, differs from mammals in its karyotype (which includes many micro-chromosomes), in the mechanism by which recombination is directed to the genome, and in aspects of ontogenesis. We collected genome sequences from three-generation pedigrees that provide information about 80 meioses, inferring 202 single-point de novo mutations, 1,174 crossovers, and 275 non-crossovers. On that basis, we estimated a sex-averaged mutation rate of 5.0 × 10⁻⁹ per base pair per generation, on par with mammals that have a similar generation time. Also as in mammals, we found a paternal germline mutation bias at later stages of gametogenesis (of 1.7 to 1) but no discernible difference between sexes in early development. We also examined recombination patterns, and found that the sex-averaged crossover rate on macro-chromosomes (1.05 cM/Mb) is again similar to values observed in mammals, as is the spatial distribution of crossovers, with a pronounced enrichment near telomeres. In contrast, non-crossover rates are more uniformly distributed. On micro-chromosomes, sex-averaged crossover rates are substantially higher (4.21 cM/Mb), as expected from crossover homeostasis, and both crossover and non-crossover events are more uniformly distributed. At a finer scale, recombination events overlap CpG islands more often than expected by chance, as expected in the absence of PRDM9. Despite differences in the mechanism by which recombination events are specified and the presence of many micro-chromosomes, estimates of the degree of GC-biased gene conversion (59%), the mean non-crossover conversion tract length (~23 bp), and the non-crossover to crossover ratio (6.7:1) are all comparable to those reported in primates and mice. The conservation of mutation and recombination properties from zebra finch to mammals suggests that these processes have evolved under stabilizing selection.
Affiliation(s)
- Daria Bykova
  - Dept. of Biological Sciences, Columbia University
- Carla Hoge
  - Dept. of Biological Sciences, Columbia University
- Daniel M Hooper
  - Institute for Comparative Genomics and Richard Gilder Graduate School, American Museum of Natural History, New York, New York, USA
- Callum S McDiarmid
  - School of Natural Sciences, Macquarie University, Sydney, New South Wales, Australia
- Felix Wu
  - Dept. of Systems Biology, Columbia University
- Simon C Griffith
  - School of Natural Sciences, Macquarie University, Sydney, New South Wales, Australia
- Molly Przeworski
  - Dept. of Biological Sciences, Columbia University
  - Dept. of Systems Biology, Columbia University

2
Iyengar BR, Grandchamp A, Bornberg-Bauer E. How antisense transcripts can evolve to encode novel proteins. Nat Commun 2024; 15:6187. [PMID: 39043684 PMCID: PMC11266595 DOI: 10.1038/s41467-024-50550-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 07/12/2024] [Indexed: 07/25/2024] [Open access]
Abstract
Protein-coding features can emerge de novo in non-coding transcripts, resulting in the emergence of new protein-coding genes. Studies across many species show that a large fraction of evolutionarily novel non-coding RNAs have an antisense overlap with protein-coding genes. The open reading frames (ORFs) in these antisense RNAs could also overlap with existing ORFs. In this study, we investigate how the evolution of an ORF could be constrained by its overlap with an existing ORF in three different reading frames. Using a combination of mathematical modeling and genome/transcriptome data analysis in two different model organisms, we show that antisense overlap can increase the likelihood of ORF emergence and reduce the likelihood of ORF loss, especially in one of the three reading frames. In addition to rationalising the repeatedly reported prevalence of de novo emerged genes in antisense transcripts, our work also provides a generic modeling and analytical framework that can be used to understand the evolution of antisense genes.
Affiliation(s)
- Bharat Ravi Iyengar
  - Institute for Evolution and Biodiversity, University of Münster, Hüfferstrasse 1, Münster, Germany
- Anna Grandchamp
  - Institute for Evolution and Biodiversity, University of Münster, Hüfferstrasse 1, Münster, Germany
  - Aix-Marseille Université, INSERM, TAGC, Marseille, France
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Hüfferstrasse 1, Münster, Germany
  - Department of Protein Evolution, Max Planck Institute for Biology Tübingen, Max-Planck-Ring 5, Tübingen, Germany

3
Lynch M, Ali F, Lin T, Wang Y, Ni J, Long H. The divergence of mutation rates and spectra across the Tree of Life. EMBO Rep 2023; 24:e57561. [PMID: 37615267 PMCID: PMC10561183 DOI: 10.15252/embr.202357561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 08/01/2023] [Accepted: 08/02/2023] [Indexed: 08/25/2023] [Open access]
Abstract
Owing to advances in genome sequencing, genome stability has become one of the most scrutinized cellular traits across the Tree of Life. Despite its centrality to all things biological, the mutation rate (per nucleotide site per generation) ranges over three orders of magnitude among species and several-fold within individual phylogenetic lineages. Within all major organismal groups, mutation rates scale negatively with the effective population size of a species and with the amount of functional DNA in the genome. This relationship is most parsimoniously explained by the drift-barrier hypothesis, which postulates that natural selection typically operates to reduce mutation rates until further improvement is thwarted by the power of random genetic drift. Despite this constraint, the molecular mechanisms underlying DNA replication fidelity and repair are free to wander, provided the performance of the entire system is maintained at the prevailing level. The evolutionary flexibility of the mutation rate bears on the resolution of several prior conundrums in phylogenetic and population-genetic analysis and raises challenges for future applications in these areas.
Affiliation(s)
- Michael Lynch
  - Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ, USA
- Farhan Ali
  - Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ, USA
- Tongtong Lin
  - Institute of Evolution and Marine Biodiversity, KLMME, Ocean University of China, Qingdao, China
- Yaohai Wang
  - Institute of Evolution and Marine Biodiversity, KLMME, Ocean University of China, Qingdao, China
- Jiahao Ni
  - Institute of Evolution and Marine Biodiversity, KLMME, Ocean University of China, Qingdao, China
- Hongan Long
  - Institute of Evolution and Marine Biodiversity, KLMME, Ocean University of China, Qingdao, China

4
Lucaci AG, Zehr JD, Enard D, Thornton JW, Kosakovsky Pond SL. Evolutionary Shortcuts via Multinucleotide Substitutions and Their Impact on Natural Selection Analyses. Mol Biol Evol 2023; 40:msad150. [PMID: 37395787 PMCID: PMC10336034 DOI: 10.1093/molbev/msad150] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/27/2023] [Revised: 06/15/2023] [Accepted: 06/26/2023] [Indexed: 07/04/2023] [Open access]
Abstract
Inference and interpretation of evolutionary processes, in particular of the types and targets of natural selection affecting coding sequences, are critically influenced by the assumptions built into statistical models and tests. If certain aspects of the substitution process (even when they are not of direct interest) are presumed absent or are modeled with too crude of a simplification, estimates of key model parameters can become biased, often systematically, and lead to poor statistical performance. Previous work established that failing to accommodate multinucleotide (or multihit, MH) substitutions strongly biases dN/dS-based inference towards false-positive inferences of diversifying episodic selection, as does failing to model variation in the rate of synonymous substitution (SRV) among sites. Here, we develop an integrated analytical framework and software tools to simultaneously incorporate these sources of evolutionary complexity into selection analyses. We found that both MH and SRV are ubiquitous in empirical alignments, and incorporating them has a strong effect on whether or not positive selection is detected (1.4-fold reduction) and on the distributions of inferred evolutionary rates. With simulation studies, we show that this effect is not attributable to reduced statistical power caused by using a more complex model. After a detailed examination of 21 benchmark alignments and a new high-resolution analysis showing which parts of the alignment provide support for positive selection, we show that MH substitutions occurring along shorter branches in the tree explain a significant fraction of discrepant results in selection detection. Our results add to the growing body of literature which examines decades-old modeling assumptions (including MH) and finds them to be problematic for comparative genomic data analysis. 
Because multinucleotide substitutions have a significant impact on natural selection detection even at the level of an entire gene, we recommend that selection analyses of this type consider their inclusion as a matter of routine. To facilitate this procedure, we developed, implemented, and benchmarked a simple and well-performing model testing selection detection framework able to screen an alignment for positive selection with two biologically important confounding processes: site-to-site synonymous rate variation, and multinucleotide instantaneous substitutions.
Affiliation(s)
- Alexander G Lucaci
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- Jordan D Zehr
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- David Enard
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona
- Joseph W Thornton
  - Department of Human Genetics, University of Chicago, Chicago, Illinois
  - Department of Ecology & Evolution, University of Chicago, Chicago, Illinois

5
Iyengar BR, Bornberg-Bauer E. Neutral Models of De Novo Gene Emergence Suggest that Gene Evolution has a Preferred Trajectory. Mol Biol Evol 2023; 40:msad079. [PMID: 37011142 PMCID: PMC10118301 DOI: 10.1093/molbev/msad079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2023] [Revised: 03/01/2023] [Accepted: 03/28/2023] [Indexed: 04/05/2023] [Open access]
Abstract
New protein coding genes can emerge from genomic regions that previously did not contain any genes, via a process called de novo gene emergence. To synthesize a protein, DNA must be transcribed as well as translated. Both processes need certain DNA sequence features. Stable transcription requires promoters and a polyadenylation signal, while translation requires at least an open reading frame. We develop mathematical models based on mutation probabilities, and the assumption of neutral evolution, to find out how quickly genes emerge and are lost. We also investigate the effect of the order by which DNA features evolve, and if sequence composition is biased by mutation rate. We rationalize how genes are lost much more rapidly than they emerge, and how they preferentially arise in regions that are already transcribed. Our study not only answers some fundamental questions on the topic of de novo emergence but also provides a modeling framework for future studies.
Affiliation(s)
- Bharat Ravi Iyengar
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
  - Department of Protein Evolution, Max Planck Institute for Biology Tübingen, Tübingen, Germany

6
Silva SR, Miranda VFO, Michael TP, Płachno BJ, Matos RG, Adamec L, Pond SLK, Lucaci AG, Pinheiro DG, Varani AM. The phylogenomics and evolutionary dynamics of the organellar genomes in carnivorous Utricularia and Genlisea species (Lentibulariaceae). Mol Phylogenet Evol 2023; 181:107711. [PMID: 36693533 DOI: 10.1016/j.ympev.2023.107711] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/03/2022] [Revised: 01/13/2023] [Accepted: 01/18/2023] [Indexed: 01/22/2023]
Abstract
Utricularia and Genlisea are highly specialized carnivorous plants whose phylogenetic history has been poorly explored using phylogenomic methods. Additional sampling and genomic data are needed to advance our phylogenetic and taxonomic knowledge of this group of plants. Within a comparative framework, we present a characterization of plastome (PT) and mitochondrial (MT) genes of 26 Utricularia and six Genlisea species, with representatives of all subgenera and growth habits. All PT genomes maintain similar gene content, showing minor variation across the genes located between the PT junctions. One exception is a major variation related to different patterns in the presence and absence of ndh genes in the small single copy region, which appears to follow the phylogenetic history of the species rather than their lifestyle. All MT genomes exhibit similar gene content, with most differences related to lineage-specific pseudogenes. We find evidence for episodic positive diversifying selection in PT genes and in most of the Utricularia MT genes. This may be related to the current hypothesis that bladderworts' nuclear DNA is under constant ROS oxidative DNA damage and subject to unusual DNA repair mechanisms, or even low-fidelity polymerases that bypass lesions, which could also be affecting the organellar genomes. Finally, both PT and MT phylogenetic trees were well resolved and highly supported, providing a congruent phylogenomic hypothesis for the Utricularia and Genlisea clade given the study sampling.
Affiliation(s)
- Saura R Silva
  - UNESP - São Paulo State University, School of Agricultural and Veterinarian Sciences, Department of Agricultural and Environmental Biotechnology, Campus Jaboticabal, CEP 14884-900 SP, Brazil.
- Vitor F O Miranda
  - UNESP - São Paulo State University, School of Agricultural and Veterinarian Sciences, Department of Biology, Laboratory of Plant Systematics, Campus Jaboticabal, CEP 14884-900 SP, Brazil.
- Todd P Michael
  - Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA.
- Bartosz J Płachno
  - Department of Plant Cytology and Embryology, Institute of Botany, Faculty of Biology, Jagiellonian University in Kraków, Gronostajowa 9 St., 30-387 Cracow, Poland.
- Ramon G Matos
  - UNESP - São Paulo State University, School of Agricultural and Veterinarian Sciences, Department of Biology, Laboratory of Plant Systematics, Campus Jaboticabal, CEP 14884-900 SP, Brazil.
- Lubomir Adamec
  - Department of Experimental and Functional Morphology, Institute of Botany CAS, Dukelská 135, CZ-379 01 Třeboň, Czech Republic.
- Sergei L K Pond
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA.
- Alexander G Lucaci
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA.
- Daniel G Pinheiro
  - UNESP - São Paulo State University, School of Agricultural and Veterinarian Sciences, Department of Agricultural and Environmental Biotechnology, Campus Jaboticabal, CEP 14884-900 SP, Brazil.
- Alessandro M Varani
  - UNESP - São Paulo State University, School of Agricultural and Veterinarian Sciences, Department of Agricultural and Environmental Biotechnology, Campus Jaboticabal, CEP 14884-900 SP, Brazil.

7
Gupta MK, Vadde R. Next-generation development and application of codon model in evolution. Front Genet 2023; 14:1091575. [PMID: 36777719 PMCID: PMC9911445 DOI: 10.3389/fgene.2023.1091575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 01/17/2023] [Indexed: 01/28/2023] [Open access]
Abstract
To date, numerous nucleotide, amino acid, and codon substitution models have been developed to estimate the evolutionary history of any sequence/organism in a more comprehensive way. Out of these three, the codon substitution model is the most powerful. These models have been utilized extensively to detect selective pressure on a protein and codon usage bias, and for ancestral and phylogenetic reconstruction. However, because they are more computationally demanding than nucleotide and amino acid substitution models, only a few studies have employed codon substitution models to understand the heterogeneity of the evolutionary process in genome-scale analyses. Hence, there is always a question of how to develop more robust but less computationally demanding codon substitution models to get more accurate results. In this review article, the authors attempt to understand the basis of the development of different types of codon substitution models and how this information can be utilized to develop more robust but less computationally demanding codon substitution models. The codon substitution model makes it possible to detect the selection regime under which any gene or gene region is evolving, codon usage bias in any organism or tissue-specific region, and phylogenetic relationships between different lineages more accurately than nucleotide and amino acid substitution models. Thus, in the near future, these codon models can be utilized in the fields of conservation, breeding and medicine.

8
Wang Y, Zhang L, Zhou Y, Ma W, Li M, Guo P, Feng L, Fu C. Using landscape genomics to assess local adaptation and genomic vulnerability of a perennial herb Tetrastigma hemsleyanum (Vitaceae) in subtropical China. Front Genet 2023; 14:1150704. [PMID: 37144128 PMCID: PMC10151583 DOI: 10.3389/fgene.2023.1150704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 04/04/2023] [Indexed: 05/06/2023] [Open access]
Abstract
Understanding the adaptive genetic variation of plant populations and their vulnerability to climate change is critical for preserving biodiversity and guiding subsequent management interventions. To this end, landscape genomics may represent a cost-efficient approach for investigating molecular signatures underlying local adaptation. Tetrastigma hemsleyanum is, in its native habitat, a widespread perennial herb of warm-temperate evergreen forest in subtropical China. Its ecological and medicinal values constitute a significant revenue for local human populations and ecosystem. Using 30,252 single nucleotide polymorphisms (SNPs) derived from reduced-representation genome sequencing in 156 samples from 24 sites, we conducted a landscape genomics study of T. hemsleyanum to elucidate its genomic variation across multiple climate gradients and its genomic vulnerability to future climate change. Multivariate methods showed that climatic variation explained more genomic variation than geographical distance did, which implies that local adaptation to a heterogeneous environment might represent an important source of genomic variation. Among these climate variables, winter precipitation was the strongest predictor of the contemporary genetic structure. F_ST outlier tests and environment association analysis together identified 275 candidate adaptive SNPs along the genetic and environmental gradients. SNP annotations of these putatively adaptive loci uncovered gene functions associated with modulating flowering time and regulating plant response to abiotic stresses, which have implications for breeding and other special agricultural aims on the basis of these selection signatures. Critically, modelling revealed high genomic vulnerability of our focal species, through a mismatch between current and future genotype-environment relationships, in the central-northern region of T. hemsleyanum's range, where populations require proactive management efforts such as assisted adaptation to cope with ongoing climate change. Taken together, our results provide robust evidence of local climate adaptation for T. hemsleyanum and further deepen our understanding of the basis of adaptation of herbs in subtropical China.
Affiliation(s)
- Yihan Wang
  - College of Life Sciences, Henan Agricultural University, Zhengzhou, China
  - Henan Engineering Research Center for Osmanthus Germplasm Innovation and Resource Utilization, Henan Agricultural University, Zhengzhou, China
- Lin Zhang
  - Henan Engineering Research Center for Osmanthus Germplasm Innovation and Resource Utilization, Henan Agricultural University, Zhengzhou, China
  - College of Landscape Architecture and Art, Henan Agricultural University, Zhengzhou, China
- Yuchao Zhou
  - College of Life Sciences, Henan Agricultural University, Zhengzhou, China
  - Henan Engineering Research Center for Osmanthus Germplasm Innovation and Resource Utilization, Henan Agricultural University, Zhengzhou, China
- Wenxin Ma
  - College of Life Sciences, Henan Agricultural University, Zhengzhou, China
  - Henan Engineering Research Center for Osmanthus Germplasm Innovation and Resource Utilization, Henan Agricultural University, Zhengzhou, China
- Manyu Li
  - College of Life Sciences, Henan Agricultural University, Zhengzhou, China
  - Henan Engineering Research Center for Osmanthus Germplasm Innovation and Resource Utilization, Henan Agricultural University, Zhengzhou, China
- Peng Guo
  - College of Life Sciences, Henan Agricultural University, Zhengzhou, China
  - Henan Engineering Research Center for Osmanthus Germplasm Innovation and Resource Utilization, Henan Agricultural University, Zhengzhou, China
  - *Correspondence: Peng Guo; Li Feng
- Li Feng
  - School of Pharmacy, Xi’an Jiaotong University, Xi’an, China
  - *Correspondence: Peng Guo; Li Feng
- Chengxin Fu
  - Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, College of Life Sciences, Zhejiang University, Hangzhou, China

9
Parada-Márquez JF, Maldonado-Rodriguez ND, Triana-Fonseca P, Contreras-Bravo NC, Calderón-Ospina CA, Restrepo CM, Morel A, Ortega-Recalde OJ, Silgado-Guzmán DF, Angulo-Aguado M, Fonseca-Mendoza DJ. Pharmacogenomic profile of actionable molecular variants related to drugs commonly used in anesthesia: WES analysis reveals new mutations. Front Pharmacol 2023; 14:1047854. [PMID: 37021041 PMCID: PMC10069477 DOI: 10.3389/fphar.2023.1047854] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/18/2022] [Accepted: 03/06/2023] [Indexed: 04/07/2023] [Open access]
Abstract
Background: Genetic interindividual variability is associated with adverse drug reactions (ADRs) and affects the response to common drugs used in anesthesia. Despite their importance, these variants remain largely underexplored in Latin-American countries. This study describes rare and common variants found in genes related to the metabolism of analgesic and anaesthetic drugs in the Colombian population. Methods: We conducted a study that included 625 healthy Colombian individuals. We generated a subset of 14 genes implicated in metabolic pathways of common medications used in anesthesia and assessed them by whole-exome sequencing (WES). Variants were filtered using two pipelines: A) novel or rare (minor allele frequency, MAF, <1%) variants including missense, loss-of-function (LoF, e.g., frameshift, nonsense), and splice site variants with potential deleterious effect and B) clinically validated variants described in the PharmGKB (categories 1, 2 and 3) and/or ClinVar databases. For rare and novel missense variants, we applied an optimized prediction framework (OPF) to assess the functional impact of pharmacogenetic variants. Allelic and genotypic frequencies and Hardy-Weinberg equilibrium were calculated. We compared our allelic frequencies with those from populations described in the gnomAD database. Results: Our study identified 148 molecular variants potentially related to variability in the therapeutic response to 14 drugs commonly used in anesthesiology. Of these, 83.1% correspond to rare and novel missense variants classified as pathogenic according to the pharmacogenetic optimized prediction framework, 5.4% were loss-of-function (LoF), 2.7% led to potential splicing alterations and 8.8% were assigned as actionable or informative pharmacogenetic variants. Novel variants were confirmed by Sanger sequencing. Allelic frequency comparison showed that the Colombian population has a unique pharmacogenomic profile for anesthesia drugs with some allele frequencies different from other populations. Conclusion: Our results demonstrated high allelic heterogeneity among the analyzed samples, enriched by rare (91.2%) variants in pharmacogenes related to common drugs used in anesthesia. The clinical implications of these results highlight the importance of integrating next-generation sequencing data into pharmacogenomic approaches and personalized medicine.
Affiliation(s)
- Paula Triana-Fonseca
  - Department of Molecular Diagnosis, Genética Molecular de Colombia SAS, Bogotá, Colombia
- Nora Constanza Contreras-Bravo
  - School of Medicine and Health Sciences, Center for Research in Genetics and Genomics (CIGGUR), Institute of Translational Medicine (IMT), Universidad Del Rosario, Bogotá, Colombia
- Carlos Alberto Calderón-Ospina
  - School of Medicine and Health Sciences, Center for Research in Genetics and Genomics (CIGGUR), Institute of Translational Medicine (IMT), Universidad Del Rosario, Bogotá, Colombia
- Carlos M. Restrepo
  - School of Medicine and Health Sciences, Center for Research in Genetics and Genomics (CIGGUR), Institute of Translational Medicine (IMT), Universidad Del Rosario, Bogotá, Colombia
- Adrien Morel
  - School of Medicine and Health Sciences, Center for Research in Genetics and Genomics (CIGGUR), Institute of Translational Medicine (IMT), Universidad Del Rosario, Bogotá, Colombia
- Oscar Javier Ortega-Recalde
  - School of Medicine and Health Sciences, Center for Research in Genetics and Genomics (CIGGUR), Institute of Translational Medicine (IMT), Universidad Del Rosario, Bogotá, Colombia
- Mariana Angulo-Aguado
  - School of Medicine and Health Sciences, Center for Research in Genetics and Genomics (CIGGUR), Institute of Translational Medicine (IMT), Universidad Del Rosario, Bogotá, Colombia
  - *Correspondence: Mariana Angulo-Aguado; Dora Janeth Fonseca-Mendoza
- Dora Janeth Fonseca-Mendoza
  - School of Medicine and Health Sciences, Center for Research in Genetics and Genomics (CIGGUR), Institute of Translational Medicine (IMT), Universidad Del Rosario, Bogotá, Colombia
  - *Correspondence: Mariana Angulo-Aguado; Dora Janeth Fonseca-Mendoza

10
Patton DL, Cardenas T, Mele P, Navarro J, Sung W. CDMAP/CDVIS: context-dependent mutation analysis package and visualization software. G3 (Bethesda) 2022; 13:6887836. [PMID: 36917690 PMCID: PMC10085751 DOI: 10.1093/g3journal/jkac299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 10/17/2022] [Indexed: 12/15/2022]
Abstract
The Context-dependent Mutation Analysis Package and Visualization Software (CDMAP/CDVIS) is an automated, modular toolkit used for the analysis and visualization of context-dependent mutation patterns (site-specific variation in mutation rate from neighboring-nucleotide effects). CDMAP computes context-dependent mutation rates using a Variant Call Format (VCF) file, a GenBank file, and a reference genome, and can generate high-resolution figures to analyze variation in mutation rate across spatiotemporal scales. This algorithm has been benchmarked against mutation accumulation data but can also be used to calculate context-dependent mutation rates for polymorphism data or closely related species as long as the input requirements are met. Output from CDMAP can be integrated into CDVIS, an interactive database for visualizing mutation patterns across multiple taxa simultaneously.
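As a rough illustration of what a context-dependent mutation rate is, the sketch below counts trinucleotide contexts in a reference sequence and divides the number of mutations observed at each context by the number of available sites, generations, and lines. It is not CDMAP's code or interface: the function names, the 0-based position list, and the simple per-site normalization are all assumptions made for illustration.

```python
from collections import Counter


def context_counts(reference: str, k: int = 3) -> Counter:
    """Count every unambiguous k-mer (focal site plus flanking bases)."""
    ref = reference.upper()
    return Counter(
        ref[i:i + k]
        for i in range(len(ref) - k + 1)
        if set(ref[i:i + k]) <= set("ACGT")
    )


def context_mutation_rates(reference, mutation_positions, generations, n_lines):
    """Hypothetical helper: rate per trinucleotide context.

    rate = mutations at context / (sites with context * generations * lines)
    mutation_positions are 0-based indices of the mutated (middle) base.
    """
    ref = reference.upper()
    sites = context_counts(ref, k=3)
    hits = Counter()
    for pos in mutation_positions:
        if 1 <= pos < len(ref) - 1:          # need both flanking bases
            ctx = ref[pos - 1:pos + 2]
            if ctx in sites:                 # skip contexts with ambiguous bases
                hits[ctx] += 1
    denom = generations * n_lines
    return {ctx: hits[ctx] / (n * denom) for ctx, n in sites.items()}


# Toy data only: two mutations, both at the middle base of an "ACG" context.
rates = context_mutation_rates("ACGTACGT", [1, 5], generations=10, n_lines=1)
print(rates)
```

A real analysis would read positions from a VCF and restrict the denominator to callable sites; both steps are omitted here.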
Affiliation(s)
- David L Patton
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC, 28223, USA
- Thomas Cardenas
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC, 28223, USA
- Perrin Mele
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC, 28223, USA
- Jon Navarro
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC, 28223, USA
- Way Sung
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC, 28223, USA

11
Hasan AR, Lachapelle J, El-Shawa SA, Potjewyd R, Ford SA, Ness RW. Salt stress alters the spectrum of de novo mutation available to selection during experimental adaptation of Chlamydomonas reinhardtii. Evolution 2022; 76:2450-2463. [PMID: 36036481 DOI: 10.1111/evo.14604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 08/12/2022] [Indexed: 01/22/2023]
Abstract
The genetic basis of adaptation is driven by both selection and the spectrum of available mutations. Given that the rate of mutation is not uniformly distributed across the genome and varies depending on the environment, understanding the signatures of selection across the genome is aided by first establishing what the expectations of genetic change are from mutation. To determine the interaction between salt stress, selection, and mutation across the genome, we compared mutations observed in a selection experiment for salt tolerance in Chlamydomonas reinhardtii to those observed in mutation accumulation (MA) experiments with and without salt exposure. MA lines evolved under salt stress had a single-nucleotide mutation rate of $1.1 \times 10^{-9}$, similar to that of MA lines under standard conditions ($9.6 \times 10^{-10}$). However, we found that salt stress led to an increased rate of indel mutations, but that many of these mutations were removed under selection. Finally, lines adapted to salt also showed excess clustering of mutations in the genome and the co-expression network, suggesting a role for positive selection in retaining mutations in particular compartments of the genome during the evolution of salt tolerance. Our study shows that characterizing mutation rates and spectra expected under stress helps disentangle the effects of environment and selection during adaptation.
Affiliation(s)
- Ahmed R Hasan
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada; Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Josianne Lachapelle
- Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Sara A El-Shawa
- Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada; Department of Mathematical and Computational Sciences, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Roman Potjewyd
- Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Scott A Ford
- Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Rob W Ness
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada; Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada; Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, M5S 3B2, Canada
12
Belinky F, Bykova A, Yurchenko V, Rogozin IB. No evidence for widespread positive selection on double substitutions within codons in primates and yeasts. Front Genet 2022; 13:991249. [PMID: 36159983 PMCID: PMC9500374 DOI: 10.3389/fgene.2022.991249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 08/29/2022] [Indexed: 11/13/2022]
Abstract
Nucleotide substitutions in protein-coding genes can be divided into synonymous (S) and non-synonymous (N) ones that alter amino acids (including nonsense mutations causing stop codons). The S substitutions are expected to have little effect on function. The N substitutions are almost always affected by strong purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases can modulate the deleterious effect of single N substitutions and, thus, could be subject to positive selection. This effect has been demonstrated for mutations in the serine codons, stop codons and double N substitutions in prokaryotes. In all of the above-mentioned cases, a novel technique was applied that elucidates the effects of selection on double substitutions while accounting for mutational biases. Here, we applied the same technique to study double N substitutions in eukaryotic lineages of primates and yeast. We identified markedly fewer cases of purifying selection relative to prokaryotes and no evidence of codon double substitutions under positive selection. This is consistent with previous studies of serine codons in primates and yeast. In general, the results strongly suggest that there are major differences between the studied prokaryotes and eukaryotes; double substitutions in primates and yeasts largely reflect mutational biases and are not hallmarks of selection. This is especially important in the context of detection of positive selection in codons because it has been suggested that multiple mutations in codons cause false inferences of lineage-specific site positive selection. It is likely that this concern is applicable to previously studied prokaryotes but not to primates and yeasts where markedly fewer double substitutions are affected by positive selection.
Affiliation(s)
- Frida Belinky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Anastassia Bykova
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
- Vyacheslav Yurchenko
- Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
- *Correspondence: Vyacheslav Yurchenko; Igor B. Rogozin
- Igor B. Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
13
Löytynoja A. Thousands of human mutation clusters are explained by short-range template switching. Genome Res 2022; 32:1437-1447. [PMID: 35760560 PMCID: PMC9435742 DOI: 10.1101/gr.276478.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Accepted: 06/21/2022] [Indexed: 02/03/2023]
Abstract
Variation within human genomes is unevenly distributed, and variants show spatial clustering. DNA replication-related template switching is a poorly known mutational mechanism capable of causing major chromosomal rearrangements as well as creating short inverted sequence copies that appear as local mutation clusters in sequence comparisons. In this study, haplotype-resolved genome assemblies representing 25 human populations and multinucleotide variants aggregated from 140,000 human sequencing experiments were reanalyzed. Local template switching could explain thousands of complex mutation clusters across the human genome, the loci segregating within and between populations. During the study, computational tools were developed for identification of template switch events using both short-read sequencing data and genotype data, and for genotyping candidate loci using short-read data. The characteristics of template-switch mutations complicate their detection, and widely used analysis pipelines for short-read sequencing data, normally capable of identifying single nucleotide changes, were found to miss template-switch mutations of tens of base pairs, potentially invalidating medical genetic studies searching for a causative allele behind genetic diseases. Combined with the massive sequencing data now available for humans, the novel tools described here enable building catalogs of affected loci and studying the cellular mechanisms behind template switching in both healthy organisms and disease.
Affiliation(s)
- Ari Löytynoja
- Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
14
Matsen FA, Ralph PL. Enabling Inference for Context-Dependent Models of Mutation by Bounding the Propagation of Dependency. J Comput Biol 2022; 29:802-824. [PMID: 35776513 PMCID: PMC9419934 DOI: 10.1089/cmb.2021.0644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Although the rates at which positions in the genome mutate are known to depend not only on the nucleotide to be mutated, but also on neighboring nucleotides, it remains challenging to do phylogenetic inference using models of context-dependent mutation. In these models, the effects of one mutation may in principle propagate to faraway locations, making it difficult to compute exact likelihoods. This article shows how to use bounds on the propagation of dependency to compute likelihoods of mutation of a given segment of genome by marginalizing over sufficiently long flanking sequence. This can be used for maximum likelihood or Bayesian inference. Protocols examining residuals and iterative model refinement are also discussed. Tools for efficiently working with these models are provided in an R package, which could be used in other applications. The method is used to examine context dependence of mutations since the common ancestor of humans and chimpanzee.
Affiliation(s)
- Frederick A. Matsen
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Department of Statistics, University of Washington, Seattle, Washington, USA
- Howard Hughes Medical Institute, Chevy Chase, Maryland, USA
- Peter L. Ralph
- Departments of Biology and Mathematics, Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, USA
15
Chen DS, Clark AG, Wolfner MF. Octopaminergic/tyraminergic Tdc2 neurons regulate biased sperm usage in female Drosophila melanogaster. Genetics 2022; 221:6613932. [PMID: 35736370 DOI: 10.1093/genetics/iyac097] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/25/2021] [Accepted: 06/04/2022] [Indexed: 11/14/2022]
Abstract
In polyandrous internally fertilizing species, a multiply-mated female can use stored sperm from different males in a biased manner to fertilize her eggs. The female's ability to assess sperm quality and compatibility is essential for her reproductive success, and represents an important aspect of postcopulatory sexual selection. In Drosophila melanogaster, previous studies demonstrated that the female nervous system plays an active role in influencing progeny paternity proportion, and suggested a role for octopaminergic/tyraminergic Tdc2 neurons in this process. Here, we report that inhibiting Tdc2 neuronal activity causes females to produce a higher-than-normal proportion of first-male progeny. This difference is not due to differences in sperm storage or release, but instead is attributable to the suppression of second-male sperm usage bias that normally occurs in control females. We further show that a subset of Tdc2 neurons innervating the female reproductive tract is largely responsible for the progeny proportion phenotype that is observed when Tdc2 neurons are inhibited globally. On the contrary, overactivation of Tdc2 neurons does not further affect sperm storage and release or progeny proportion. These results suggest that octopaminergic/tyraminergic signaling allows a multiply-mated female to bias sperm usage, and identify a new role for the female nervous system in postcopulatory sexual selection.
Affiliation(s)
- Dawn S Chen
- Department of Molecular Biology and Genetics, Cornell University, Ithaca NY 14853, USA
- Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca NY 14853, USA
- Mariana F Wolfner
- Department of Molecular Biology and Genetics, Cornell University, Ithaca NY 14853, USA
16
Póti Á, Szikriszt B, Gervai JZ, Chen D, Szüts D. Characterisation of the spectrum and genetic dependence of collateral mutations induced by translesion DNA synthesis. PLoS Genet 2022; 18:e1010051. [PMID: 35130276 PMCID: PMC8870599 DOI: 10.1371/journal.pgen.1010051] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/06/2021] [Revised: 02/24/2022] [Accepted: 01/21/2022] [Indexed: 11/18/2022]
Abstract
Translesion DNA synthesis (TLS) is a fundamental damage bypass pathway that utilises specialised polymerases with relaxed template specificity to achieve replication through damaged DNA. Misinsertions by low fidelity TLS polymerases may introduce additional mutations on undamaged DNA near the original lesion site, which we termed collateral mutations. In this study, we used whole genome sequencing datasets of chicken DT40 and several human cell lines to obtain evidence for collateral mutagenesis in higher eukaryotes. We found that cisplatin and UVC radiation frequently induce close mutation pairs within 25 base pairs that consist of an adduct-associated primary and a downstream collateral mutation, and genetically linked their formation to TLS activity involving PCNA ubiquitylation and polymerase κ. PCNA ubiquitylation was also indispensable for close mutation pairs observed amongst spontaneously arising base substitutions in cell lines with disrupted homologous recombination. Collateral mutation pairs were also found in melanoma genomes with evidence of UV exposure. We showed that collateral mutations frequently copy the upstream base, and extracted a base substitution signature that describes collateral mutagenesis in the presented dataset regardless of the primary mutagenic process. Using this mutation signature, we showed that collateral mutagenesis creates approximately 10–20% of non-paired substitutions as well, underscoring the importance of the process. DNA base substitutions are the most common form of genomic mutations, formed both spontaneously and in response to environmental mutagens. One of the main mechanisms of base substitution mutagenesis is translesion synthesis, a process that relies on specialised DNA polymerases to replicate damaged DNA templates. 
In addition to incorrect base insertions at the site of lesions in the template, translesion polymerases may also generate ‘collateral’ mutations away from the lesion due to their lower accuracy in selecting the correct incoming nucleotide. In this study, we surveyed the whole genome sequence of experimental cell clones to examine the extent and genetic dependence of collateral mutagenesis in higher eukaryotes. Looking for close mutation pairs, we found that collateral mutations frequently occur near primary lesions generated by cisplatin or ultraviolet radiation in chicken and human cells, but are restricted to a short distance of approximately 25 base pairs. By analysing their sequence context, we showed that collateral mutations can also occur near correctly bypassed primary lesions and may be responsible for a considerable proportion of all base substitution mutations.
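The search for close mutation pairs described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `close_mutation_pairs`, the `(chromosome, position)` input format, and the choice to compare only neighbouring mutations are assumptions made for illustration. The 25 bp window is the approximate distance reported above.

```python
from collections import defaultdict


def close_mutation_pairs(mutations, max_distance=25):
    """Return neighbouring mutation pairs that lie within max_distance bp.

    `mutations` is an iterable of (chromosome, position) tuples; this input
    format is a hypothetical simplification of a variant call set.
    """
    positions_by_chrom = defaultdict(list)
    for chrom, pos in mutations:
        positions_by_chrom[chrom].append(pos)

    pairs = []
    for chrom in sorted(positions_by_chrom):
        positions = sorted(positions_by_chrom[chrom])
        # Compare each mutation with its nearest downstream neighbour only.
        for left, right in zip(positions, positions[1:]):
            if 0 < right - left <= max_distance:
                pairs.append((chrom, left, right))
    return pairs


if __name__ == "__main__":
    calls = [("chr1", 100), ("chr1", 110), ("chr1", 200), ("chr2", 105), ("chr2", 131)]
    print(close_mutation_pairs(calls))
```

A real analysis would also need to track which member of each pair is the primary, adduct-associated mutation and which is the collateral one; the sketch only locates candidate pairs.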
Affiliation(s)
- Ádám Póti
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Bernadett Szikriszt
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Dan Chen
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Dávid Szüts
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
17
Yao Y, Sun K, Yang Q, Zhou Z, Shao C, Qian X, Tang Q, Xie J. Assessing Autosomal InDel Loci With Multiple Insertions or Deletions of Random DNA Sequences in Human Genome. Front Genet 2022; 12:809815. [PMID: 35178073 PMCID: PMC8844376 DOI: 10.3389/fgene.2021.809815] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/05/2021] [Accepted: 12/27/2021] [Indexed: 11/13/2022]
Abstract
Multiple mutational events of insertion/deletion occurring at or around InDel sites could form multi-allelic InDels and multi-InDels (abbreviated as MM-InDels), while InDels with random DNA sequences could imply a unique mutation event at these loci. In this study, preliminary investigation of MM-InDels with random sequences was conducted using high-throughput phased data from the 1000 Genomes Project. A total of 3,599 multi-allelic InDels and 6,375 multi-InDels were filtered with multiple alleles. A vast majority of the obtained MM-InDels (85.59%) presented 3 alleles, which implies that only one secondary insertion or deletion mutation event occurred at these loci. The more frequent presence of two adjacent InDel loci was observed within 20 bp. MM-InDels with random sequences presented an uneven distribution across the genome and showed a correlation with InDels, SNPs, recombination rate, and GC content. The average allelic frequencies and prevalence of multi-allelic InDels and multi-InDels presented similar distribution patterns in different populations. Altogether, MM-InDels with random sequences can provide useful information for population resolution.
Affiliation(s)
- Yining Yao
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Kuan Sun
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Qinrui Yang
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Zhihan Zhou
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Chengchen Shao
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Xiaoqin Qian
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Qiqun Tang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fudan University, Shanghai, China
- Jianhui Xie
- Department of Forensic Medicine, School of Basic Medical Sciences, Fudan University, Shanghai, China
18
Sepúlveda-Yáñez JH, Alvarez Saravia D, Pilzecker B, van Schouwenburg PA, van den Burg M, Veelken H, Navarrete MA, Jacobs H, Koning MT. Tandem Substitutions in Somatic Hypermutation. Front Immunol 2022; 12:807015. [PMID: 35069591 PMCID: PMC8781386 DOI: 10.3389/fimmu.2021.807015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Accepted: 12/16/2021] [Indexed: 11/13/2022]
Abstract
Upon antigen recognition, activation-induced cytosine deaminase initiates affinity maturation of the B-cell receptor by somatic hypermutation (SHM) through error-prone DNA repair pathways. SHM typically creates single nucleotide substitutions, but tandem substitutions may also occur. We investigated incidence and sequence context of tandem substitutions by massively parallel sequencing of V(D)J repertoires in healthy human donors. Mutation patterns were congruent with SHM-derived single nucleotide mutations, delineating initiation of the tandem substitution by AID. Tandem substitutions comprised 5.7% of AID-induced mutations. The majority of tandem substitutions represent single nucleotide juxtalocations of directly adjacent sequences. These observations were confirmed in an independent cohort of healthy donors. We propose a model where tandem substitutions are predominantly generated by translesion synthesis across an apyrimidinic site that is typically created by UNG. During replication, apyrimidinic sites transiently adopt an extruded configuration, causing skipping of the extruded base. Consequent strand decontraction leads to the juxtalocation, after which exonucleases repair the apyrimidinic site and any directly adjacent mismatched base pairs. The mismatch repair pathway appears to account for the remainder of tandem substitutions. Tandem substitutions may enhance affinity maturation and expedite the adaptive immune response by overcoming amino acid codon degeneracies or mutating two adjacent amino acid residues simultaneously.
Affiliation(s)
- Julieta H Sepúlveda-Yáñez
- Department of Hematology, Leiden University Medical Center, Leiden, Netherlands
- School of Medicine, University of Magallanes, Punta Arenas, Chile
- Bas Pilzecker
- Department of Tumor Immunology, Radboud Institute for Molecular Life Sciences, Nijmegen, Netherlands
- Division of Tumor Biology and Immunology, Netherlands Cancer Institute, Amsterdam, Netherlands
- Mirjam van den Burg
- Department of Pediatrics, Leiden University Medical Center, Leiden, Netherlands
- Hendrik Veelken
- Department of Hematology, Leiden University Medical Center, Leiden, Netherlands
- Heinz Jacobs
- Division of Tumor Biology and Immunology, Netherlands Cancer Institute, Amsterdam, Netherlands
- Marvyn T Koning
- Department of Hematology, Leiden University Medical Center, Leiden, Netherlands
19
Lu K, Hsiao YC, Liu CW, Schoeny R, Gentry R, Starr TB. A Review of Stable Isotope Labeling and Mass Spectrometry Methods to Distinguish Exogenous from Endogenous DNA Adducts and Improve Dose-Response Assessments. Chem Res Toxicol 2021; 35:7-29. [PMID: 34910474 DOI: 10.1021/acs.chemrestox.1c00212] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
Abstract
Cancer remains the second most frequent cause of death in human populations worldwide, which has been reflected in the emphasis placed on management of risk from environmental chemicals considered to be potential human carcinogens. The formation of DNA adducts has been considered as one of the key events of cancer, and persistence and/or failure of repair of these adducts may lead to mutation, thus initiating cancer. Some chemical carcinogens can produce DNA adducts, and DNA adducts have been used as biomarkers of exposure. However, DNA adducts of various types are also produced endogenously in the course of normal metabolism. Since both endogenous physiological processes and exogenous exposure to xenobiotics can cause DNA adducts, the differentiation of the sources of DNA adducts can be highly informative for cancer risk assessment. This review summarizes a highly applicable methodology, termed stable isotope labeling and mass spectrometry (SILMS), that is superior to previous methods, as it not only provides absolute quantitation of DNA adducts but also differentiates the exogenous and endogenous origins of DNA adducts. SILMS uses stable isotope-labeled substances for exposure, followed by DNA adduct measurement with highly sensitive mass spectrometry. Herein, the utilities and advantage of SILMS have been demonstrated by the rich data sets generated over the last two decades in improving the risk assessment of chemicals with DNA adducts being induced by both endogenous and exogenous sources, such as formaldehyde, vinyl acetate, vinyl chloride, and ethylene oxide.
Affiliation(s)
- Kun Lu
- Department of Environmental Sciences and Engineering, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Yun-Chung Hsiao
- Department of Environmental Sciences and Engineering, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Chih-Wei Liu
- Department of Environmental Sciences and Engineering, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Rita Schoeny
- Rita Schoeny LLC, 726 Fifth Street NE, Washington, D.C. 20002, United States
- Robinan Gentry
- Ramboll US Consulting, Inc., Monroe, Louisiana 71201, United States
- Thomas B Starr
- Department of Environmental Sciences and Engineering, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States; TBS Associates, 7500 Rainwater Road, Raleigh, North Carolina 27615, United States
20
Zverinova S, Guryev V. Variant calling: Considerations, practices, and developments. Hum Mutat 2021; 43:976-985. [PMID: 34882898 PMCID: PMC9545713 DOI: 10.1002/humu.24311] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/25/2021] [Revised: 11/02/2021] [Accepted: 12/03/2021] [Indexed: 11/10/2022]
Abstract
The success of many clinical, association, or population genetics studies critically relies on a properly performed variant calling step. The variety of modern genomics protocols, techniques, and platforms makes the choice of methods and algorithms difficult, and there is no "one size fits all" solution for study design and data analysis. In this review, we discuss considerations that need to be taken into account while designing the study and preparing for the experiments. We outline the variety of variant types that can be detected using sequencing approaches and highlight some specific requirements and basic principles of their detection. Finally, we cover interesting developments that enable variant calling for a broad range of applications in the genomics field. We conclude by discussing technological and algorithmic advances that have the potential to change the ways of calling DNA variants in the near future.
Affiliation(s)
- Stepanka Zverinova
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, The Netherlands
- Victor Guryev
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, The Netherlands
21
Protein innovation through template switching in the Saccharomyces cerevisiae lineage. Sci Rep 2021; 11:22558. [PMID: 34799587 PMCID: PMC8604942 DOI: 10.1038/s41598-021-01736-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2021] [Accepted: 10/27/2021] [Indexed: 11/08/2022]
Abstract
DNA polymerase template switching between short, non-identical inverted repeats (IRs) is a genetic mechanism that leads to the homogenization of IR arms and to IR spacer inversion, which cause multinucleotide mutations (MNMs). It is unknown if and how template switching affects gene evolution. In this study, we performed a phylogenetic analysis to determine the effect of template switching between IR arms on coding DNA of Saccharomyces cerevisiae. To achieve this, perfect IRs that co-occurred with MNMs between a strain and its parental node were identified in S. cerevisiae strains. We determined that template switching introduced MNMs into 39 protein-coding genes through S. cerevisiae evolution, resulting in both arm homogenization and inversion of the IR spacer. These events in turn resulted in nonsynonymous substitutions and up to five neighboring amino acid replacements in a single gene. The study demonstrates that template switching is a powerful generator of multiple substitutions within codons. Additionally, some template switching events occurred more than once during S. cerevisiae evolution. Our findings suggest that template switching constitutes a general mutagenic mechanism that results in both nonsynonymous substitutions and parallel evolution, which are traditionally considered as evidence for positive selection, without the need for adaptive explanations.
22
Jiang P, Ollodart AR, Sudhesh V, Herr AJ, Dunham MJ, Harris K. A modified fluctuation assay reveals a natural mutator phenotype that drives mutation spectrum variation within Saccharomyces cerevisiae. eLife 2021; 10:68285. [PMID: 34523420 PMCID: PMC8497059 DOI: 10.7554/elife.68285] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 03/11/2021] [Accepted: 09/14/2021] [Indexed: 12/23/2022]
Abstract
Although studies of Saccharomyces cerevisiae have provided many insights into mutagenesis and DNA repair, most of this work has focused on a few laboratory strains. Much less is known about the phenotypic effects of natural variation within S. cerevisiae’s DNA repair pathways. Here, we use natural polymorphisms to detect historical mutation spectrum differences among several wild and domesticated S. cerevisiae strains. To determine whether these differences are likely caused by genetic mutation rate modifiers, we use a modified fluctuation assay with a CAN1 reporter to measure de novo mutation rates and spectra in 16 of the analyzed strains. We measure a 10-fold range of mutation rates and identify two strains with distinctive mutation spectra. These strains, known as AEQ and AAR, come from the panel’s ‘Mosaic beer’ clade and share an enrichment for C > A mutations that is also observed in rare variation segregating throughout the genomes of several Mosaic beer and Mixed origin strains. Both AEQ and AAR are haploid derivatives of the diploid natural isolate CBS 1782, whose rare polymorphisms are enriched for C > A as well, suggesting that the underlying mutator allele is likely active in nature. We use a plasmid complementation test to show that AAR and AEQ share a mutator allele in the DNA repair gene OGG1, which excises 8-oxoguanine lesions that can cause C > A mutations if left unrepaired.
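As background for how a fluctuation assay yields a mutation rate, the sketch below implements the classical Luria-Delbrück p0 estimator. It is a textbook method shown for illustration, not the modified assay or the estimator used in this study: the expected number of mutation events per culture is taken as m = -ln(p0), where p0 is the fraction of parallel cultures with no resistant mutants. The function name and all input values are hypothetical.

```python
import math


def p0_mutation_rate(n_cultures, n_without_mutants, cells_per_culture):
    """Estimate the mutation rate per cell division with the p0 method.

    Under a Poisson model, the fraction of cultures without mutants is
    p0 = exp(-m), so m = -ln(p0) mutation events per culture; dividing by
    the final cell count gives an approximate per-division rate.
    """
    if not 0 < n_without_mutants <= n_cultures:
        raise ValueError("need at least one mutant-free culture and p0 <= 1")
    if cells_per_culture <= 0:
        raise ValueError("cells_per_culture must be positive")
    p0 = n_without_mutants / n_cultures
    events_per_culture = -math.log(p0)
    return events_per_culture / cells_per_culture


if __name__ == "__main__":
    # Hypothetical numbers: 48 parallel cultures, 30 with no mutant colonies,
    # about 2e7 cells per culture at plating.
    print(p0_mutation_rate(48, 30, 2e7))
```

The estimator only works when some cultures contain no mutants, which is why it suits low-rate reporters; other estimators use the full distribution of mutant counts across cultures.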
Affiliation(s)
- Pengyao Jiang
- Department of Genome Sciences, University of Washington, Seattle, United States
- Anja R Ollodart
- Department of Genome Sciences, University of Washington, Seattle, United States; Molecular and Cellular Biology Program, University of Washington, Seattle, United States
- Vidha Sudhesh
- Department of Genome Sciences, University of Washington, Seattle, United States
- Alan J Herr
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, United States
- Maitreya J Dunham
- Department of Genome Sciences, University of Washington, Seattle, United States
- Kelley Harris
- Department of Genome Sciences, University of Washington, Seattle, United States; Department of Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, United States
23
Bohutínská M, Handrick V, Yant L, Schmickl R, Kolář F, Bomblies K, Paajanen P. De Novo Mutation and Rapid Protein (Co-)evolution during Meiotic Adaptation in Arabidopsis arenosa. Mol Biol Evol 2021; 38:1980-1994. [PMID: 33502506 PMCID: PMC8097281 DOI: 10.1093/molbev/msab001] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 02/07/2023]
Abstract
A sudden shift in environment or cellular context necessitates rapid adaptation. A dramatic example is genome duplication, which leads to polyploidy. In such situations, the waiting time for new mutations might be prohibitive; theoretical and empirical studies suggest that rapid adaptation will largely rely on standing variation already present in source populations. Here, we investigate the evolution of meiosis proteins in Arabidopsis arenosa, some of which were previously implicated in adaptation to polyploidy and, in a diploid lineage, to habitat. A striking and unexplained feature of prior results was the large number of amino acid changes in multiple interacting proteins, especially in the relatively young tetraploid. Here, we investigate whether selection on meiosis genes is found in other lineages, how the polyploid may have accumulated so many differences, and whether derived variants were selected from standing variation. We use a range-wide sample of 145 resequenced genomes of diploid and tetraploid A. arenosa, with new genome assemblies. We confirm signals of positive selection in the polyploid and diploid lineages in which they were previously reported and find additional meiosis genes with evidence of selection. We show that the polyploid lineage stands out both qualitatively and quantitatively. Compared with diploids, meiosis proteins in the polyploid have more amino acid changes and a higher proportion affecting more strongly conserved sites. We find evidence that in tetraploids, positive selection may have commonly acted on de novo mutations. Several tests provide hints that coevolution, and in some cases, multinucleotide mutations, might contribute to rapid accumulation of changes in meiotic proteins.
Collapse
Affiliation(s)
- Magdalena Bohutínská
- Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic.,Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic
| | - Vinzenz Handrick
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
| | - Levi Yant
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
| | - Roswitha Schmickl
- Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic
- Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic
| | - Filip Kolář
- Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic
- Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic
- Department of Botany, University of Innsbruck, Innsbruck, Austria
| | - Kirsten Bomblies
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- Plant Evolutionary Genetics, Department of Biology, Institute of Molecular Plant Biology, ETH Zürich, Zurich, Switzerland
| | - Pirita Paajanen
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
|
24
|
Norn C, André I, Theobald DL. A thermodynamic model of protein structure evolution explains empirical amino acid substitution matrices. Protein Sci 2021; 30:2057-2068. [PMID: 34218472 PMCID: PMC8442976 DOI: 10.1002/pro.4155] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/17/2021] [Revised: 06/25/2021] [Accepted: 06/29/2021] [Indexed: 12/30/2022]
Abstract
Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. These evolutionary pressures are sufficiently consistent over time and across protein families to produce substitution patterns, summarized in global amino acid substitution matrices such as BLOSUM, JTT, WAG, and LG, which can be used to successfully detect homologs, infer phylogenies, and reconstruct ancestral sequences. Although the factors that govern the variation of amino acid substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid substitution matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi‐nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex yet universal pattern observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary driver behind the global amino acid substitution patterns observed in proteins throughout the tree of life.
Affiliation(s)
- Christoffer Norn
- Biochemistry and Structural Biology, Lund University, Lund, Sweden
| | - Ingemar André
- Biochemistry and Structural Biology, Lund University, Lund, Sweden
| | - Douglas L Theobald
- Biochemistry Department, Brandeis University, Waltham, Massachusetts, USA
|
25
|
Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes. PLoS One 2021; 16:e0248337. [PMID: 33711070 PMCID: PMC7954308 DOI: 10.1371/journal.pone.0248337] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/08/2020] [Accepted: 02/24/2021] [Indexed: 01/03/2023]
Abstract
Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, such as alignment errors. Codon models continue to posit that only single-nucleotide changes have non-zero rates. Here, we develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using over 42,000 empirical alignments, we find widespread statistical support for multiple hits: 61% of alignments prefer models with 2H allowed, and 23% prefer models with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misspecification or alignment errors. Further modeling reveals that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios, may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package and in the Datamonkey.org server.
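As an illustration of the 1H/2H/3H distinction described in this abstract, the following minimal Python sketch counts how many nucleotide positions differ between two codons and applies that count to the two serine codon families (TCN and AGY) in the standard genetic code. It is not the HyPhy implementation; the names `hits`, `SERINE_TCN`, `SERINE_AGY`, and `island_jumps` are hypothetical and chosen only for this example.

```python
from itertools import product


def hits(codon_a: str, codon_b: str) -> int:
    """Return the number of positions (0 to 3) at which two codons differ."""
    if len(codon_a) != 3 or len(codon_b) != 3:
        raise ValueError("codons must be exactly three nucleotides long")
    return sum(a != b for a, b in zip(codon_a, codon_b))


# The two serine codon "islands" in the standard genetic code.
SERINE_TCN = ["TCT", "TCC", "TCA", "TCG"]
SERINE_AGY = ["AGT", "AGC"]


def island_jumps() -> dict:
    """Map each (TCN, AGY) serine codon pair to its number of differing positions."""
    return {(a, b): hits(a, b) for a, b in product(SERINE_TCN, SERINE_AGY)}


if __name__ == "__main__":
    for (a, b), n in island_jumps().items():
        print(f"{a} -> {b}: {n}H")
```

Every pair returned by `island_jumps()` differs at two or three positions, so a model restricted to 1H changes cannot move between the two serine islands in a single synonymous step.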
|
26
|
Walker CR, Scally A, De Maio N, Goldman N. Short-range template switching in great ape genomes explored using pair hidden Markov models. PLoS Genet 2021; 17:e1009221. [PMID: 33651813 PMCID: PMC7954356 DOI: 10.1371/journal.pgen.1009221] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/12/2020] [Revised: 03/12/2021] [Accepted: 02/10/2021] [Indexed: 12/14/2022]
Abstract
Many complex genomic rearrangements arise through template switch errors, which occur in DNA replication when there is a transient polymerase switch to an alternate template nearby in three-dimensional space. While typically investigated at kilobase-to-megabase scales, the genomic and evolutionary consequences of this mutational process are not well characterised at smaller scales, where they are often interpreted as clusters of independent substitutions, insertions and deletions. Here we present an improved statistical approach using pair hidden Markov models, and use it to detect and describe short-range template switches underlying clusters of mutations in the multi-way alignment of hominid genomes. Using robust statistics derived from evolutionary genomic simulations, we show that template switch events have been widespread in the evolution of the great apes’ genomes and provide a parsimonious explanation for the presence of many complex mutation clusters in their phylogenetic context. Larger-scale mechanisms of genome rearrangement are typically associated with structural features around breakpoints, and accordingly we show that atypical patterns of secondary structure formation and DNA bending are present at the initial template switch loci. Our methods improve on previous non-probabilistic approaches for computational detection of template switch mutations, allowing the statistical significance of events to be assessed. By specifying realistic evolutionary parameters based on the genomes and taxa involved, our methods can be readily adapted to other intra- or inter-species comparisons.
DNA replication is an imperfect process which causes the mutations that give rise to genetic diversity during the evolution of genomes. While many mutations are independent, single-nucleotide substitutions or small insertions and deletions, some mutations arise as nonindependent clusters of substitutions and larger scale chromosomal rearrangements. Large-scale rearrangements (also called structural variants) in particular can have a profound impact on genome evolution and contribute to both germline and somatic disease in humans. The replication-based mechanisms underlying structural variation typically involve a polymerase switch event in which a large segment of DNA is copied using a template from an alternate location in the genome. Methods for identifying these template switch mutations lack the power to detect smaller scale rearrangements which can arise through the same replication-based pathways. Here we outline a model which can detect and assess the statistical significance of such small-scale template switches within their evolutionary context. We show that these events are widespread in the evolution of great apes and that the genomic features associated with these small-scale rearrangements are similar to those of large-scale structural variants.
Affiliation(s)
- Conor R. Walker
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
| | - Aylwyn Scally
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
| | - Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
| | - Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
|
27
|
Taliun D, Harris DN, Kessler MD, Carlson J, Szpiech ZA, Torres R, Taliun SAG, Corvelo A, Gogarten SM, Kang HM, Pitsillides AN, LeFaive J, Lee SB, Tian X, Browning BL, Das S, Emde AK, Clarke WE, Loesch DP, Shetty AC, Blackwell TW, Smith AV, Wong Q, Liu X, Conomos MP, Bobo DM, Aguet F, Albert C, Alonso A, Ardlie KG, Arking DE, Aslibekyan S, Auer PL, Barnard J, Barr RG, Barwick L, Becker LC, Beer RL, Benjamin EJ, Bielak LF, Blangero J, Boehnke M, Bowden DW, Brody JA, Burchard EG, Cade BE, Casella JF, Chalazan B, Chasman DI, Chen YDI, Cho MH, Choi SH, Chung MK, Clish CB, Correa A, Curran JE, Custer B, Darbar D, Daya M, de Andrade M, DeMeo DL, Dutcher SK, Ellinor PT, Emery LS, Eng C, Fatkin D, Fingerlin T, Forer L, Fornage M, Franceschini N, Fuchsberger C, Fullerton SM, Germer S, Gladwin MT, Gottlieb DJ, Guo X, Hall ME, He J, Heard-Costa NL, Heckbert SR, Irvin MR, Johnsen JM, Johnson AD, Kaplan R, Kardia SLR, Kelly T, Kelly S, Kenny EE, Kiel DP, Klemmer R, Konkle BA, Kooperberg C, Köttgen A, Lange LA, Lasky-Su J, Levy D, Lin X, Lin KH, Liu C, Loos RJF, Garman L, Gerszten R, Lubitz SA, Lunetta KL, Mak ACY, Manichaikul A, Manning AK, Mathias RA, McManus DD, McGarvey ST, Meigs JB, Meyers DA, Mikulla JL, Minear MA, Mitchell BD, Mohanty S, Montasser ME, Montgomery C, Morrison AC, Murabito JM, Natale A, Natarajan P, Nelson SC, North KE, O'Connell JR, Palmer ND, Pankratz N, Peloso GM, Peyser PA, Pleiness J, Post WS, Psaty BM, Rao DC, Redline S, Reiner AP, Roden D, Rotter JI, Ruczinski I, Sarnowski C, Schoenherr S, Schwartz DA, Seo JS, Seshadri S, Sheehan VA, Sheu WH, Shoemaker MB, Smith NL, Smith JA, Sotoodehnia N, Stilp AM, Tang W, Taylor KD, Telen M, Thornton TA, Tracy RP, Van Den Berg DJ, Vasan RS, Viaud-Martinez KA, Vrieze S, Weeks DE, Weir BS, Weiss ST, Weng LC, Willer CJ, Zhang Y, Zhao X, Arnett DK, Ashley-Koch AE, Barnes KC, Boerwinkle E, Gabriel S, Gibbs R, Rice KM, Rich SS, Silverman EK, Qasba P, Gan W, Papanicolaou GJ, Nickerson DA, Browning SR, Zody MC, Zöllner S, 
Wilson JG, Cupples LA, Laurie CC, Jaquish CE, Hernandez RD, O'Connor TD, Abecasis GR. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 2021; 590:290-299. [PMID: 33568819 PMCID: PMC7875770 DOI: 10.1038/s41586-021-03205-y] [Citation(s) in RCA: 965] [Impact Index Per Article: 321.7] [Received: 03/06/2019] [Accepted: 01/07/2021] [Indexed: 02/08/2023]
Abstract
The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
Affiliation(s)
- Daniel Taliun
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Daniel N Harris
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Michael D Kessler
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Jedidiah Carlson
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Zachary A Szpiech
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA, USA
| | - Raul Torres
- Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, CA, USA
| | - Sarah A Gagliano Taliun
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | | | | | - Hyun Min Kang
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | | | - Jonathon LeFaive
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Seung-Been Lee
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Xiaowen Tian
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Brian L Browning
- Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA, USA
| | - Sayantan Das
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | | | | | - Douglas P Loesch
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Amol C Shetty
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Thomas W Blackwell
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Albert V Smith
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Quenna Wong
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Xiaoming Liu
- USF Genomics, College of Public Health, University of South Florida, Tampa, FL, USA
| | - Matthew P Conomos
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Dean M Bobo
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - François Aguet
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Alvaro Alonso
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| | | | - Dan E Arking
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | | | - Paul L Auer
- Zilber School of Public Health, University of Wisconsin Milwaukee, Milwaukee, WI, USA
| | | | - R Graham Barr
- Department of Medicine, Columbia University Medical Center, New York, NY, USA
- Department of Epidemiology, Columbia University Medical Center, New York, NY, USA
| | | | | | - Rebecca L Beer
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Emelia J Benjamin
- Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, Framingham, MA, USA
| | - Lawrence F Bielak
- Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - John Blangero
- Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Michael Boehnke
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Donald W Bowden
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Jennifer A Brody
- Department of Medicine, University of Washington, Seattle, WA, USA
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
| | - Esteban G Burchard
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Brian E Cade
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - James F Casella
- Department of Pediatrics, Johns Hopkins University, Baltimore, MD, USA
- Division of Pediatric Hematology, Johns Hopkins University, Baltimore, MD, USA
| | - Brandon Chalazan
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Daniel I Chasman
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Yii-Der Ida Chen
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Michael H Cho
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Mina K Chung
- Department of Cardiovascular Medicine, Heart & Vascular Institute, Cleveland Clinic, Cleveland, OH, USA
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA
| | - Clary B Clish
- Metabolomics Platform, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Adolfo Correa
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- Department of Pediatrics, University of Mississippi Medical Center, Jackson, MS, USA
- Department of Population Health Science, University of Mississippi Medical Center, Jackson, MS, USA
| | - Joanne E Curran
- Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
| | - Brian Custer
- Vitalant Research Institute, San Francisco, CA, USA
- Department of Laboratory Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Dawood Darbar
- Department of Medicine, University of Illinois at Chicago, Chicago, IL, USA
| | - Michelle Daya
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | | | - Dawn L DeMeo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Susan K Dutcher
- McDonnell Genome Institute, Washington University, St Louis, MO, USA
- Department of Genetics, Washington University, St Louis, MO, USA
| | - Patrick T Ellinor
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Leslie S Emery
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Celeste Eng
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Diane Fatkin
- Molecular Cardiology Division, Victor Chang Cardiac Research Institute, Darlinghurst, New South Wales, Australia
- Faculty of Medicine, University of New South Wales, Kensington, New South Wales, Australia
- Cardiology Department, St Vincent's Hospital, Darlinghurst, New South Wales, Australia
| | - Tasha Fingerlin
- National Jewish Health, Center for Genes, Environment and Health, Denver, CO, USA
| | - Lukas Forer
- Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Myriam Fornage
- Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Nora Franceschini
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
| | - Christian Fuchsberger
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
- Institute for Biomedicine, Eurac Research, Bolzano, Italy
| | - Stephanie M Fullerton
- Department of Bioethics & Humanities, University of Washington School of Medicine, Seattle, WA, USA
| | | | - Mark T Gladwin
- Pittsburgh Heart, Lung, Blood and Vascular Medicine Institute, University of Pittsburgh, Pittsburgh, PA, USA
- Pulmonary, Allergy and Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
| | - Daniel J Gottlieb
- VA Boston Healthcare System, Boston, MA, USA
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Michael E Hall
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
| | - Jiang He
- Department of Epidemiology, Tulane University, New Orleans, LA, USA
- Tulane University Translational Science Institute, Tulane University, New Orleans, LA, USA
| | - Nancy L Heard-Costa
- Framingham Heart Study, Framingham, MA, USA
- Department of Neurology, Boston University School of Medicine, Boston, MA, USA
| | - Susan R Heckbert
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Marguerite R Irvin
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Jill M Johnsen
- Department of Medicine, University of Washington, Seattle, WA, USA
- Bloodworks Northwest Research Institute, Seattle, WA, USA
| | - Andrew D Johnson
- Framingham Heart Study, Framingham, MA, USA
- Population Sciences Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Framingham, MA, USA
| | - Robert Kaplan
- Albert Einstein College of Medicine, New York, NY, USA
| | - Sharon L R Kardia
- Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Tanika Kelly
- Department of Epidemiology, Tulane University, New Orleans, LA, USA
| | - Shannon Kelly
- Department of Epidemiology, Vitalant Research Institute, San Francisco, CA, USA
- Department of Pediatrics, UCSF Benioff Children's Hospital, Oakland, CA, USA
- Division of Pediatric Hematology, UCSF Benioff Children's Hospital, Oakland, CA, USA
| | - Eimear E Kenny
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Douglas P Kiel
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Hinda and Arthur Marcus Institute for Aging Research, Hebrew SeniorLife, Boston, MA, USA
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Robert Klemmer
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Barbara A Konkle
- Department of Medicine, University of Washington, Seattle, WA, USA
- Bloodworks Northwest Research Institute, Seattle, WA, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Anna Köttgen
- Department of Epidemiology, Johns Hopkins University, Baltimore, MD, USA
- Institute of Genetic Epidemiology, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
| | - Leslie A Lange
- Department of Medicine, University of Colorado at Denver, Aurora, CO, USA
| | - Jessica Lasky-Su
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Brigham and Women's Hospital, Boston, MA, USA
| | - Daniel Levy
- Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Framingham Heart Study, Framingham, MA, USA
- Population Sciences Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Framingham, MA, USA
| | - Xihong Lin
- Biostatistics and Statistics, Harvard University, Boston, MA, USA
| | - Keng-Han Lin
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Chunyu Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Ruth J F Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Lori Garman
- Department of Genes and Human Disease, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
| | | | | | - Kathryn L Lunetta
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Angel C Y Mak
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Ani Manichaikul
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
| | - Alisa K Manning
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA
- Metabolism Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Rasika A Mathias
- Department of Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - David D McManus
- Cardiovascular Medicine, University of Massachusetts Medical School, Worcester, MA, USA
| | - Stephen T McGarvey
- International Health Institute, Brown University, Providence, RI, USA
- Department of Epidemiology, Brown University, Providence, RI, USA
- Department of Anthropology, Brown University, Providence, RI, USA
| | - James B Meigs
- Division of General Internal Medicine, Massachusetts General Hospital, Harvard Medical School, The Broad Institute of MIT and Harvard, Boston, MA, USA
| | | | - Julie L Mikulla
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Mollie A Minear
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Braxton D Mitchell
- Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD, USA
| | - Sanghamitra Mohanty
- Texas Cardiac Arrhythmia Institute, St David's Medical Center, Austin, TX, USA
- Department of Internal Medicine, Dell Medical School, Austin, TX, USA
| | - May E Montasser
- Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Courtney Montgomery
- Department of Genes and Human Disease, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
| | - Alanna C Morrison
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Joanne M Murabito
- Department of Medicine, Boston University School of Medicine, Boston, MA, USA
| | - Andrea Natale
- Texas Cardiac Arrhythmia Institute, St David's Medical Center, Austin, TX, USA
| | - Pradeep Natarajan
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Sarah C Nelson
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Kari E North
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
| | - Jeffrey R O'Connell
- Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Nicholette D Palmer
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Nathan Pankratz
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
| | - Gina M Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Patricia A Peyser
- Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Jacob Pleiness
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Wendy S Post
- Division of Cardiology, Department of Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Bruce M Psaty
- Department of Medicine, University of Washington, Seattle, WA, USA
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Department of Health Services, University of Washington, Seattle, WA, USA
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
| | - D C Rao
- Division of Biostatistics, Washington University in St Louis, St Louis, MO, USA
| | - Susan Redline
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Alexander P Reiner
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Dan Roden
- Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Chloé Sarnowski
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Sebastian Schoenherr
- Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | | | - Jeong-Sun Seo
- Precision Medicine Center, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
- Macrogen Inc, Seoul, Republic of Korea
- Gong Wu Genomic Medicine Institute, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| | - Sudha Seshadri
- Framingham Heart Study, Framingham, MA, USA
- Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, University of Texas Health Sciences Center at San Antonio, San Antonio, TX, USA
| | - Vivien A Sheehan
- Department of Pediatrics, Emory University School of Medicine, Atlanta, GA, USA
- Aflac Cancer and Blood Disorders Center, Children's Healthcare of Atlanta, Atlanta, GA, USA
| | - Wayne H Sheu
- Taichung Veterans General Hospital Taiwan, Taichung City, Taiwan
| | | | - Nicholas L Smith
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
- Seattle Epidemiologic Research and Information Center, Department of Veterans Affairs Office of Research and Development, Seattle, WA, USA
| | - Jennifer A Smith
- Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Nona Sotoodehnia
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
| | - Adrienne M Stilp
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Weihong Tang
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Kent D Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA, USA
| | | | | | - Russell P Tracy
- Department of Pathology & Laboratory Medicine, University of Vermont Larner College of Medicine, Burlington, VT, USA
| | - David J Van Den Berg
- Center for Genetic Epidemiology, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
| | - Ramachandran S Vasan
- Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Framingham Heart Study, Framingham, MA, USA
| | | | - Scott Vrieze
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA
| | - Daniel E Weeks
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Bruce S Weir
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Scott T Weiss
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Brigham and Women's Hospital, Boston, MA, USA
| | | | - Cristen J Willer
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Internal Medicine-Cardiology, University of Michigan, Ann Arbor, MI, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Yingze Zhang
- Pittsburgh Heart, Lung, Blood and Vascular Medicine Institute, University of Pittsburgh, Pittsburgh, PA, USA
- Pulmonary, Allergy and Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
| | - Xutong Zhao
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Donna K Arnett
- Department of Epidemiology, University of Kentucky, Lexington, KY, USA
| | - Allison E Ashley-Koch
- Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC, USA
| | - Kathleen C Barnes
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Eric Boerwinkle
- University of Texas Health Science Center at Houston, Houston, TX, USA
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Stacey Gabriel
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Richard Gibbs
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, USA
| | - Kenneth M Rice
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Stephen S Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
| | - Edwin K Silverman
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Pankaj Qasba
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Weiniu Gan
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - George J Papanicolaou
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Deborah A Nickerson
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Northwest Genomics Center, Seattle, WA, USA
- Brotman Baty Institute, Seattle, WA, USA
| | - Sharon R Browning
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | | | - Sebastian Zöllner
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
| | - James G Wilson
- Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS, USA
| | - L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA.
- Framingham Heart Study, Framingham, MA, USA.
| | - Cathy C Laurie
- Department of Biostatistics, University of Washington, Seattle, WA, USA.
| | - Cashell E Jaquish
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA.
| | - Ryan D Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA.
- Department of Human Genetics, McGill University, Montreal, Quebec, Canada.
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA, USA.
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA.
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA.
| | - Timothy D O'Connor
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
- Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA.
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA.
| | - Gonçalo R Abecasis
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA.
| |
|
28
|
Jones CT, Youssef N, Susko E, Bielawski JP. A Phenotype-Genotype Codon Model for Detecting Adaptive Evolution. Syst Biol 2021; 69:722-738. [PMID: 31730199 DOI: 10.1093/sysbio/syz075] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/07/2019] [Revised: 11/09/2019] [Accepted: 11/11/2019] [Indexed: 01/03/2023] Open
Abstract
A central objective in biology is to link adaptive evolution in a gene to structural and/or functional phenotypic novelties. Yet most analytic methods make inferences mainly from either phenotypic data or genetic data alone. A small number of models have been developed to infer correlations between the rate of molecular evolution and changes in a discrete or continuous life history trait. But such correlations are not necessarily evidence of adaptation. Here, we present a novel approach called the phenotype-genotype branch-site model (PG-BSM) designed to detect evidence of adaptive codon evolution associated with discrete-state phenotype evolution. An episode of adaptation is inferred under standard codon substitution models when there is evidence of positive selection in the form of an elevation in the nonsynonymous-to-synonymous rate ratio $\omega$ to a value $\omega > 1$. As it is becoming increasingly clear that $\omega > 1$ can occur without adaptation, the PG-BSM was formulated to infer an instance of adaptive evolution without appealing to evidence of positive selection. The null model makes use of a covarion-like component to account for general heterotachy (i.e., random changes in the evolutionary rate at a site over time). The alternative model employs samples of the phenotypic evolutionary history to test for phenomenological patterns of heterotachy consistent with specific mechanisms of molecular adaptation. These include 1) a persistent increase/decrease in $\omega$ at a site following a change in phenotype (the pattern) consistent with an increase/decrease in the functional importance of the site (the mechanism); and 2) a transient increase in $\omega$ at a site along a branch over which the phenotype changed (the pattern) consistent with a change in the site's optimal amino acid (the mechanism). 
Rejection of the null is followed by post hoc analyses to identify sites with strongest evidence for adaptation in association with changes in the phenotype as well as the most likely evolutionary history of the phenotype. Simulation studies based on a novel method for generating mechanistically realistic signatures of molecular adaptation show that the PG-BSM has good statistical properties. Analyses of real alignments show that site patterns identified post hoc are consistent with the specific mechanisms of adaptation included in the alternate model. Further simulation studies show that the covarion-like component of the PG-BSM plays a crucial role in mitigating recently discovered statistical pathologies associated with confounding by accounting for heterotachy-by-any-cause. [Adaptive evolution; branch-site model; confounding; mutation-selection; phenotype-genotype.].
Affiliation(s)
- Christopher T Jones
- Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
| | - Noor Youssef
- Department of Biology, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
| | - Edward Susko
- Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
- Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
| | - Joseph P Bielawski
- Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
- Department of Biology, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
- Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
| |
|
29
|
Srivastava K, Doescher A, Wagner FF, Flegel WA. NG_007494.1(RHD):c.[4A>T;5G>C;6_7insG] with an RhD-negative phenotype. Transfusion 2020; 60:E45-E47. [PMID: 33043462 DOI: 10.1111/trf.16115] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/06/2020] [Revised: 07/17/2020] [Accepted: 07/19/2020] [Indexed: 12/01/2022]
Affiliation(s)
- Kshitij Srivastava
- Department of Transfusion Medicine, NIH Clinical Center, National Institutes of Health, Bethesda, Maryland, USA
| | - Andrea Doescher
- DRK Blutspendedienst NSTOB, Institutes Springe and Bremen-Oldenburg, Springe, Germany
| | - Franz F Wagner
- DRK Blutspendedienst NSTOB, Institutes Springe and Bremen-Oldenburg, Springe, Germany
| | - Willy A Flegel
- Department of Transfusion Medicine, NIH Clinical Center, National Institutes of Health, Bethesda, Maryland, USA
| |
|
30
|
Mas-Ponte D, Supek F. DNA mismatch repair promotes APOBEC3-mediated diffuse hypermutation in human cancers. Nat Genet 2020; 52:958-968. [PMID: 32747826 PMCID: PMC7610516 DOI: 10.1038/s41588-020-0674-6] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 08/22/2019] [Accepted: 06/30/2020] [Indexed: 01/12/2023]
Abstract
Certain mutagens, including the APOBEC3 (A3) cytosine deaminase enzymes, can create multiple genetic changes in a single event. Activity of A3s results in striking 'mutation showers' occurring near DNA breakpoints; however, less is known about the mechanisms underlying the majority of A3 mutations. We classified the diverse patterns of clustered mutagenesis in tumor genomes, which identified a new A3 pattern: nonrecurrent, diffuse hypermutation (omikli). This mechanism occurs independently of the known focal hypermutation (kataegis), and is associated with activity of the DNA mismatch-repair pathway, which can provide the single-stranded DNA substrate needed by A3, and contributes to a substantial proportion of A3 mutations genome wide. Because mismatch repair is directed towards early-replicating, gene-rich chromosomal domains, A3 mutagenesis has a high propensity to generate impactful mutations, which exceeds that of other common carcinogens such as tobacco smoke and ultraviolet exposure. Cells direct their DNA repair capacity towards more important genomic regions; thus, carcinogens that subvert DNA repair can be remarkably potent.
Affiliation(s)
- David Mas-Ponte
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Fran Supek
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
| |
|
31
|
Wang Q, Pierce-Hoffman E, Cummings BB, Alföldi J, Francioli LC, Gauthier LD, Hill AJ, O'Donnell-Luria AH, Karczewski KJ, MacArthur DG. Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes. Nat Commun 2020; 11:2539. [PMID: 32461613 PMCID: PMC7253413 DOI: 10.1038/s41467-019-12438-5] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 04/02/2019] [Accepted: 09/09/2019] [Indexed: 12/31/2022] Open
Abstract
Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp distance of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms - CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions - on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs.
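As a rough illustration of why haplotype-aware annotation matters for the MNVs described above, the following minimal sketch translates a codon under each variant alone and under both together. The function names (`apply_snvs`, `mnv_effect`) and the example codon are hypothetical choices for illustration and are not taken from the authors' tools; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_snvs(codon, snvs):
    """Return the codon after applying (position, alt_base) substitutions."""
    bases = list(codon)
    for pos, alt in snvs:
        bases[pos] = alt
    return "".join(bases)


def mnv_effect(codon, snv_a, snv_b):
    """Hypothetical helper: amino acid for each SNV alone and for both together."""
    return {
        "ref": CODON_TABLE[codon],
        "a_only": CODON_TABLE[apply_snvs(codon, [snv_a])],
        "b_only": CODON_TABLE[apply_snvs(codon, [snv_b])],
        "combined": CODON_TABLE[apply_snvs(codon, [snv_a, snv_b])],
    }


# GAA (Glu): G>T at position 0 alone gives a stop codon (TAA),
# but together with A>C at position 1 the codon becomes TCA (Ser).
print(mnv_effect("GAA", (0, "T"), (1, "C")))
```

In this made-up case, annotating the two variants independently would report a nonsense variant, whereas the combined haplotype encodes a missense change.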
Affiliation(s)
- Qingbo Wang
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, 02115, USA
| | - Emma Pierce-Hoffman
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Beryl B Cummings
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Program in Biomedical and Biological Sciences, Harvard Medical School, Boston, MA, 02115, USA
| | - Jessica Alföldi
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Laurent C Francioli
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Laura D Gauthier
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Andrew J Hill
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
| | - Anne H O'Donnell-Luria
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Konrad J Karczewski
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Daniel G MacArthur
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA.
- Centre for Population Genomics, Garvan Institute of Medical Research, and UNSW Sydney, Sydney, Australia.
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Australia.
| |
|
32
|
Li C, Luscombe NM. Nucleosome positioning stability is a modulator of germline mutation rate variation across the human genome. Nat Commun 2020; 11:1363. [PMID: 32170069 PMCID: PMC7070026 DOI: 10.1038/s41467-020-15185-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/18/2019] [Accepted: 02/23/2020] [Indexed: 02/08/2023] Open
Abstract
Nucleosome organization has been suggested to affect local mutation rates in the genome. However, the lack of de novo mutation and high-resolution nucleosome data has limited the investigation of this hypothesis. Additionally, analyses using indirect mutation rate measurements have yielded contradictory and potentially confounding results. Here, we combine data on >300,000 human de novo mutations with high-resolution nucleosome maps and find substantially elevated mutation rates around translationally stable (‘strong’) nucleosomes. We show that the mutational mechanisms affected by strong nucleosomes are low-fidelity replication, insufficient mismatch repair and increased double-strand breaks. Strong nucleosomes preferentially locate within young SINE/LINE transposons, suggesting that when subject to increased mutation rates, transposons are then more rapidly inactivated. Depletion of strong nucleosomes in older transposons suggests frequent positioning changes during evolution. The findings have important implications for human genetics and genome evolution.
Affiliation(s)
- Cai Li
- The Francis Crick Institute, London, NW1 1AT, UK.
- School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China.
| | - Nicholas M Luscombe
- The Francis Crick Institute, London, NW1 1AT, UK
- Okinawa Institute of Science & Technology Graduate University, Okinawa, 904-0495, Japan
- UCL Genetics Institute, University College London, London, WC1E 6BT, UK
| |
|
33
|
The Tempo and Mode of Angiosperm Mitochondrial Genome Divergence Inferred from Intraspecific Variation in Arabidopsis thaliana. G3-GENES GENOMES GENETICS 2020; 10:1077-1086. [PMID: 31964685 PMCID: PMC7056966 DOI: 10.1534/g3.119.401023] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/23/2022]
Abstract
The mechanisms of sequence divergence in angiosperm mitochondrial genomes have long been enigmatic. In particular, it is difficult to reconcile the rapid divergence of intergenic regions that can make non-coding sequences almost unrecognizable even among close relatives with the unusually high levels of sequence conservation found in genic regions. It has been hypothesized that different mutation and repair mechanisms act on genic and intergenic sequences or alternatively that mutational input is relatively constant but that selection has strikingly different effects on these respective regions. To test these alternative possibilities, we analyzed mtDNA divergence within Arabidopsis thaliana, including variants from the 1001 Genomes Project and changes accrued in published mutation accumulation (MA) lines. We found that base-substitution frequencies are relatively similar for intergenic regions and synonymous sites in coding regions, whereas indel and nonsynonymous substitution rates are greatly depressed in coding regions, supporting a conventional model in which mutation/repair mechanisms are consistent throughout the genome but differentially filtered by selection. Most types of sequence and structural changes were undetectable in 10-generation MA lines, but we found significant shifts in relative copy number across mtDNA regions for lines grown under stressed vs. benign conditions. We confirmed quantitative variation in copy number across the A. thaliana mitogenome using both whole-genome sequencing and droplet digital PCR, further undermining the classic but oversimplified model of a circular angiosperm mtDNA structure. Our results suggest that copy number variation is one of the most fluid features of angiosperm mitochondrial genomes.
|
34
|
Satoh Y, Asakawa JI, Nishimura M, Kuo T, Shinkai N, Cullings HM, Minakuchi Y, Sese J, Toyoda A, Shimada Y, Nakamura N, Uchimura A. Characteristics of induced mutations in offspring derived from irradiated mouse spermatogonia and mature oocytes. Sci Rep 2020; 10:37. [PMID: 31913321 PMCID: PMC6949229 DOI: 10.1038/s41598-019-56881-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/13/2019] [Accepted: 12/18/2019] [Indexed: 01/07/2023] Open
Abstract
The exposure of germ cells to radiation introduces mutations in the genomes of offspring, and a previous whole-genome sequencing study indicated that the irradiation of mouse sperm induces insertions/deletions (indels) and multisite mutations (clustered single nucleotide variants and indels). However, the current knowledge on the mutation spectra is limited, and the effects of radiation exposure on germ cells at stages other than the sperm stage remain unknown. Here, we performed whole-genome sequencing experiments to investigate the exposure of spermatogonia and mature oocytes. We compared de novo mutations in a total of 24 F1 mice conceived before and after the irradiation of their parents. The results indicated that radiation exposure, 4 Gy of gamma rays, induced 9.6 indels and 2.5 multisite mutations in spermatogonia and 4.7 indels and 3.1 multisite mutations in mature oocytes in the autosomal regions of each F1 individual. Notably, we found two types of deletions, namely, small deletions (mainly 1 to 12 nucleotides) in non-repeat sequences, many of which showed microhomology at the breakpoint junction, and single-nucleotide deletions in mononucleotide repeat sequences. The results suggest that these deletions and multisite mutations could be a typical signature of mutations induced by parental irradiation in mammals.
Affiliation(s)
- Yasunari Satoh
- Department of Molecular Biosciences, Radiation Effects Research Foundation, 5-2 Hijiyama Park, Minami-ku, Hiroshima, 732-0815, Japan.
| | - Jun-Ichi Asakawa
- Department of Molecular Biosciences, Radiation Effects Research Foundation, 5-2 Hijiyama Park, Minami-ku, Hiroshima, 732-0815, Japan
| | - Mayumi Nishimura
- Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, 263-8555, Japan
| | - Tony Kuo
- Artificial Intelligence Research Center, AIST, 2-3-26 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Real World Big-Data Computation Open Innovation Laboratory, AIST-Tokyo Tech, 2-12-1 Okayama, Meguro-ku, Tokyo, 152-8550, Japan
| | - Norio Shinkai
- Artificial Intelligence Research Center, AIST, 2-3-26 Aomi, Koto-ku, Tokyo, 135-0064, Japan
| | - Harry M Cullings
- Department of Statistics, Radiation Effects Research Foundation, 5-2 Hijiyama Park, Minami-ku, Hiroshima, 732-0815, Japan
| | - Yohei Minakuchi
- Comparative Genomics Laboratory, National Institute of Genetics, Mishima, 411-8540, Japan
| | - Jun Sese
- Artificial Intelligence Research Center, AIST, 2-3-26 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Real World Big-Data Computation Open Innovation Laboratory, AIST-Tokyo Tech, 2-12-1 Okayama, Meguro-ku, Tokyo, 152-8550, Japan
- Humanome Lab, Inc., L-HUB 3F, 1-4, Shumomiyabi-cho, Sinjuku-ku, Tokyo, 162-0822, Japan
| | - Atsushi Toyoda
- Comparative Genomics Laboratory, National Institute of Genetics, Mishima, 411-8540, Japan
| | - Yoshiya Shimada
- Department of Radiological Sciences, Graduate School of Human Health Sciences, Tokyo Metropolitan University, Tokyo, 116-8551, Japan
- Executive Director, QST, Chiba, 263-8555, Japan
| | - Nori Nakamura
- Department of Molecular Biosciences, Radiation Effects Research Foundation, 5-2 Hijiyama Park, Minami-ku, Hiroshima, 732-0815, Japan
| | - Arikuni Uchimura
- Department of Molecular Biosciences, Radiation Effects Research Foundation, 5-2 Hijiyama Park, Minami-ku, Hiroshima, 732-0815, Japan.
| |
|
35
|
Belinky F, Sela I, Rogozin IB, Koonin EV. Crossing fitness valleys via double substitutions within codons. BMC Biol 2019; 17:105. [PMID: 31842858 PMCID: PMC6916188 DOI: 10.1186/s12915-019-0727-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/26/2019] [Accepted: 11/20/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Single nucleotide substitutions in protein-coding genes can be divided into synonymous (S), with little fitness effect, and non-synonymous (N) ones that alter amino acids and thus generally have a greater effect. Most of the N substitutions are affected by purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases potentially could alleviate the deleterious effect of single substitutions, making them subject to positive selection. To elucidate the effects of selection on double substitutions in all codons, it is critical to differentiate selection from mutational biases. RESULTS We addressed the evolutionary regimes of within-codon double substitutions in 37 groups of closely related prokaryotic genomes from diverse phyla by comparing the fractions of double substitutions within codons to those of the equivalent double S substitutions in adjacent codons. Under the assumption that substitutions occur one at a time, all within-codon double substitutions can be represented as "ancestral-intermediate-final" sequences (where "intermediate" refers to the first single substitution and "final" refers to the second substitution) and can be partitioned into four classes: (1) SS, S intermediate-S final; (2) SN, S intermediate-N final; (3) NS, N intermediate-S final; and (4) NN, N intermediate-N final. We found that the selective pressure on the second substitution markedly differs among these classes of double substitutions. Analogous to single S (synonymous) substitutions, SS double substitutions evolve neutrally, whereas analogous to single N (non-synonymous) substitutions, SN double substitutions are subject to purifying selection. In contrast, NS show positive selection on the second step because the original amino acid is recovered. 
The NN double substitutions are heterogeneous and can be subject to either purifying or positive selection, or evolve neutrally, depending on the amino acid similarity between the final or intermediate and the ancestral states. CONCLUSIONS The results of the present, comprehensive analysis of the evolutionary landscape of within-codon double substitutions reaffirm the largely conservative regime of protein evolution. However, the second step of a double substitution can be subject to positive selection when the first step is deleterious. Such positive selection can result in frequent crossing of valleys on the fitness landscape.
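The four-class partition described in this abstract can be illustrated with a minimal Python sketch. It assumes the standard genetic code and one reading of the abstract: both the intermediate and the final codon are labelled S or N relative to the ancestral codon, so that NS means the original amino acid is recovered. The function names `label` and `classify_paths` are hypothetical and are not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def label(ancestral, codon):
    """'S' if codon encodes the same residue as the ancestral codon, else 'N'."""
    return "S" if CODON_TABLE[codon] == CODON_TABLE[ancestral] else "N"


def classify_paths(ancestral, final):
    """List (intermediate codon, class) for both one-at-a-time paths.

    Assumption: each letter of the class compares that state with the
    ancestral codon, following the wording of the abstract.
    """
    changed = [i for i in range(3) if ancestral[i] != final[i]]
    if len(changed) != 2:
        raise ValueError("expected exactly two differing positions")
    paths = []
    for first in changed:
        # Apply only the first of the two changes to get the intermediate codon.
        intermediate = ancestral[:first] + final[first] + ancestral[first + 1:]
        paths.append((intermediate, label(ancestral, intermediate) + label(ancestral, final)))
    return paths


if __name__ == "__main__":
    for anc, fin in [("TTA", "CTG"), ("AGT", "TCT"), ("TTA", "TGG")]:
        print(anc, "->", fin, classify_paths(anc, fin))
```

The serine pair AGT/TCT is a convenient NS example: either intermediate (TGT or ACT) changes the amino acid, and the second step restores serine. A single pair of codons can belong to different classes depending on the order of the two changes, which is why the sketch returns both paths.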
Affiliation(s)
- Frida Belinky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Itamar Sela
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
| |
|
36
|
Gagunashvili AN, Ocaka L, Kelberman D, Munot P, Bacchelli C, Beales PL, Ganesan V. Novel missense variants in the RNF213 gene from a European family with Moyamoya disease. Hum Genome Var 2019; 6:35. [PMID: 31645973 PMCID: PMC6804521 DOI: 10.1038/s41439-019-0066-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/27/2019] [Revised: 06/06/2019] [Accepted: 06/21/2019] [Indexed: 01/30/2023] Open
Abstract
In this report, we present a European family with six individuals affected with Moyamoya disease (MMD). We detected two novel missense variants in the Moyamoya susceptibility gene RNF213, c.12553A>G (p.(Lys4185Glu)) and c.12562G>A (p.(Ala4188Thr)). Cosegregation of the variants with MMD, as well as a previous report of a variant affecting the same amino acid residue in unrelated MMD patients, supports the role of RNF213 in the pathogenesis of MMD.
Affiliation(s)
- Andrey N Gagunashvili
- GOSgene, Genetics and Genomic Medicine, UCL Great Ormond Street Institute of Child Health, London, UK
| | - Louise Ocaka
- GOSgene, Genetics and Genomic Medicine, UCL Great Ormond Street Institute of Child Health, London, UK
| | - Daniel Kelberman
- GOSgene, Genetics and Genomic Medicine, UCL Great Ormond Street Institute of Child Health, London, UK
| | - Pinki Munot
- Neurology Department, Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK
| | - Chiara Bacchelli
- GOSgene, Genetics and Genomic Medicine, UCL Great Ormond Street Institute of Child Health, London, UK
| | - Philip L Beales
- GOSgene, Genetics and Genomic Medicine, UCL Great Ormond Street Institute of Child Health, London, UK
| | - Vijeya Ganesan
- Neurology Department, Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK.,Clinical Neurosciences, UCL Great Ormond Street Institute of Child Health, London, UK
| |
|
37
|
Kaplanis J, Akawi N, Gallone G, McRae JF, Prigmore E, Wright CF, Fitzpatrick DR, Firth HV, Barrett JC, Hurles ME. Exome-wide assessment of the functional impact and pathogenicity of multinucleotide mutations. Genome Res 2019; 29:1047-1056. [PMID: 31227601 PMCID: PMC6633265 DOI: 10.1101/gr.239756.118] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/22/2018] [Accepted: 05/24/2019] [Indexed: 01/25/2023]
Abstract
Approximately 2% of de novo single-nucleotide variants (SNVs) appear as part of clustered mutations that create multinucleotide variants (MNVs). MNVs are an important source of genomic variability as they are more likely to alter an encoded protein than a SNV, which has important implications in disease as well as evolution. Previous studies of MNVs have focused on their mutational origins and have not systematically evaluated their functional impact and contribution to disease. We identified 69,940 MNVs and 91 de novo MNVs in 6688 exome-sequenced parent–offspring trios from the Deciphering Developmental Disorders Study comprising families with severe developmental disorders. We replicated the previously described MNV mutational signatures associated with DNA polymerase zeta, an error-prone translesion polymerase, and the APOBEC family of DNA deaminases. We estimate the simultaneous MNV germline mutation rate to be 1.78 × 10⁻¹⁰ mutations per base pair per generation. We found that most MNVs within a single codon create a missense change that could not have been created by a SNV. MNV-induced missense changes were, on average, more physicochemically divergent, were more depleted in highly constrained genes (pLI ≥ 0.9), and were under stronger purifying selection compared with SNV-induced missense changes. We found that de novo MNVs were significantly enriched in genes previously associated with developmental disorders in affected children. This shows that MNVs can be more damaging than SNVs even when both induce missense changes, and are an important variant type to consider in relation to human disease.
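The statement that most within-codon MNVs create a missense change no SNV could produce can be made concrete with a small check. The following Python sketch assumes the standard genetic code; the helper names `snv_reachable_residues` and `mnv_missense_unreachable_by_snv` are hypothetical illustrations, not part of the authors' pipeline.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def snv_reachable_residues(codon):
    """Residues (or stop) encoded by every codon exactly one base away."""
    reachable = set()
    for i in range(3):
        for base in BASES:
            if base != codon[i]:
                reachable.add(CODON_TABLE[codon[:i] + base + codon[i + 1:]])
    return reachable


def mnv_missense_unreachable_by_snv(ref_codon, alt_codon):
    """True if a multi-base codon change is missense and no single-base
    change of the reference codon yields the same amino acid."""
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) < 2:
        raise ValueError("an MNV must change at least two bases of the codon")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    is_missense = alt_aa != ref_aa and "*" not in (ref_aa, alt_aa)
    return is_missense and alt_aa not in snv_reachable_residues(ref_codon)


if __name__ == "__main__":
    print(mnv_missense_unreachable_by_snv("GGG", "TTG"))  # Gly -> Leu
    print(mnv_missense_unreachable_by_snv("CTT", "ATA"))  # Leu -> Ile
```

In this sketch GGG to TTG (glycine to leucine) counts as an MNV-only missense change, whereas CTT to ATA (leucine to isoleucine) does not, because the single change CTT to ATT already encodes isoleucine.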
Affiliation(s)
- Joanna Kaplanis
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Nadia Akawi
- Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 9DU, United Kingdom
| | - Giuseppe Gallone
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Jeremy F McRae
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Elena Prigmore
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Caroline F Wright
- Institute of Biomedical and Clinical Science, University of Exeter Medical School, Exeter, EX2 5DW, United Kingdom
| | - David R Fitzpatrick
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
| | - Helen V Firth
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom.,Department of Clinical Genetics, Cambridge University Hospitals NHS Foundation Trust, Cambridge, CB2 0QQ, United Kingdom
| | - Jeffrey C Barrett
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Matthew E Hurles
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | | |
|
38
|
Prendergast JGD, Pugh C, Harris SE, Hume DA, Deary IJ, Beveridge A. Linked Mutations at Adjacent Nucleotides Have Shaped Human Population Differentiation and Protein Evolution. Genome Biol Evol 2019; 11:759-775. [PMID: 30689878 PMCID: PMC6424222 DOI: 10.1093/gbe/evz014] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Accepted: 01/18/2019] [Indexed: 02/06/2023] Open
Abstract
Despite the fundamental importance of single nucleotide polymorphisms (SNPs) to human evolution, there are still large gaps in our understanding of the forces that shape their distribution across the genome. SNPs have been shown to not be distributed evenly, with directly adjacent SNPs found unusually frequently. Why this is the case is unclear. We illustrate how neighboring SNPs that cannot be explained by a single mutation event (that we term here sequential dinucleotide mutations [SDMs]) are driven by distinct processes to SNPs and multinucleotide polymorphisms (MNPs). By studying variation across populations, including a novel cohort of 1,358 Scottish genomes, we show that SDMs are over twice as common as MNPs and, like SNPs, display distinct mutational spectra across populations. These biases are not only different to those observed among SNPs and MNPs but are also more divergent between human population groups. We show that the changes that make up SDMs are not independent and identify a distinct mutational profile, CA → CG → TG, that is observed an order of magnitude more often than expected from background SNP rates and the numbers of other SDMs involving the gain and deamination of CpG sites. Intriguingly, particular pathways through the amino acid code appear to have been favored relative to that expected from intergenic SDM rates and the occurrences of coding SNPs, and in particular those that lead to the creation of single codon amino acids. We finally present evidence that epistatic selection has potentially disfavored sequential nonsynonymous changes in the human genome.
Affiliation(s)
| | - Carys Pugh
- The Roslin Institute, The University of Edinburgh, Midlothian, United Kingdom.,Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, The University of Edinburgh, United Kingdom
| | - Sarah E Harris
- Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, The University of Edinburgh, United Kingdom.,Centre for Genomic and Experimental Medicine, MRC Institute of Genetics and Molecular Medicine, The University of Edinburgh, United Kingdom
| | - David A Hume
- Mater Research Institute-University of Queensland, Woolloongabba, Queensland, Australia
| | - Ian J Deary
- Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, The University of Edinburgh, United Kingdom
| | - Allan Beveridge
- Glasgow Polyomics, College of Medical, Veterinary and Life Science, University of Glasgow, United Kingdom
| |
|
39
|
Dunn KA, Kenney T, Gu H, Bielawski JP. Improved inference of site-specific positive selection under a generalized parametric codon model when there are multinucleotide mutations and multiple nonsynonymous rates. BMC Evol Biol 2019; 19:22. [PMID: 30642241 PMCID: PMC6332903 DOI: 10.1186/s12862-018-1326-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/08/2018] [Accepted: 12/11/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND An excess of nonsynonymous substitutions, over neutrality, is considered evidence of positive Darwinian selection. Inference for proteins often relies on estimation of the nonsynonymous to synonymous ratio (ω = dN/dS) within a codon model. However, to ease computational difficulties, ω is typically estimated assuming an idealized substitution process where (i) all nonsynonymous substitutions have the same rate (regardless of impact on organism fitness) and (ii) instantaneous double and triple (DT) nucleotide mutations have zero probability (despite evidence that they can occur). It follows that estimates of ω represent an imperfect summary of the intensity of selection, and that tests based on the ω > 1 threshold could be negatively impacted. RESULTS We developed a general-purpose parametric (GPP) modelling framework for codons. This novel approach allows specification of all possible instantaneous codon substitutions, including multiple nonsynonymous rates (MNRs) and instantaneous DT nucleotide changes. Existing codon models are specified as special cases of the GPP model. We use GPP models to implement likelihood ratio tests for ω > 1 that accommodate MNRs and DT mutations. Through both simulation and real data analysis, we find that failure to model MNRs and DT mutations reduces power in some cases and inflates false positives in others. False positives under traditional M2a and M8 models were very sensitive to DT changes. This was exacerbated by the choice of frequency parameterization (GY vs. MG), with rates sometimes > 90% under MG. By including MNRs and DT mutations, accuracy and power was greatly improved under the GPP framework. However, we also find that over-parameterized models can perform less well, and this can contribute to degraded performance of LRTs. CONCLUSIONS We suggest GPP models should be used alongside traditional codon models. 
Further, all codon models should be deployed within an experimental design that includes (i) assessing robustness to model assumptions, and (ii) investigation of non-standard behaviour of MLEs. As the goal of every analysis is to avoid false conclusions, more work is needed on model selection methods that consider both the increase in fit engendered by a model parameter and the degree to which that parameter is affected by un-modelled evolutionary processes.
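The likelihood ratio tests for ω > 1 mentioned in this abstract follow a generic pattern that a short Python sketch can show. The sketch assumes a nested comparison in which the alternative model adds two free parameters, with a chi-square distribution on 2 degrees of freedom as the reference; that reference is only an approximation when a parameter sits on a boundary. The log-likelihood values and the function name `lrt_pvalue_df2` are hypothetical and do not come from the paper.

```python
import math


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """Likelihood ratio statistic and p-value against chi-square with 2 df.

    For 2 degrees of freedom the chi-square survival function has the
    closed form exp(-x / 2), so no statistics library is needed.
    """
    # Clamp at zero: an optimiser can return a slightly lower lnL for the
    # larger model, which would otherwise give a negative statistic.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, math.exp(-stat / 2.0)


if __name__ == "__main__":
    # Hypothetical maximised log-likelihoods for a null model (no class of
    # sites with omega > 1) and an alternative that allows such a class.
    stat, p = lrt_pvalue_df2(lnl_null=-1000.0, lnl_alt=-995.0)
    print(f"2*delta lnL = {stat:.2f}, p = {p:.4g}")
```

A small p-value in such a test indicates only that the richer model fits better. As the abstract stresses, unmodelled processes such as double and triple nucleotide changes can produce the same signal, so the test result alone does not establish positive selection.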
Affiliation(s)
- Katherine A. Dunn
- Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
| | - Toby Kenney
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
| | - Hong Gu
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
| | - Joseph P. Bielawski
- Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Centre Comparative Genomics and Evolutionary Bioinformatics (CGEB) at Dalhousie University, Halifax, Canada
| |
|
40
|
Looking for Darwin in Genomic Sequences: Validity and Success Depends on the Relationship Between Model and Data. Methods Mol Biol 2019; 1910:399-426. [PMID: 31278672 DOI: 10.1007/978-1-4939-9074-0_13] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/01/2023]
Abstract
Codon substitution models (CSMs) are commonly used to infer the history of natural selection for a set of protein-coding sequences, often with the explicit goal of detecting the signature of positive Darwinian selection. However, the validity and success of CSMs used in conjunction with the maximum likelihood (ML) framework is sometimes challenged with claims that the approach might too often support false conclusions. In this chapter, we use a case study approach to identify four legitimate statistical difficulties associated with inference of evolutionary events using CSMs. These include: (1) model misspecification, (2) low information content, (3) the confounding of processes, and (4) phenomenological load, or PL. While past criticisms of CSMs can be connected to these issues, the historical critiques were often misdirected, or overstated, because they failed to recognize that the success of any model-based approach depends on the relationship between model and data. Here, we explore this relationship and provide a candid assessment of the limitations of CSMs to extract historical information from extant sequences. To aid in this assessment, we provide a brief overview of: (1) a more realistic way of thinking about the process of codon evolution framed in terms of population genetic parameters, and (2) a novel presentation of the ML statistical framework. We then divide the development of CSMs into two broad phases of scientific activity and show that the latter phase is characterized by increases in model complexity that can sometimes negatively impact inference of evolutionary mechanisms. Such problems are not yet widely appreciated by the users of CSMs. These problems can be avoided by using a model that is appropriate for the data; but, understanding the relationship between the data and a fitted model is a difficult task. 
We argue that the only way to properly understand that relationship is to perform in silico experiments using a generating process that can mimic the data as closely as possible. The mutation-selection modeling framework (MutSel) is presented as the basis of such a generating process. We contend that if complex CSMs continue to be developed for testing explicit mechanistic hypotheses, then additional analyses such as those described here (e.g., penalized LRTs and estimation of PL) will need to be applied alongside the more traditional inferential methods.
|
41
|
Multinucleotide mutations cause false inferences of lineage-specific positive selection. Nat Ecol Evol 2018; 2:1280-1288. [PMID: 29967485 PMCID: PMC6093625 DOI: 10.1038/s41559-018-0584-5] [Citation(s) in RCA: 88] [Impact Index Per Article: 14.7] [Received: 07/26/2017] [Accepted: 05/18/2018] [Indexed: 11/08/2022]
Abstract
Phylogenetic tests of adaptive evolution, such as the widely used branch-site test, assume that nucleotide substitutions occur singly and independently. But recent research has shown that errors at adjacent sites often occur during DNA replication, and the resulting multinucleotide mutations (MNMs) are overwhelmingly likely to be nonsynonymous. We evaluated whether the branch-site test (BST) might misinterpret sequence patterns produced by MNMs as false support for positive selection. We analyzed two genome-scale datasets, one from mammals and one from flies, and found that codons with multiple differences account for virtually all the support for lineage-specific positive selection in the BST. Simulations under conditions derived from these alignments but without positive selection show that realistic rates of MNMs cause a strong and systematic bias towards false inferences of selection. This bias is sufficient under empirically derived conditions to produce false positive inferences as often as the branch-site test infers positive selection from the empirical data. Although some genes with BST-positive results may have evolved adaptively, the test cannot distinguish sequence patterns produced by authentic positive selection from those caused by neutral fixation of MNMs. Many published inferences of adaptive evolution using this technique may therefore be artifacts of model violation caused by unincorporated neutral mutational processes. We introduce a model that incorporates MNMs and may help to ameliorate this bias.
|
42
|
Rahman A, Hallgrímsdóttir I, Eisen M, Pachter L. Association mapping from sequencing reads using k-mers. eLife 2018; 7:e32920. [PMID: 29897334 PMCID: PMC6044908 DOI: 10.7554/elife.32920] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 10/18/2017] [Accepted: 06/08/2018] [Indexed: 01/05/2023] Open
Abstract
Genome wide association studies (GWAS) rely on microarrays, or more recently mapping of sequencing reads, to genotype individuals. The reliance on prior sequencing of a reference genome limits the scope of association studies, and also precludes mapping associations outside of the reference. We present an alignment free method for association studies of categorical phenotypes based on counting k-mers in whole-genome sequencing reads, testing for associations directly between k-mers and the trait of interest, and local assembly of the statistically significant k-mers to identify sequence differences. An analysis of the 1000 genomes data shows that sequences identified by our method largely agree with results obtained using the standard approach. However, unlike standard GWAS, our method identifies associations with structural variations and sites not present in the reference genome. We also demonstrate that population stratification can be inferred from k-mers. Finally, application to an E. coli dataset on ampicillin resistance validates the approach.
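The first step of the method described here, counting k-mers directly in sequencing reads, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: k-mers are collapsed with their reverse complements, windows containing ambiguous bases are skipped, and the final comparison is a naive presence/absence filter that stands in for the statistical association test used by the authors. The names `canonical`, `count_kmers`, and `case_only_kmers` are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers over all reads, skipping windows with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def case_only_kmers(case_counts, control_counts, min_count=2):
    """Naive filter (an assumption, not the paper's test): k-mers seen at
    least min_count times in cases and never in controls."""
    return sorted(
        kmer for kmer, n in case_counts.items()
        if n >= min_count and kmer not in control_counts
    )


if __name__ == "__main__":
    cases = count_kmers(["AAAC", "AAAC"], k=4)
    controls = count_kmers(["CCCC"], k=4)
    print(case_only_kmers(cases, controls))
```

Because the counting never consults a reference genome, k-mers that span insertions or other sequence absent from the reference are retained, which is the property the abstract highlights.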
Affiliation(s)
- Atif Rahman
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, United States
| | | | - Michael Eisen
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Howard Hughes Medical Institute, University of California, Berkeley, Berkeley, United States
| | - Lior Pachter
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, United States
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Department of Mathematics, University of California, Berkeley, Berkeley, United States
| |
|
43
|
Germline de novo mutation clusters arise during oocyte aging in genomic regions with high double-strand-break incidence. Nat Genet 2018; 50:487-492. [PMID: 29507425 DOI: 10.1038/s41588-018-0071-6] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 05/17/2017] [Accepted: 01/29/2018] [Indexed: 11/08/2022]
Abstract
Clustering of mutations has been observed in cancer genomes as well as for germline de novo mutations (DNMs). We identified 1,796 clustered DNMs (cDNMs) within whole-genome-sequencing data from 1,291 parent-offspring trios to investigate their patterns and infer a mutational mechanism. We found that the number of clusters on the maternal allele was positively correlated with maternal age and that these clusters consisted of more individual mutations with larger intermutational distances than those of paternal clusters. More than 50% of maternal clusters were located on chromosomes 8, 9 and 16, in previously identified regions with accelerated maternal mutation rates. Maternal clusters in these regions showed a distinct mutation signature characterized by C>G transversions. Finally, we found that maternal clusters were associated with processes involving double-strand-breaks (DSBs), such as meiotic gene conversions and de novo deletion events. This result suggested accumulation of DSB-induced mutations throughout oocyte aging as the mechanism underlying the formation of maternal mutation clusters.
|
44
|
Harris K. Reading the genome like a history book. Science 2017; 358:1265. [DOI: 10.1126/science.aar2003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Mathematical modeling sheds light on the evolution of human genetic variation
|
45
|
Assaf ZJ, Tilk S, Park J, Siegal ML, Petrov DA. Deep sequencing of natural and experimental populations of Drosophila melanogaster reveals biases in the spectrum of new mutations. Genome Res 2017; 27:1988-2000. [PMID: 29079675 PMCID: PMC5741049 DOI: 10.1101/gr.219956.116] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 12/19/2016] [Accepted: 10/20/2017] [Indexed: 11/25/2022]
Abstract
Mutations provide the raw material of evolution, and thus our ability to study evolution depends fundamentally on having precise measurements of mutational rates and patterns. We generate a data set for this purpose using (1) de novo mutations from mutation accumulation experiments and (2) extremely rare polymorphisms from natural populations. The first, mutation accumulation (MA) lines are the product of maintaining flies in tiny populations for many generations, therefore rendering natural selection ineffective and allowing new mutations to accrue in the genome. The second, rare genetic variation from natural populations allows the study of mutation because extremely rare polymorphisms are relatively unaffected by the filter of natural selection. We use both methods in Drosophila melanogaster, first generating our own novel data set of sequenced MA lines and performing a meta-analysis of all published MA mutations (∼2000 events) and then identifying a high quality set of ∼70,000 extremely rare (≤0.1%) polymorphisms that are fully validated with resequencing. We use these data sets to precisely measure mutational rates and patterns. Highlights of our results include: a high rate of multinucleotide mutation events at both short (∼5 bp) and long (∼1 kb) genomic distances, showing that mutation drives GC content lower in already GC-poor regions, and using our precise context-dependent mutation rates to predict long-term evolutionary patterns at synonymous sites. We also show that de novo mutations from independent MA experiments display similar patterns of single nucleotide mutation and well match the patterns of mutation found in natural populations.
Affiliation(s)
- Zoe June Assaf
- Department of Genetics, Stanford University, Stanford, California 94305, USA.,Department of Biology, Stanford University, Stanford, California 94305, USA
| | - Susanne Tilk
- Department of Biology, Stanford University, Stanford, California 94305, USA
| | - Jane Park
- Department of Biology, Stanford University, Stanford, California 94305, USA
| | - Mark L Siegal
- Department of Biology, New York University, New York, New York 10003, USA
| | - Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, California 94305, USA
| |
|
46
|
Abstract
microRNAs are currently believed to control a large diversity of physiologic processes, through the collective repression of thousands of target genes. Both experimental and computational analyses indeed suggest that each microRNA regulates tens or hundreds of genes. But some observations suggest that the phenotypic consequences of many published miRNA/mRNA interactions are dubious. For example, the reported amplitude of miRNA-guided repression is very small, while biologic processes tend to be robust to small changes in gene expression. We recently showed, on one particular miRNA, that for most predicted targets, miRNA-guided repression is even smaller than inter-individual variability among wild-type specimens. We also put forward several sources of computational false positives. These issues are generally neglected by the scientific community, probably resulting in the frequent publication of irreproducible or misinterpreted results regarding microRNA function. We propose novel types of analyses, easily accessible to the community, that could help improve microRNA target identification.
Affiliation(s)
- Hervé Seitz
- Institut de Génétique Humaine UMR 9002 CNRS-Université de Montpellier, 141, rue de la Cardonille, 34396 Montpellier CEDEX 5, France
| |
|
47
|
Löytynoja A, Goldman N. Short template switch events explain mutation clusters in the human genome. Genome Res 2017; 27:1039-1049. [PMID: 28385709 PMCID: PMC5453318 DOI: 10.1101/gr.214973.116] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 08/19/2016] [Accepted: 03/28/2017] [Indexed: 01/19/2023]
Abstract
Resequencing efforts are uncovering the extent of genetic variation in humans and provide data to study the evolutionary processes shaping our genome. One recurring puzzle in both intra- and inter-species studies is the high frequency of complex mutations comprising multiple nearby base substitutions or insertion-deletions. We devised a generalized mutation model of template switching during replication that extends existing models of genome rearrangement and used this to study the role of template switch events in the origin of short mutation clusters. Applied to the human genome, our model detects thousands of template switch events during the evolution of human and chimp from their common ancestor and hundreds of events between two independently sequenced human genomes. Although many of these are consistent with a template switch mechanism previously proposed for bacteria, our model also identifies new types of mutations that create short inversions, some flanked by paired inverted repeats. The local template switch process can create numerous complex mutation patterns, including hairpin loop structures, and explains multinucleotide mutations and compensatory substitutions without invoking positive selection, speculative mechanisms, or implausible coincidence. Clustered sequence differences are challenging for current mapping and variant calling methods, and we show that many erroneous variant annotations exist in human reference data. Local template switch events may have been neglected as an explanation for complex mutations because of biases in commonly used analyses. Incorporation of our model into reference-based analysis pipelines and comparisons of de novo assembled genomes will lead to improved understanding of genome variation and evolution.
Affiliation(s)
- Ari Löytynoja
- Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
| | - Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, United Kingdom
| |
|
48
|
Seplyarskiy VB, Andrianova MA, Bazykin GA. APOBEC3A/B-induced mutagenesis is responsible for 20% of heritable mutations in the TpCpW context. Genome Res 2016; 27:175-184. [PMID: 27940951 PMCID: PMC5287224 DOI: 10.1101/gr.210336.116] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 05/24/2016] [Accepted: 12/01/2016] [Indexed: 12/18/2022]
Abstract
APOBEC3A/B cytidine deaminase is responsible for the majority of cancerous mutations in a large fraction of cancer samples. However, its role in heritable mutagenesis remains very poorly understood. Recent studies have demonstrated that both in yeast and in human cancerous cells, most APOBEC3A/B-induced mutations occur on the lagging strand during replication and on the nontemplate strand of transcribed regions. Here, we use data on rare human polymorphisms, interspecies divergence, and de novo mutations to study germline mutagenesis and to analyze mutations at nucleotide contexts prone to attack by APOBEC3A/B. We show that such mutations occur preferentially on the lagging strand and on nontemplate strands of transcribed regions. Moreover, we demonstrate that APOBEC3A/B-like mutations tend to produce strand-coordinated clusters, which are also biased toward the lagging strand. Finally, we show that the mutation rate is increased 3' of C→G mutations to a greater extent than 3' of C→T mutations, suggesting pervasive trans-lesion bypass of the APOBEC3A/B-induced damage. Our study demonstrates that 20% of C→T and C→G mutations in the TpCpW context (where W denotes A or T) segregating as polymorphisms in the human population, or 1.4% of all heritable mutations, are attributable to APOBEC3A/B activity.
Affiliation(s)
- Vladimir B Seplyarskiy
- Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow 127994, Russia.,Pirogov Russian National Research Medical University, Moscow 117997, Russia
| | - Maria A Andrianova
- Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow 127994, Russia.,Pirogov Russian National Research Medical University, Moscow 117997, Russia.,Lomonosov Moscow State University, Moscow 119234, Russia
| | - Georgii A Bazykin
- Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Moscow 127994, Russia.,Pirogov Russian National Research Medical University, Moscow 117997, Russia.,Lomonosov Moscow State University, Moscow 119234, Russia.,Skolkovo Institute of Science and Technology, Skolkovo 143026, Russia
|
49
|
Besenbacher S, Sulem P, Helgason A, Helgason H, Kristjansson H, Jonasdottir A, Jonasdottir A, Magnusson OT, Thorsteinsdottir U, Masson G, Kong A, Gudbjartsson DF, Stefansson K. Multi-nucleotide de novo Mutations in Humans. PLoS Genet 2016; 12:e1006315. [PMID: 27846220 PMCID: PMC5147774 DOI: 10.1371/journal.pgen.1006315] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Received: 02/02/2016] [Accepted: 08/22/2016] [Indexed: 01/23/2023]
Abstract
Mutation of the DNA molecule is one of the most fundamental processes in biology. In this study, we use 283 parent-offspring trios to estimate the rate of mutation for both single nucleotide variants (SNVs) and short length variants (indels) in humans and examine the mutation process. We found 17,812 SNVs, corresponding to a mutation rate of 1.29 × 10⁻⁸ per position per generation (PPPG), and 1,282 indels, corresponding to a rate of 9.29 × 10⁻¹⁰ PPPG. We estimate that around 3% of human de novo SNVs are part of a multi-nucleotide mutation (MNM), with 558 (3.1%) of mutations positioned less than 20 kb from another mutation in the same individual (median distance of 525 bp). The rate of de novo mutations is greater in late replicating regions (p = 8.29 × 10⁻¹⁹) and nearer recombination events (p = 0.0038) than elsewhere in the genome.
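The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' pipeline: it assumes the simple relation rate = events / (2 × trios × haploid sites surveyed), and the function name `implied_haploid_sites` is hypothetical. It back-calculates the number of positions the stated rates imply and recomputes the clustered-SNV fraction.

```python
# Counts and rates as quoted in the abstract.
n_trios = 283
n_snvs = 17_812
n_indels = 1_282
snv_rate = 1.29e-8     # per position per generation
indel_rate = 9.29e-10  # per position per generation
n_clustered = 558      # SNVs within 20 kb of another de novo mutation


def implied_haploid_sites(n_events, n_trios, rate):
    """Haploid positions surveyed per trio.

    ASSUMPTION (not stated in the abstract): rate = events / (2 * trios * sites),
    i.e. each offspring carries two transmitted haploid genomes.
    """
    return n_events / (2 * n_trios * rate)


clustered_fraction = n_clustered / n_snvs
snvs_per_trio = n_snvs / n_trios
sites_from_snvs = implied_haploid_sites(n_snvs, n_trios, snv_rate)
sites_from_indels = implied_haploid_sites(n_indels, n_trios, indel_rate)

print(f"clustered fraction: {clustered_fraction:.1%}")
print(f"de novo SNVs per trio: {snvs_per_trio:.1f}")
print(f"implied sites (SNVs):   {sites_from_snvs:.3e}")
print(f"implied sites (indels): {sites_from_indels:.3e}")
```

Under that assumption, the SNV and indel rates imply nearly the same number of surveyed positions (roughly 2.4 billion per haploid genome), and 558 of 17,812 reproduces the reported 3.1%, so the quoted numbers are mutually consistent.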
Affiliation(s)
- Agnar Helgason
- deCODE genetics/Amgen, Inc., Iceland; Department of Anthropology, University of Iceland, Iceland
- Hannes Helgason
- deCODE genetics/Amgen, Inc., Iceland; School of Engineering and Natural Sciences, University of Iceland, Iceland
- Unnur Thorsteinsdottir
- deCODE genetics/Amgen, Inc., Iceland; Faculty of Medicine, University of Iceland, Iceland
- Daniel F Gudbjartsson
- deCODE genetics/Amgen, Inc., Iceland; School of Engineering and Natural Sciences, University of Iceland, Iceland
- Kari Stefansson
- deCODE genetics/Amgen, Inc., Iceland; Faculty of Medicine, University of Iceland, Iceland
|
50
|
Novembre J, Peter BM. Recent advances in the study of fine-scale population structure in humans. Curr Opin Genet Dev 2016; 41:98-105. [PMID: 27662060 DOI: 10.1016/j.gde.2016.08.007] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 07/06/2016] [Revised: 08/18/2016] [Accepted: 08/24/2016] [Indexed: 01/17/2023]
Abstract
Empowered by modern genotyping and large samples, population structure can be accurately described and quantified even when it only explains a fraction of a percent of total genetic variance. This is especially relevant and interesting for humans, where fine-scale population structure can both confound disease-mapping studies and reveal the history of migration and divergence that shaped our species' diversity. Here we review notable recent advances in the detection, use, and understanding of population structure. Our work addresses multiple areas where substantial progress is being made: improved statistics and models for better capturing differentiation, admixture, and the spatial distribution of variation; computational speed-ups that allow methods to scale to modern data; and advances in haplotypic modeling that have wide ranging consequences for the analysis of population structure. We conclude by outlining four important open challenges: the limitations of discrete population models, uncertainty in individual origins, the incorporation of both fine-scale structure and ancient DNA in parametric models, and the development of efficient computational tools, particularly for haplotype-based methods.
Affiliation(s)
- John Novembre
- Department of Human Genetics, University of Chicago, IL 60636, United States; Department of Ecology and Evolutionary Biology, University of Chicago, IL 60636, United States
- Benjamin M Peter
- Department of Human Genetics, University of Chicago, IL 60636, United States
|