1
Catania EM, Dubs NM, Soumen S, Barkman TJ. The Mutational Road not Taken: Using Ancestral Sequence Resurrection to Evaluate the Evolution of Plant Enzyme Substrate Preferences. Genome Biol Evol 2024; 16:evae016. [PMID: 38290535 PMCID: PMC10853004 DOI: 10.1093/gbe/evae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Accepted: 01/19/2024] [Indexed: 02/01/2024]
Abstract
We investigated the flowering plant salicylic acid methyl transferase (SAMT) enzyme lineage to understand the evolution of substrate preference change. Previous studies indicated that a single amino acid replacement to the SAMT active site (H150M) was sufficient to change ancestral enzyme substrate preference from benzoic acid to the structurally similar substrate, salicylic acid (SA). Yet, subsequent studies have shown that the H150M function-changing replacement did not likely occur during the historical episode of enzymatic divergence studied. Therefore, we reinvestigated the origin of SA methylation preference here and additionally assessed the extent to which epistasis may act to limit mutational paths. We found that the SAMT lineage of enzymes acquired preference to methylate SA from an ancestor that preferred to methylate benzoic acid as previously reported. In contrast, we found that a different amino acid replacement, Y267Q, was sufficient to change substrate preference with others providing small positive-magnitude epistatic improvements. We show that the kinetic basis for the ancestral enzymatic change in substrate preference by Y267Q appears to be due to both a reduced specificity constant, kcat/KM, for benzoic acid and an improvement in KM for SA. Therefore, this lineage of enzymes appears to have had multiple mutational paths available to achieve the same evolutionary divergence. While the reasons remain unclear for why one path was taken, and the other was not, the mutational distance between ancestral and descendant codons may be a factor.
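The closing remark about mutational distance can be made concrete with a small calculation. The sketch below is an illustration, not the authors' analysis: it uses the standard genetic code to compute the minimum number of nucleotide differences between any codon for one amino acid and any codon for another. The function names are hypothetical, and the actual ancestral and descendant codons are not given in the abstract, so the result is only a lower bound on the nucleotide changes each replacement requires.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def min_codon_distance(aa_from, aa_to):
    """Smallest number of nucleotide differences between any codon pair.

    Assumption: the true ancestral codon is unknown, so every synonymous
    codon is considered and the minimum is reported.
    """
    return min(
        sum(a != b for a, b in zip(c1, c2))
        for c1 in codons_for(aa_from)
        for c2 in codons_for(aa_to)
    )


if __name__ == "__main__":
    # The two replacements named in the abstract.
    for label, (old, new) in {"H150M": ("H", "M"), "Y267Q": ("Y", "Q")}.items():
        print(label, min_codon_distance(old, new))
```

Under the standard code, histidine to methionine needs at least three nucleotide changes, whereas tyrosine to glutamine needs at least two.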
Affiliation(s)
- Emily M Catania
  - Department of Biological Sciences, Western Michigan University, Kalamazoo, MI 49008, USA
- Nicole M Dubs
  - Department of Biological Sciences, Western Michigan University, Kalamazoo, MI 49008, USA
- Shejal Soumen
  - Department of Biological Sciences, Western Michigan University, Kalamazoo, MI 49008, USA
- Todd J Barkman
  - Department of Biological Sciences, Western Michigan University, Kalamazoo, MI 49008, USA
2
Feldmeyer B, Bornberg-Bauer E, Dohmen E, Fouks B, Heckenhauer J, Huylmans AK, Jones ARC, Stolle E, Harrison MC. Comparative Evolutionary Genomics in Insects. Methods Mol Biol 2024; 2802:473-514. [PMID: 38819569 DOI: 10.1007/978-1-0716-3838-5_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Genome sequencing quality, in terms of both read length and accuracy, is constantly improving. By combining long-read sequencing technologies with various scaffolding techniques, chromosome-level genome assemblies are now achievable at an affordable price for non-model organisms. Insects represent an exciting taxon for studying the genomic underpinnings of evolutionary innovations, due to ancient origins, immense species-richness, and broad phenotypic diversity. Here we summarize some of the most important methods for carrying out a comparative genomics study on insects. We describe available tools and offer concrete tips on all stages of such an endeavor from DNA extraction through genome sequencing, annotation, and several evolutionary analyses. Along the way we describe important insect-specific aspects, such as DNA extraction difficulties or gene families that are particularly difficult to annotate, and offer solutions. We describe results from several examples of comparative genomics analyses on insects to illustrate the fascinating questions that can now be addressed in this new age of genomics research.
Affiliation(s)
- Barbara Feldmeyer
  - Senckenberg Biodiversity and Climate Research Centre (SBiK-F), Molecular Ecology, Frankfurt, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
  - Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Elias Dohmen
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Bertrand Fouks
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Jacqueline Heckenhauer
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
  - Department of Terrestrial Zoology, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
- Ann Kathrin Huylmans
  - Institute of Organismic and Molecular Evolution, Johannes Gutenberg University, Mainz, Germany
- Alun R C Jones
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Eckart Stolle
  - Museum Koenig, Leibniz Institute for the Analysis of Biodiversity Change (LIB), Bonn, Germany
- Mark C Harrison
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
3
Lynch M, Ali F, Lin T, Wang Y, Ni J, Long H. The divergence of mutation rates and spectra across the Tree of Life. EMBO Rep 2023; 24:e57561. [PMID: 37615267 PMCID: PMC10561183 DOI: 10.15252/embr.202357561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 08/01/2023] [Accepted: 08/02/2023] [Indexed: 08/25/2023]
Abstract
Owing to advances in genome sequencing, genome stability has become one of the most scrutinized cellular traits across the Tree of Life. Despite its centrality to all things biological, the mutation rate (per nucleotide site per generation) ranges over three orders of magnitude among species and several-fold within individual phylogenetic lineages. Within all major organismal groups, mutation rates scale negatively with the effective population size of a species and with the amount of functional DNA in the genome. This relationship is most parsimoniously explained by the drift-barrier hypothesis, which postulates that natural selection typically operates to reduce mutation rates until further improvement is thwarted by the power of random genetic drift. Despite this constraint, the molecular mechanisms underlying DNA replication fidelity and repair are free to wander, provided the performance of the entire system is maintained at the prevailing level. The evolutionary flexibility of the mutation rate bears on the resolution of several prior conundrums in phylogenetic and population-genetic analysis and raises challenges for future applications in these areas.
Affiliation(s)
- Michael Lynch
  - Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ, USA
- Farhan Ali
  - Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ, USA
- Tongtong Lin
  - Institute of Evolution and Marine Biodiversity, KLMME, Ocean University of China, Qingdao, China
- Yaohai Wang
  - Institute of Evolution and Marine Biodiversity, KLMME, Ocean University of China, Qingdao, China
- Jiahao Ni
  - Institute of Evolution and Marine Biodiversity, KLMME, Ocean University of China, Qingdao, China
- Hongan Long
  - Institute of Evolution and Marine Biodiversity, KLMME, Ocean University of China, Qingdao, China
4
Gitschlag BL, Cano AV, Payne JL, McCandlish DM, Stoltzfus A. Mutation and Selection Induce Correlations between Selection Coefficients and Mutation Rates. Am Nat 2023; 202:534-557. [PMID: 37792926 DOI: 10.1086/726014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/06/2023]
Abstract
The joint distribution of selection coefficients and mutation rates is a key determinant of the genetic architecture of molecular adaptation. Three different distributions are of immediate interest: (1) the "nominal" distribution of possible changes, prior to mutation or selection; (2) the "de novo" distribution of realized mutations; and (3) the "fixed" distribution of selectively established mutations. Here, we formally characterize the relationships between these joint distributions under the strong-selection/weak-mutation (SSWM) regime. The de novo distribution is enriched relative to the nominal distribution for the highest rate mutations, and the fixed distribution is further enriched for the most highly beneficial mutations. Whereas mutation rates and selection coefficients are often assumed to be uncorrelated, we show that even with no correlation in the nominal distribution, the resulting de novo and fixed distributions can have correlations with any combination of signs. Nonetheless, we suggest that natural systems with a finite number of beneficial mutations will frequently have the kind of nominal distribution that induces negative correlations in the fixed distribution. We apply our mathematical framework, along with population simulations, to explore joint distributions of selection coefficients and mutation rates from deep mutational scanning and cancer informatics. Finally, we consider the evolutionary implications of these joint distributions together with two additional joint distributions relevant to parallelism and the rate of adaptation.
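The relationship among the three distributions can be illustrated numerically. The sketch below is a minimal illustration, not the authors' framework. It assumes a small, made-up set of beneficial mutations, weights each by its mutation rate to obtain the de novo distribution, and weights further by the selection coefficient to obtain the fixed distribution, using the common approximation that the fixation probability of a new beneficial mutation is proportional to s (roughly 2s). The function name and the example values are hypothetical.

```python
def sswm_distributions(nominal):
    """Derive de novo and fixed distributions from a nominal set.

    nominal: list of (mutation_rate, selection_coefficient) pairs, one per
    possible beneficial change, each counted once in the nominal distribution.
    Assumption: fixation probability is proportional to s, so the constant
    factor cancels when the weights are normalized.
    """
    total_rate = sum(mu for mu, _ in nominal)
    # De novo: each change appears in proportion to its mutation rate.
    de_novo = [mu / total_rate for mu, _ in nominal]

    # Fixed: rate of origin times probability of fixation (proportional to s).
    weights = [mu * s for mu, s in nominal]
    total_weight = sum(weights)
    fixed = [w / total_weight for w in weights]
    return de_novo, fixed


if __name__ == "__main__":
    # Hypothetical values: a baseline change, a high-rate change,
    # and a highly beneficial change.
    nominal = [(1e-9, 0.01), (4e-9, 0.01), (1e-9, 0.04)]
    de_novo, fixed = sswm_distributions(nominal)
    for (mu, s), p_new, p_fix in zip(nominal, de_novo, fixed):
        print(f"mu={mu:.0e} s={s:.2f} de_novo={p_new:.3f} fixed={p_fix:.3f}")
```

In this toy example the high-rate change dominates the de novo distribution, while the highly beneficial change gains share in the fixed distribution, which matches the two enrichment steps described in the abstract.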
5
Suárez-Menéndez M, Bérubé M, Furni F, Rivera-León VE, Heide-Jørgensen MP, Larsen F, Sears R, Ramp C, Eriksson BK, Etienne RS, Robbins J, Palsbøll PJ. Wild pedigrees inform mutation rates and historic abundance in baleen whales. Science 2023; 381:990-995. [PMID: 37651509 DOI: 10.1126/science.adf2160] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/12/2022] [Accepted: 07/25/2023] [Indexed: 09/02/2023]
Abstract
Phylogeny-based estimates suggesting a low germline mutation rate (μ) in baleen whales have influenced research ranging from assessments of whaling impacts to evolutionary cancer biology. We estimated μ directly from pedigrees in four baleen whale species for both the mitochondrial control region and nuclear genome. The results suggest values higher than those obtained through phylogeny-based estimates and similar to pedigree-based values for primates and toothed whales. Applying our estimate of μ reduces previous genetic-based estimates of preexploitation whale abundance by 86% and suggests that μ cannot explain low cancer rates in gigantic mammals. Our study shows that it is feasible to estimate μ directly from pedigrees in natural populations, with wide-ranging implications for ecological and evolutionary research.
Affiliation(s)
- Marcos Suárez-Menéndez
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, Netherlands
- Martine Bérubé
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, Netherlands
  - Center for Coastal Studies, Provincetown, MA, USA
- Fabrício Furni
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, Netherlands
- Vania E Rivera-León
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, Netherlands
- Finn Larsen
  - National Institute of Aquatic Resources, Kongens Lyngby, Denmark
- Richard Sears
  - Mingan Island Cetacean Study Inc., St. Lambert, Quebec, Canada
- Christian Ramp
  - Mingan Island Cetacean Study Inc., St. Lambert, Quebec, Canada
  - Scottish Oceans Institute, University of St. Andrews, St. Andrews, UK
- Britas Klemens Eriksson
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, Netherlands
- Rampal S Etienne
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, Netherlands
- Per J Palsbøll
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, Netherlands
  - Center for Coastal Studies, Provincetown, MA, USA
6
Lucaci AG, Zehr JD, Enard D, Thornton JW, Kosakovsky Pond SL. Evolutionary Shortcuts via Multinucleotide Substitutions and Their Impact on Natural Selection Analyses. Mol Biol Evol 2023; 40:msad150. [PMID: 37395787 PMCID: PMC10336034 DOI: 10.1093/molbev/msad150] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/27/2023] [Revised: 06/15/2023] [Accepted: 06/26/2023] [Indexed: 07/04/2023]
Abstract
Inference and interpretation of evolutionary processes, in particular of the types and targets of natural selection affecting coding sequences, are critically influenced by the assumptions built into statistical models and tests. If certain aspects of the substitution process (even when they are not of direct interest) are presumed absent or are modeled with too crude of a simplification, estimates of key model parameters can become biased, often systematically, and lead to poor statistical performance. Previous work established that failing to accommodate multinucleotide (or multihit, MH) substitutions strongly biases dN/dS-based inference towards false-positive inferences of diversifying episodic selection, as does failing to model variation in the rate of synonymous substitution (SRV) among sites. Here, we develop an integrated analytical framework and software tools to simultaneously incorporate these sources of evolutionary complexity into selection analyses. We found that both MH and SRV are ubiquitous in empirical alignments, and incorporating them has a strong effect on whether or not positive selection is detected (1.4-fold reduction) and on the distributions of inferred evolutionary rates. With simulation studies, we show that this effect is not attributable to reduced statistical power caused by using a more complex model. After a detailed examination of 21 benchmark alignments and a new high-resolution analysis showing which parts of the alignment provide support for positive selection, we show that MH substitutions occurring along shorter branches in the tree explain a significant fraction of discrepant results in selection detection. Our results add to the growing body of literature which examines decades-old modeling assumptions (including MH) and finds them to be problematic for comparative genomic data analysis. 
Because multinucleotide substitutions have a significant impact on natural selection detection even at the level of an entire gene, we recommend that selection analyses of this type consider their inclusion as a matter of routine. To facilitate this procedure, we developed, implemented, and benchmarked a simple and well-performing model testing selection detection framework able to screen an alignment for positive selection with two biologically important confounding processes: site-to-site synonymous rate variation, and multinucleotide instantaneous substitutions.
Affiliation(s)
- Alexander G Lucaci
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- Jordan D Zehr
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- David Enard
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona
- Joseph W Thornton
  - Department of Human Genetics, University of Chicago, Chicago, Illinois
  - Department of Ecology & Evolution, University of Chicago, Chicago, Illinois
7
Cano AV, Gitschlag BL, Rozhoňová H, Stoltzfus A, McCandlish DM, Payne JL. Mutation bias and the predictability of evolution. Philos Trans R Soc Lond B Biol Sci 2023; 378:20220055. [PMID: 37004719 PMCID: PMC10067271 DOI: 10.1098/rstb.2022.0055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2023]
Abstract
Predicting evolutionary outcomes is an important research goal in a diversity of contexts. The focus of evolutionary forecasting is usually on adaptive processes, and efforts to improve prediction typically focus on selection. However, adaptive processes often rely on new mutations, which can be strongly influenced by predictable biases in mutation. Here, we provide an overview of existing theory and evidence for such mutation-biased adaptation and consider the implications of these results for the problem of prediction, in regard to topics such as the evolution of infectious diseases, resistance to biochemical agents, as well as cancer and other kinds of somatic evolution. We argue that empirical knowledge of mutational biases is likely to improve in the near future, and that this knowledge is readily applicable to the challenges of short-term prediction. This article is part of the theme issue 'Interdisciplinary approaches to predicting evolutionary biology'.
Affiliation(s)
- Alejandro V Cano
  - Institute of Integrative Biology, ETH Zurich, 8092 Zurich, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Bryan L Gitschlag
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Hana Rozhoňová
  - Institute of Integrative Biology, ETH Zurich, 8092 Zurich, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Arlin Stoltzfus
  - Office of Data and Informatics, Material Measurement Laboratory, National Institute of Standards and Technology, Rockville, MD 20899, USA
  - Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- David M McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Joshua L Payne
  - Institute of Integrative Biology, ETH Zurich, 8092 Zurich, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
8
Fan W, Liu F, Jia Q, Du H, Chen W, Ruan J, Lei J, Li DZ, Mower JP, Zhu A. Fragaria mitogenomes evolve rapidly in structure but slowly in sequence and incur frequent multinucleotide mutations mediated by microinversions. New Phytologist 2022; 236:745-759. [PMID: 35731093 DOI: 10.1111/nph.18334] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/02/2022] [Accepted: 06/16/2022] [Indexed: 06/15/2023]
Abstract
Plant mitochondrial DNA has been described as evolving rapidly in structure but slowly in sequence. However, many of the noncoding portions of plant mitogenomes are not homologous among species, raising questions about the rate and spectrum of mutations in noncoding regions. Recent studies have suggested that the lack of homology in noncoding regions could be due to increased sequence divergence. We compared 30 kb of coding and 200 kb of noncoding DNA from 13 sequenced Fragaria mitogenomes, followed by analysis of the rate of sequence divergence, microinversion events and structural variations. Substitution rates in synonymous sites and nongenic sites are nearly identical, suggesting that the genome-wide point mutation rate is generally consistent. A surprisingly high number of large multinucleotide substitutions were detected in Fragaria mitogenomes, which may have resulted from microinversion events and could affect phylogenetic signal and local rate estimates. Fragaria mitogenomes preferentially accumulate deletions relative to insertions and substantial genomic arrangements, whereas mutation rates could positively associate with these sequence and structural changes among species. Together, these observations suggest that plant mitogenomes exhibit low point mutations genome-wide but exceptionally high structural variations, and our results favour a gain-and-loss model for the rapid loss of homology among plant mitogenomes.
Affiliation(s)
- Weishu Fan
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Fang Liu
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Qiaoya Jia
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
  - School of Life Sciences, Yunnan University, Kunming, Yunnan, 650500, China
- Haiyuan Du
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Wu Chen
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Jiwei Ruan
  - Flower Research Institute, Yunnan Academy of Agricultural Sciences, Kunming, Yunnan, 650205, China
- Jiajun Lei
  - College of Horticulture, Shenyang Agricultural University, Shenyang, Liaoning, 110866, China
- De-Zhu Li
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Jeffrey P Mower
  - Center for Plant Science Innovation, University of Nebraska, Lincoln, NE, 68588, USA
  - Department of Agronomy and Horticulture, University of Nebraska, Lincoln, NE, 68583, USA
- Andan Zhu
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
9
Hasan AR, Lachapelle J, El-Shawa SA, Potjewyd R, Ford SA, Ness RW. Salt stress alters the spectrum of de novo mutation available to selection during experimental adaptation of Chlamydomonas reinhardtii. Evolution 2022; 76:2450-2463. [PMID: 36036481 DOI: 10.1111/evo.14604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 08/12/2022] [Indexed: 01/22/2023]
Abstract
The genetic basis of adaptation is driven by both selection and the spectrum of available mutations. Given that the rate of mutation is not uniformly distributed across the genome and varies depending on the environment, understanding the signatures of selection across the genome is aided by first establishing what the expectations of genetic change are from mutation. To determine the interaction between salt stress, selection, and mutation across the genome, we compared mutations observed in a selection experiment for salt tolerance in Chlamydomonas reinhardtii to those observed in mutation accumulation (MA) experiments with and without salt exposure. MA lines evolved under salt stress had a single-nucleotide mutation rate of $1.1 \times 10^{-9}$, similar to that of MA lines under standard conditions ($9.6 \times 10^{-10}$). However, we found that salt stress led to an increased rate of indel mutations, but that many of these mutations were removed under selection. Finally, lines adapted to salt also showed excess clustering of mutations in the genome and the co-expression network, suggesting a role for positive selection in retaining mutations in particular compartments of the genome during the evolution of salt tolerance. Our study shows that characterizing mutation rates and spectra expected under stress helps disentangle the effects of environment and selection during adaptation.
Affiliation(s)
- Ahmed R Hasan
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
  - Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Josianne Lachapelle
  - Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Sara A El-Shawa
  - Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
  - Department of Mathematical and Computational Sciences, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Roman Potjewyd
  - Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Scott A Ford
  - Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
- Rob W Ness
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
  - Department of Biology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6, Canada
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, M5S 3B2, Canada
10
Mohiuddin M, Kooy RF, Pearson CE. De novo mutations, genetic mosaicism and human disease. Front Genet 2022; 13:983668. [PMID: 36226191 PMCID: PMC9550265 DOI: 10.3389/fgene.2022.983668] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/01/2022] [Accepted: 09/08/2022] [Indexed: 11/23/2022]
Abstract
Mosaicism, the existence of genetically distinct populations of cells in a particular organism, is an important cause of genetic disease. Mosaicism can appear as de novo DNA mutations, epigenetic alterations of DNA, and chromosomal abnormalities. Neurodevelopmental or neuropsychiatric diseases, including autism, often arise from de novo mutations that are usually not present in either of the parents. De novo mutations might occur as early as in the parental germline, during embryonic or fetal development, and/or post-natally, through ageing and life. Mutation timing could lead to a mutation burden ranging from less than heterozygosity to approaching homozygosity. Developmental timing of somatic mutation attainment will affect the mutation load and distribution throughout the body. In this review, we discuss the timing of de novo mutations, spanning from mutations in the germ lineage (all ages), to post-zygotic, embryonic, fetal, and post-natal events, through aging to death. These factors can determine the tissue-specific distribution and load of de novo mutations, which can affect disease. The disease threshold burden of somatic de novo mutations of a particular gene in any tissue will be important to define.
Affiliation(s)
- Mohiuddin Mohiuddin
  - Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - *Correspondence: Mohiuddin Mohiuddin; Christopher E. Pearson
- R. Frank Kooy
  - Department of Medical Genetics, University of Antwerp, Edegem, Belgium
- Christopher E. Pearson
  - Program of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - *Correspondence: Mohiuddin Mohiuddin; Christopher E. Pearson
11
Belinky F, Bykova A, Yurchenko V, Rogozin IB. No evidence for widespread positive selection on double substitutions within codons in primates and yeasts. Front Genet 2022; 13:991249. [PMID: 36159983 PMCID: PMC9500374 DOI: 10.3389/fgene.2022.991249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 08/29/2022] [Indexed: 11/13/2022]
Abstract
Nucleotide substitutions in protein-coding genes can be divided into synonymous (S) and non-synonymous (N) ones that alter amino acids (including nonsense mutations causing stop codons). The S substitutions are expected to have little effect on function. The N substitutions are almost always affected by strong purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases can modulate the deleterious effect of single N substitutions and, thus, could be subject to positive selection. This effect has been demonstrated for mutations in the serine codons, stop codons and double N substitutions in prokaryotes. In all of the abovementioned cases, a novel technique was applied that allows elucidating the effects of selection on double substitutions while considering mutational biases. Here, we applied the same technique to study double N substitutions in eukaryotic lineages of primates and yeast. We identified markedly fewer cases of purifying selection relative to prokaryotes and no evidence of codon double substitutions under positive selection. This is consistent with previous studies of serine codons in primates and yeast. In general, the obtained results strongly suggest that there are major differences between studied pro- and eukaryotes; double substitutions in primates and yeasts largely reflect mutational biases and are not hallmarks of selection. This is especially important in the context of detection of positive selection in codons because it has been suggested that multiple mutations in codons cause false inferences of lineage-specific site positive selection. It is likely that this concern is applicable to previously studied prokaryotes but not to primates and yeasts where markedly fewer double substitutions are affected by positive selection.
Affiliation(s)
- Frida Belinky
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Anastassia Bykova
  - Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
- Vyacheslav Yurchenko
  - Life Science Research Centre, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
  - *Correspondence: Vyacheslav Yurchenko; Igor B. Rogozin
- Igor B. Rogozin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
  - *Correspondence: Vyacheslav Yurchenko; Igor B. Rogozin
12
Matsen FA, Ralph PL. Enabling Inference for Context-Dependent Models of Mutation by Bounding the Propagation of Dependency. J Comput Biol 2022; 29:802-824. [PMID: 35776513 PMCID: PMC9419934 DOI: 10.1089/cmb.2021.0644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Although the rates at which positions in the genome mutate are known to depend not only on the nucleotide to be mutated, but also on neighboring nucleotides, it remains challenging to do phylogenetic inference using models of context-dependent mutation. In these models, the effects of one mutation may in principle propagate to faraway locations, making it difficult to compute exact likelihoods. This article shows how to use bounds on the propagation of dependency to compute likelihoods of mutation of a given segment of genome by marginalizing over sufficiently long flanking sequence. This can be used for maximum likelihood or Bayesian inference. Protocols examining residuals and iterative model refinement are also discussed. Tools for efficiently working with these models are provided in an R package, which could be used in other applications. The method is used to examine context dependence of mutations since the common ancestor of humans and chimpanzee.
Affiliation(s)
- Frederick A. Matsen
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
  - Department of Statistics, University of Washington, Seattle, Washington, USA
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, USA
- Peter L. Ralph
  - Departments of Biology and Mathematics, Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, USA
|
13
|
Póti Á, Szikriszt B, Gervai JZ, Chen D, Szüts D. Characterisation of the spectrum and genetic dependence of collateral mutations induced by translesion DNA synthesis. PLoS Genet 2022; 18:e1010051. [PMID: 35130276 PMCID: PMC8870599 DOI: 10.1371/journal.pgen.1010051] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/06/2021] [Revised: 02/24/2022] [Accepted: 01/21/2022] [Indexed: 11/18/2022] Open
Abstract
Translesion DNA synthesis (TLS) is a fundamental damage bypass pathway that utilises specialised polymerases with relaxed template specificity to achieve replication through damaged DNA. Misinsertions by low fidelity TLS polymerases may introduce additional mutations on undamaged DNA near the original lesion site, which we termed collateral mutations. In this study, we used whole genome sequencing datasets of chicken DT40 and several human cell lines to obtain evidence for collateral mutagenesis in higher eukaryotes. We found that cisplatin and UVC radiation frequently induce close mutation pairs within 25 base pairs that consist of an adduct-associated primary and a downstream collateral mutation, and genetically linked their formation to TLS activity involving PCNA ubiquitylation and polymerase κ. PCNA ubiquitylation was also indispensable for close mutation pairs observed amongst spontaneously arising base substitutions in cell lines with disrupted homologous recombination. Collateral mutation pairs were also found in melanoma genomes with evidence of UV exposure. We showed that collateral mutations frequently copy the upstream base, and extracted a base substitution signature that describes collateral mutagenesis in the presented dataset regardless of the primary mutagenic process. Using this mutation signature, we showed that collateral mutagenesis creates approximately 10–20% of non-paired substitutions as well, underscoring the importance of the process.
DNA base substitutions are the most common form of genomic mutations, formed both spontaneously and in response to environmental mutagens. One of the main mechanisms of base substitution mutagenesis is translesion synthesis, a process that relies on specialised DNA polymerases to replicate damaged DNA templates. In addition to incorrect base insertions at the site of lesions in the template, translesion polymerases may also generate ‘collateral’ mutations away from the lesion due to their lower accuracy in selecting the correct incoming nucleotide. In this study, we surveyed the whole genome sequence of experimental cell clones to examine the extent and genetic dependence of collateral mutagenesis in higher eukaryotes. Looking for close mutation pairs, we found that collateral mutations frequently occur near primary lesions generated by cisplatin or ultraviolet radiation in chicken and human cells, but are restricted to a short distance of approximately 25 base pairs. By analysing their sequence context, we showed that collateral mutations can also occur near correctly bypassed primary lesions and may be responsible for a considerable proportion of all base substitution mutations.
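The notion of a "close mutation pair" used in this abstract can be illustrated with a minimal sketch. It assumes that mutations are given as integer positions on a single chromosome of a single sample, and that a pair means two neighbouring mutations at most 25 bp apart. The function name `close_pairs` and the example coordinates are hypothetical, and the authors' actual pipeline is not described in the abstract and may differ.

```python
from typing import List, Tuple


def close_pairs(positions: List[int], max_dist: int = 25) -> List[Tuple[int, int]]:
    """Return neighbouring mutation pairs that lie at most max_dist bp apart.

    positions: mutation coordinates on one chromosome of one sample
    (an assumption made for this sketch).
    """
    ordered = sorted(positions)
    pairs = []
    # Compare each mutation with the next one along the chromosome.
    for left, right in zip(ordered, ordered[1:]):
        if 0 < right - left <= max_dist:
            pairs.append((left, right))
    return pairs


# Hypothetical coordinates: two close pairs and two isolated mutations.
example = [100, 112, 500, 5000, 5020, 9000]
print(close_pairs(example))
```

In such a pair, the abstract describes the upstream mutation as the adduct-associated primary and the downstream one as the collateral mutation; assigning those roles requires strand and lesion information that this sketch does not model.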
Affiliation(s)
- Ádám Póti
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Bernadett Szikriszt
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Dan Chen
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Dávid Szüts
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - * E-mail:
|
14
|
Bergeron LA, Besenbacher S, Turner T, Versoza CJ, Wang RJ, Price AL, Armstrong E, Riera M, Carlson J, Chen HY, Hahn MW, Harris K, Kleppe AS, López-Nandam EH, Moorjani P, Pfeifer SP, Tiley GP, Yoder AD, Zhang G, Schierup MH. The mutationathon highlights the importance of reaching standardization in estimates of pedigree-based germline mutation rates. eLife 2022; 11:73577. [PMID: 35018888 PMCID: PMC8830884 DOI: 10.7554/elife.73577] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/02/2021] [Accepted: 01/11/2022] [Indexed: 11/13/2022] Open
Abstract
In the past decade, several studies have estimated the human per-generation germline mutation rate using large pedigrees. More recently, estimates for various nonhuman species have been published. However, methodological differences among studies in detecting germline mutations and estimating mutation rates make direct comparisons difficult. Here, we describe the many different steps involved in estimating pedigree-based mutation rates, including sampling, sequencing, mapping, variant calling, filtering, and appropriately accounting for false-positive and false-negative rates. For each step, we review the different methods and parameter choices that have been used in the recent literature. Additionally, we present the results from a ‘Mutationathon,’ a competition organized among five research labs to compare germline mutation rate estimates for a single pedigree of rhesus macaques. We report almost a twofold variation in the final estimated rate among groups using different post-alignment processing, calling, and filtering criteria, and provide details into the sources of variation across studies. Though the difference among estimates is not statistically significant, this discrepancy emphasizes the need for standardized methods in mutation rate estimations and the difficulty in comparing rates from different studies. Finally, this work aims to provide guidelines for computational and statistical benchmarks for future studies interested in identifying germline mutations from pedigrees.
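The final step described above, turning a filtered list of candidate de novo mutations into a per-generation rate while accounting for false positives and false negatives, can be sketched as follows. The formula is a commonly used form for diploid trios, stated here as an assumption rather than quoted from the abstract: the corrected mutation count is divided by twice the number of callable sites. The function name `mutation_rate` and all input values are hypothetical.

```python
def mutation_rate(n_candidates: int, callable_sites: float,
                  fdr: float = 0.0, fnr: float = 0.0) -> float:
    """Per-site, per-generation germline mutation rate for one diploid trio.

    n_candidates   : candidate de novo mutations that passed filtering
    callable_sites : haploid genome positions where a mutation could be called
    fdr            : estimated false discovery rate among the candidates
    fnr            : estimated false-negative rate of the pipeline
    """
    if callable_sites <= 0:
        raise ValueError("callable_sites must be positive")
    if not 0.0 <= fdr <= 1.0 or not 0.0 <= fnr < 1.0:
        raise ValueError("fdr must be in [0, 1] and fnr in [0, 1)")
    # Factor 2: the offspring carries two haploid genomes, one from each parent.
    return n_candidates * (1.0 - fdr) / (2.0 * callable_sites * (1.0 - fnr))


# Hypothetical inputs chosen only to show the call.
print(mutation_rate(n_candidates=30, callable_sites=2.5e9, fdr=0.1, fnr=0.05))
```

Because the candidate count, the callable genome size, and both error rates all depend on pipeline choices, differences in any of them feed directly into the estimate, which is consistent with the variation among groups reported in the abstract.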
Affiliation(s)
- Lucie A Bergeron
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Søren Besenbacher
  - Department of Molecular Medicine (MOMA), Aarhus University, Aarhus N, Denmark
- Tychele Turner
  - Department of Genetics, Washington University in St. Louis, Saint Louis, United States
- Cyril J Versoza
  - Center for Evolution and Medicine, Arizona State University, Tempe, United States
- Richard J Wang
  - Department of Biology, Indiana University, Bloomington, United States
- Alivia Lee Price
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Ellie Armstrong
  - Department of Biology, Stanford University, Stanford, United States
- Meritxell Riera
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Jedidiah Carlson
  - Department of Genome Sciences, University of Washington, Seattle, United States
- Hwei-Yen Chen
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, United States
- Kelley Harris
  - Department of Genome Sciences, University of Washington, Seattle, United States
- Priya Moorjani
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Susanne P Pfeifer
  - School of Life Sciences, Arizona State University, Tempe, United States
- George P Tiley
  - Department of Biology, Duke University, Durham, United States
- Anne D Yoder
  - Department of Biology, Duke University, Durham, United States
- Guojie Zhang
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
15
|
Sepúlveda-Yáñez JH, Alvarez Saravia D, Pilzecker B, van Schouwenburg PA, van den Burg M, Veelken H, Navarrete MA, Jacobs H, Koning MT. Tandem Substitutions in Somatic Hypermutation. Front Immunol 2022; 12:807015. [PMID: 35069591 PMCID: PMC8781386 DOI: 10.3389/fimmu.2021.807015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Accepted: 12/16/2021] [Indexed: 11/13/2022] Open
Abstract
Upon antigen recognition, activation-induced cytosine deaminase (AID) initiates affinity maturation of the B-cell receptor by somatic hypermutation (SHM) through error-prone DNA repair pathways. SHM typically creates single nucleotide substitutions, but tandem substitutions may also occur. We investigated incidence and sequence context of tandem substitutions by massively parallel sequencing of V(D)J repertoires in healthy human donors. Mutation patterns were congruent with SHM-derived single nucleotide mutations, delineating initiation of the tandem substitution by AID. Tandem substitutions comprised 5.7% of AID-induced mutations. The majority of tandem substitutions represent single nucleotide juxtalocations of directly adjacent sequences. These observations were confirmed in an independent cohort of healthy donors. We propose a model where tandem substitutions are predominantly generated by translesion synthesis across an apyrimidinic site that is typically created by UNG. During replication, apyrimidinic sites transiently adopt an extruded configuration, causing skipping of the extruded base. Consequent strand decontraction leads to the juxtalocation, after which exonucleases repair the apyrimidinic site and any directly adjacent mismatched base pairs. The mismatch repair pathway appears to account for the remainder of tandem substitutions. Tandem substitutions may enhance affinity maturation and expedite the adaptive immune response by overcoming amino acid codon degeneracies or mutating two adjacent amino acid residues simultaneously.
Affiliation(s)
- Julieta H Sepúlveda-Yáñez
  - Department of Hematology, Leiden University Medical Center, Leiden, Netherlands
  - School of Medicine, University of Magallanes, Punta Arenas, Chile
- Bas Pilzecker
  - Department of Tumor Immunology, Radboud Institute for Molecular Life Sciences, Nijmegen, Netherlands
  - Division of Tumor Biology and Immunology, Netherlands Cancer Institute, Amsterdam, Netherlands
- Mirjam van den Burg
  - Department of Pediatrics, Leiden University Medical Center, Leiden, Netherlands
- Hendrik Veelken
  - Department of Hematology, Leiden University Medical Center, Leiden, Netherlands
- Heinz Jacobs
  - Division of Tumor Biology and Immunology, Netherlands Cancer Institute, Amsterdam, Netherlands
- Marvyn T Koning
  - Department of Hematology, Leiden University Medical Center, Leiden, Netherlands
|
16
|
Protein innovation through template switching in the Saccharomyces cerevisiae lineage. Sci Rep 2021; 11:22558. [PMID: 34799587 PMCID: PMC8604942 DOI: 10.1038/s41598-021-01736-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2021] [Accepted: 10/27/2021] [Indexed: 11/08/2022] Open
Abstract
DNA polymerase template switching between short, non-identical inverted repeats (IRs) is a genetic mechanism that leads to the homogenization of IR arms and to IR spacer inversion, which cause multinucleotide mutations (MNMs). It is unknown if and how template switching affects gene evolution. In this study, we performed a phylogenetic analysis to determine the effect of template switching between IR arms on coding DNA of Saccharomyces cerevisiae. To achieve this, perfect IRs that co-occurred with MNMs between a strain and its parental node were identified in S. cerevisiae strains. We determined that template switching introduced MNMs into 39 protein-coding genes through S. cerevisiae evolution, resulting in both arm homogenization and inversion of the IR spacer. These events in turn resulted in nonsynonymous substitutions and up to five neighboring amino acid replacements in a single gene. The study demonstrates that template switching is a powerful generator of multiple substitutions within codons. Additionally, some template switching events occurred more than once during S. cerevisiae evolution. Our findings suggest that template switching constitutes a general mutagenic mechanism that results in both nonsynonymous substitutions and parallel evolution, which are traditionally considered as evidence for positive selection, without the need for adaptive explanations.
|
17
|
Jiang P, Ollodart AR, Sudhesh V, Herr AJ, Dunham MJ, Harris K. A modified fluctuation assay reveals a natural mutator phenotype that drives mutation spectrum variation within Saccharomyces cerevisiae. eLife 2021; 10:68285. [PMID: 34523420 PMCID: PMC8497059 DOI: 10.7554/elife.68285] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/11/2021] [Accepted: 09/14/2021] [Indexed: 12/23/2022] Open
Abstract
Although studies of Saccharomyces cerevisiae have provided many insights into mutagenesis and DNA repair, most of this work has focused on a few laboratory strains. Much less is known about the phenotypic effects of natural variation within S. cerevisiae’s DNA repair pathways. Here, we use natural polymorphisms to detect historical mutation spectrum differences among several wild and domesticated S. cerevisiae strains. To determine whether these differences are likely caused by genetic mutation rate modifiers, we use a modified fluctuation assay with a CAN1 reporter to measure de novo mutation rates and spectra in 16 of the analyzed strains. We measure a 10-fold range of mutation rates and identify two strains with distinctive mutation spectra. These strains, known as AEQ and AAR, come from the panel’s ‘Mosaic beer’ clade and share an enrichment for C > A mutations that is also observed in rare variation segregating throughout the genomes of several Mosaic beer and Mixed origin strains. Both AEQ and AAR are haploid derivatives of the diploid natural isolate CBS 1782, whose rare polymorphisms are enriched for C > A as well, suggesting that the underlying mutator allele is likely active in nature. We use a plasmid complementation test to show that AAR and AEQ share a mutator allele in the DNA repair gene OGG1, which excises 8-oxoguanine lesions that can cause C > A mutations if left unrepaired.
Affiliation(s)
- Pengyao Jiang
  - Department of Genome Sciences, University of Washington, Seattle, United States
- Anja R Ollodart
  - Department of Genome Sciences, University of Washington, Seattle, United States
  - Molecular and Cellular Biology Program, University of Washington, Seattle, United States
- Vidha Sudhesh
  - Department of Genome Sciences, University of Washington, Seattle, United States
- Alan J Herr
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, United States
- Maitreya J Dunham
  - Department of Genome Sciences, University of Washington, Seattle, United States
- Kelley Harris
  - Department of Genome Sciences, University of Washington, Seattle, United States
  - Department of Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, United States
|
18
|
Kritioti E, Theodosiou A, Parpaite T, Alexandrou A, Nicolaou N, Papaevripidou I, Séjourné N, Coste B, Christophidou-Anastasiadou V, Tanteles GA, Sismani C. Unravelling the genetic causes of multiple malformation syndromes: A whole exome sequencing study of the Cypriot population. PLoS One 2021; 16:e0253562. [PMID: 34324503 PMCID: PMC8320927 DOI: 10.1371/journal.pone.0253562] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2021] [Accepted: 06/08/2021] [Indexed: 11/19/2022] Open
Abstract
Multiple malformation syndromes (MMS) belong to a group of genetic disorders characterised by neurodevelopmental anomalies and congenital malformations. Here we explore for the first time the genetic aetiology of MMS using whole-exome sequencing (WES) in undiagnosed patients from the Greek-Cypriot population after prior extensive diagnostic workup, including karyotype and array-CGH. A total of 100 individuals (37 affected) from 32 families were recruited, and family-based WES was applied to detect causative single-nucleotide variants (SNVs) and indels. A genetic diagnosis was reported for 16 MMS patients (43.2%), with 10/17 (58.8%) of the findings being novel. All autosomal dominant findings occurred de novo. Functional studies were also performed to elucidate the molecular mechanism relevant to the abnormal phenotypes, in cases where the clinical significance of the findings was unclear. The 17 variants identified in our cohort were located in 14 genes (PCNT, UBE3A, KAT6A, SPR, POMGNT1, PIEZO2, PXDN, KDM6A, PHIP, HECW2, TFAP2A, CNOT3, AGTPBP1 and GAMT). This study has highlighted the efficacy of WES through the high detection rate (43.2%) achieved for a challenging category of undiagnosed patients with MMS compared to other conventional diagnostic testing methods (10-20% for array-CGH and ~3% for G-banding karyotype analysis). As a result, family-based WES could potentially be considered as a first-tier, cost-effective diagnostic test for patients with MMS that facilitates better patient management and prognosis and offers accurate recurrence risks to the families.
Affiliation(s)
- Evie Kritioti
  - Department of Cytogenetics and Genomics, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
  - Clinical Genetics Clinic, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
  - The Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Athina Theodosiou
  - Department of Cytogenetics and Genomics, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
  - The Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Angelos Alexandrou
  - Department of Cytogenetics and Genomics, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Nayia Nicolaou
  - Clinical Genetics Clinic, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Ioannis Papaevripidou
  - Department of Cytogenetics and Genomics, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Nina Séjourné
  - Aix Marseille Université, CNRS, LNC-UMR 7291, Marseille, France
- Bertrand Coste
  - Aix Marseille Université, CNRS, LNC-UMR 7291, Marseille, France
- George A. Tanteles
  - Clinical Genetics Clinic, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
  - The Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Carolina Sismani
  - Department of Cytogenetics and Genomics, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
  - The Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
|
19
|
Bohutínská M, Handrick V, Yant L, Schmickl R, Kolář F, Bomblies K, Paajanen P. De Novo Mutation and Rapid Protein (Co-)evolution during Meiotic Adaptation in Arabidopsis arenosa. Mol Biol Evol 2021; 38:1980-1994. [PMID: 33502506 PMCID: PMC8097281 DOI: 10.1093/molbev/msab001] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 02/07/2023] Open
Abstract
A sudden shift in environment or cellular context necessitates rapid adaptation. A dramatic example is genome duplication, which leads to polyploidy. In such situations, the waiting time for new mutations might be prohibitive; theoretical and empirical studies suggest that rapid adaptation will largely rely on standing variation already present in source populations. Here, we investigate the evolution of meiosis proteins in Arabidopsis arenosa, some of which were previously implicated in adaptation to polyploidy and, in a diploid, to habitat. A striking and unexplained feature of prior results was the large number of amino acid changes in multiple interacting proteins, especially in the relatively young tetraploid. Here, we investigate whether selection on meiosis genes is found in other lineages, how the polyploid may have accumulated so many differences, and whether derived variants were selected from standing variation. We use a range-wide sample of 145 resequenced genomes of diploid and tetraploid A. arenosa, with new genome assemblies. We confirmed signals of positive selection in the polyploid and diploid lineages they were previously reported in and find additional meiosis genes with evidence of selection. We show that the polyploid lineage stands out both qualitatively and quantitatively. Compared with diploids, meiosis proteins in the polyploid have more amino acid changes and a higher proportion affecting more strongly conserved sites. We find evidence that in tetraploids, positive selection may have commonly acted on de novo mutations. Several tests provide hints that coevolution, and in some cases, multinucleotide mutations, might contribute to rapid accumulation of changes in meiotic proteins.
Affiliation(s)
- Magdalena Bohutínská
  - Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic
  - Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic
- Vinzenz Handrick
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- Levi Yant
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- Roswitha Schmickl
  - Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic
  - Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic
- Filip Kolář
  - Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic
  - Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic
  - Department of Botany, University of Innsbruck, Innsbruck, Austria
- Kirsten Bomblies
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
  - Plant Evolutionary Genetics, Department of Biology, Institute of Molecular Plant Biology, ETH Zürich, Zurich, Switzerland
- Pirita Paajanen
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
|
20
|
Norn C, André I, Theobald DL. A thermodynamic model of protein structure evolution explains empirical amino acid substitution matrices. Protein Sci 2021; 30:2057-2068. [PMID: 34218472 PMCID: PMC8442976 DOI: 10.1002/pro.4155] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/17/2021] [Revised: 06/25/2021] [Accepted: 06/29/2021] [Indexed: 12/30/2022]
Abstract
Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. These evolutionary pressures are sufficiently consistent over time and across protein families to produce substitution patterns, summarized in global amino acid substitution matrices such as BLOSUM, JTT, WAG, and LG, which can be used to successfully detect homologs, infer phylogenies, and reconstruct ancestral sequences. Although the factors that govern the variation of amino acid substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid substitution matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi‐nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex yet universal pattern observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary driver behind the global amino acid substitution patterns observed in proteins throughout the tree of life.
Affiliation(s)
- Christoffer Norn
  - Biochemistry and Structural Biology, Lund University, Lund, Sweden
- Ingemar André
  - Biochemistry and Structural Biology, Lund University, Lund, Sweden
- Douglas L Theobald
  - Biochemistry Department, Brandeis University, Waltham, Massachusetts, USA
|
21
|
Moreira A, Croze M, Delehelle F, Cussat-Blanc S, Luga H, Mollereau C, Balaresque P. Hearing Sensitivity of Primates: Recurrent and Episodic Positive Selection in Hair Cells and Stereocilia Protein-Coding Genes. Genome Biol Evol 2021; 13:6302699. [PMID: 34137817 PMCID: PMC8358225 DOI: 10.1093/gbe/evab133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 06/06/2021] [Indexed: 12/29/2022] Open
Abstract
The large spectrum of hearing sensitivity observed in primates results from the impact of environmental and behavioral pressures to optimize sound perception and localization. Although evidence of positive selection in auditory genes has been detected in mammals including in Hominoids, selection has never been investigated in other primates. We analyzed 123 genes highly expressed in the inner ear of 27 primate species and tested to what extent positive selection may have shaped these genes in the order Primates tree. We combined both site and branch-site tests to obtain a comprehensive picture of the positively selected genes (PSGs) involved in hearing sensitivity, and drew a detailed description of the most affected branches in the tree. We chose a conservative approach, and thus focused on confounding factors potentially affecting PSG signals (alignment, GC-biased gene conversion, duplications, heterogeneous sequencing qualities). Using site tests, we showed that around 12% of these genes are PSGs, an α selection value consistent with average human genome estimates (10-15%). Using branch-site tests, we showed that the primate tree is heterogeneously affected by positive selection, with the black snub-nosed monkey, the bushbaby, and the orangutan, being the most impacted branches. A large proportion of these genes is inclined to shape hair cells and stereocilia, which are involved in the mechanotransduction process, known to influence frequency perception. Adaptive selection, and more specifically recurrent adaptive evolution, could have acted in parallel on a set of genes (ADGRV1, USH2A, PCDH15, PTPRQ, and ATP8A2) involved in stereocilia growth and the whole complex of bundle links connecting them, in species across different habitats, including high altitude and nocturnal environments.
Affiliation(s)
- Andreia Moreira
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
  - Institut de Recherche en Informatique de Toulouse (IRIT), CNRS UMR5505, Université Toulouse III Paul Sabatier, France
- Myriam Croze
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
- Franklin Delehelle
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
  - Institut de Recherche en Informatique de Toulouse (IRIT), CNRS UMR5505, Université Toulouse III Paul Sabatier, France
- Sylvain Cussat-Blanc
  - Institut de Recherche en Informatique de Toulouse (IRIT), CNRS UMR5505, Université Toulouse III Paul Sabatier, France
- Hervé Luga
  - Institut de Recherche en Informatique de Toulouse (IRIT), CNRS UMR5505, Université Toulouse III Paul Sabatier, France
- Catherine Mollereau
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
- Patricia Balaresque
  - Anthropologie Moléculaire et Imagerie de Synthèse (AMIS), Faculté de Médecine Purpan, CNRS UMR5288, Université de Toulouse, Université Toulouse III Paul Sabatier, France
|
22
|
Sandler G, Wright SI, Agrawal AF. Patterns and Causes of Signed Linkage Disequilibria in Flies and Plants. Mol Biol Evol 2021; 38:4310-4321. [PMID: 34097067 PMCID: PMC8476167 DOI: 10.1093/molbev/msab169] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/12/2022] Open
Abstract
Most empirical studies of linkage disequilibrium (LD) study its magnitude, ignoring its sign. Here, we examine patterns of signed LD in two population genomic data sets, one from Capsella grandiflora and one from Drosophila melanogaster. We consider how processes such as drift, admixture, Hill–Robertson interference, and epistasis may contribute to these patterns. We report that most types of mutations exhibit positive LD, particularly if they are predicted to be less deleterious. We show with simulations that this pattern arises easily in a model of admixture or distance-biased mating, and that genome-wide differences across site types are generally expected due to differences in the strength of purifying selection even in the absence of epistasis. We further explore how signed LD decays on a finer scale, showing that loss-of-function mutations exhibit particularly positive LD across short distances, a pattern consistent with intragenic antagonistic epistasis. Controlling for genomic distance, signed LD in C. grandiflora decays faster within genes, compared with between genes, likely a by-product of frequent recombination in gene promoters known to occur in plant genomes. Finally, we use information from published biological networks to explore whether there is evidence for negative synergistic epistasis between interacting radical missense mutations. In D. melanogaster networks, we find a modest but significant enrichment of negative LD, consistent with the possibility of intranetwork negative synergistic epistasis.
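Signed LD between two biallelic sites is conventionally written D = p_AB - p_A * p_B, where A and B are the alleles of interest at each site (for example the derived or rarer alleles). A positive value means those alleles occur together on the same haplotype more often than expected under independence. The minimal sketch below computes D from phased haplotypes coded as 0/1. The coding scheme, the function name `signed_d`, and the toy data are assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def signed_d(site_a: Sequence[int], site_b: Sequence[int]) -> float:
    """Signed linkage disequilibrium D between two biallelic sites.

    Each sequence holds one entry per phased haplotype:
    1 = allele of interest (e.g. derived), 0 = the other allele.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n                                   # frequency of allele A
    p_b = sum(site_b) / n                                   # frequency of allele B
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n   # frequency of the AB haplotype
    return p_ab - p_a * p_b


# Toy haplotypes: alleles always together (coupling) versus never together (repulsion).
print(signed_d([1, 1, 0, 0], [1, 1, 0, 0]))   # positive D
print(signed_d([1, 1, 0, 0], [0, 0, 1, 1]))   # negative D
```

Studies that report only r² or |D| discard exactly the sign that distinguishes these two toy cases.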
Collapse
Affiliation(s)
- George Sandler
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks Street, Toronto, ON M5S 3B2, Canada
| | - Stephen I Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks Street, Toronto, ON M5S 3B2, Canada.,Center for Analysis of Genome Evolution and Function, University of Toronto, 25 Willcocks Street, Toronto, ON M5S 3B2, Canada
| | - Aneil F Agrawal
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks Street, Toronto, ON M5S 3B2, Canada; Center for Analysis of Genome Evolution and Function, University of Toronto, 25 Willcocks Street, Toronto, ON M5S 3B2, Canada
|
23
|
Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes. PLoS One 2021; 16:e0248337. [PMID: 33711070 PMCID: PMC7954308 DOI: 10.1371/journal.pone.0248337] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/08/2020] [Accepted: 02/24/2021] [Indexed: 01/03/2023] Open
Abstract
Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, such as alignment errors. Codon models continue to posit that only single-nucleotide changes have non-zero rates. Here, we develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using over 42,000 empirical alignments, we find widespread statistical support for multiple hits: 61% of alignments prefer models with 2H allowed, and 23% prefer models with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misspecification or alignment errors. Further modeling reveals that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios, may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package and in the Datamonkey.org server.
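The 1H/2H/3H hierarchy can be illustrated by counting how many nucleotide positions differ between two codons and checking whether a given model class would assign that exchange a non-zero rate. This is a minimal sketch of the classification only, not the HyPhy implementation; the function names and the `MAX_HITS` table are hypothetical.

```python
# Maximum number of simultaneous nucleotide changes each model class permits.
MAX_HITS = {"1H": 1, "2H": 2, "3H": 3}


def n_hits(codon_a, codon_b):
    """Count the nucleotide positions at which two codons differ."""
    if len(codon_a) != 3 or len(codon_b) != 3:
        raise ValueError("codons must have exactly three nucleotides")
    return sum(a != b for a, b in zip(codon_a, codon_b))


def has_nonzero_rate(model, codon_a, codon_b):
    """True if `model` allows an instantaneous codon_a -> codon_b change."""
    return 1 <= n_hits(codon_a, codon_b) <= MAX_HITS[model]


# Serine is encoded by two codon "islands" (TCN and AGT/AGC), so a
# synonymous jump between them needs two or three nucleotide changes.
print(n_hits("AGC", "TCC"))                   # two positions differ
print(n_hits("AGT", "TCA"))                   # three positions differ
print(has_nonzero_rate("1H", "AGC", "TCC"))   # disallowed under 1H
print(has_nonzero_rate("3H", "AGT", "TCA"))   # allowed under 3H
```

Under a 1H model, a serine island jump can only be explained as a chain of single-nucleotide steps through non-serine codons, which is one reason short branches are informative about multiple hits.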
|
24
|
Walker CR, Scally A, De Maio N, Goldman N. Short-range template switching in great ape genomes explored using pair hidden Markov models. PLoS Genet 2021; 17:e1009221. [PMID: 33651813 PMCID: PMC7954356 DOI: 10.1371/journal.pgen.1009221] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/12/2020] [Revised: 03/12/2021] [Accepted: 02/10/2021] [Indexed: 12/14/2022] Open
Abstract
Many complex genomic rearrangements arise through template switch errors, which occur in DNA replication when there is a transient polymerase switch to an alternate template nearby in three-dimensional space. While typically investigated at kilobase-to-megabase scales, the genomic and evolutionary consequences of this mutational process are not well characterised at smaller scales, where they are often interpreted as clusters of independent substitutions, insertions and deletions. Here we present an improved statistical approach using pair hidden Markov models, and use it to detect and describe short-range template switches underlying clusters of mutations in the multi-way alignment of hominid genomes. Using robust statistics derived from evolutionary genomic simulations, we show that template switch events have been widespread in the evolution of the great apes’ genomes and provide a parsimonious explanation for the presence of many complex mutation clusters in their phylogenetic context. Larger-scale mechanisms of genome rearrangement are typically associated with structural features around breakpoints, and accordingly we show that atypical patterns of secondary structure formation and DNA bending are present at the initial template switch loci. Our methods improve on previous non-probabilistic approaches for computational detection of template switch mutations, allowing the statistical significance of events to be assessed. By specifying realistic evolutionary parameters based on the genomes and taxa involved, our methods can be readily adapted to other intra- or inter-species comparisons.
DNA replication is an imperfect process which causes the mutations that give rise to genetic diversity during the evolution of genomes. While many mutations are independent, single-nucleotide substitutions or small insertions and deletions, some mutations arise as nonindependent clusters of substitutions and larger scale chromosomal rearrangements. Large-scale rearrangements (also called structural variants) in particular can have a profound impact on genome evolution and contribute to both germline and somatic disease in humans. The replication-based mechanisms underlying structural variation typically involve a polymerase switch event in which a large segment of DNA is copied using a template from an alternate location in the genome. Methods for identifying these template switch mutations lack the power to detect smaller scale rearrangements which can arise through the same replication-based pathways. Here we outline a model which can detect and assess the statistical significance of such small-scale template switches within their evolutionary context. We show that these events are widespread in the evolution of great apes and that the genomic features associated with these small-scale rearrangements are similar to those of large-scale structural variants.
Affiliation(s)
- Conor R. Walker
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
| | - Aylwyn Scally
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
| | - Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
| | - Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
|
25
|
Jones CT, Youssef N, Susko E, Bielawski JP. A Phenotype-Genotype Codon Model for Detecting Adaptive Evolution. Syst Biol 2021; 69:722-738. [PMID: 31730199 DOI: 10.1093/sysbio/syz075] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/07/2019] [Revised: 11/09/2019] [Accepted: 11/11/2019] [Indexed: 01/03/2023] Open
Abstract
A central objective in biology is to link adaptive evolution in a gene to structural and/or functional phenotypic novelties. Yet most analytic methods make inferences mainly from either phenotypic data or genetic data alone. A small number of models have been developed to infer correlations between the rate of molecular evolution and changes in a discrete or continuous life history trait. But such correlations are not necessarily evidence of adaptation. Here, we present a novel approach called the phenotype-genotype branch-site model (PG-BSM) designed to detect evidence of adaptive codon evolution associated with discrete-state phenotype evolution. An episode of adaptation is inferred under standard codon substitution models when there is evidence of positive selection in the form of an elevation in the nonsynonymous-to-synonymous rate ratio $\omega$ to a value $\omega > 1$. As it is becoming increasingly clear that $\omega > 1$ can occur without adaptation, the PG-BSM was formulated to infer an instance of adaptive evolution without appealing to evidence of positive selection. The null model makes use of a covarion-like component to account for general heterotachy (i.e., random changes in the evolutionary rate at a site over time). The alternative model employs samples of the phenotypic evolutionary history to test for phenomenological patterns of heterotachy consistent with specific mechanisms of molecular adaptation. These include 1) a persistent increase/decrease in $\omega$ at a site following a change in phenotype (the pattern) consistent with an increase/decrease in the functional importance of the site (the mechanism); and 2) a transient increase in $\omega$ at a site along a branch over which the phenotype changed (the pattern) consistent with a change in the site's optimal amino acid (the mechanism). 
Rejection of the null is followed by post hoc analyses to identify sites with strongest evidence for adaptation in association with changes in the phenotype as well as the most likely evolutionary history of the phenotype. Simulation studies based on a novel method for generating mechanistically realistic signatures of molecular adaptation show that the PG-BSM has good statistical properties. Analyses of real alignments show that site patterns identified post hoc are consistent with the specific mechanisms of adaptation included in the alternate model. Further simulation studies show that the covarion-like component of the PG-BSM plays a crucial role in mitigating recently discovered statistical pathologies associated with confounding by accounting for heterotachy-by-any-cause. [Adaptive evolution; branch-site model; confounding; mutation-selection; phenotype-genotype.].
Affiliation(s)
- Christopher T Jones
- Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
| | - Noor Youssef
- Department of Biology, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
| | - Edward Susko
- Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada; Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
| | - Joseph P Bielawski
- Department of Mathematics and Statistics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada; Department of Biology, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada; Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, 1233 LeMarchant Street, B3H 4R2, Halifax, Nova Scotia, Canada
|
26
|
Cohen ZP, Brevik K, Chen YH, Hawthorne DJ, Weibel BD, Schoville SD. Elevated rates of positive selection drive the evolution of pestiferousness in the Colorado potato beetle (Leptinotarsa decemlineata, Say). Mol Ecol 2020; 30:237-254. [PMID: 33095936 DOI: 10.1111/mec.15703] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/16/2019] [Revised: 09/28/2020] [Accepted: 10/15/2020] [Indexed: 12/16/2022]
Abstract
Contextualizing evolutionary history and identifying genomic features of an insect that might contribute to its pest status is important in developing early detection and control tactics. In order to understand the evolution of pestiferousness, which we define as the accumulation of traits that contribute to an insect population's success in an agroecosystem, we tested the importance of known genomic properties associated with rapid adaptation in the Colorado potato beetle (CPB), Leptinotarsa decemlineata Say. Within the leaf beetle genus Leptinotarsa, only CPB, and a few populations therein, has risen to pest status on cultivated nightshades, Solanum. Using whole genomes from ten closely related Leptinotarsa species native to the United States, we reconstructed a high-quality species tree and used this phylogenetic framework to assess evolutionary patterns in four genomic features of rapid adaptation: standing genetic variation, gene family expansion and contraction, transposable element abundance and location, and positive selection at protein-coding genes. Throughout approximately 20 million years of history, Leptinotarsa species show little evidence of gene family turnover and transposable element variation. However, there is a clear pattern of CPB experiencing higher rates of positive selection on protein-coding genes. We determine that these rates are associated with greater standing genetic variation due to larger effective population size, which supports the theory that the demographic history contributes to rates of protein evolution. Furthermore, we identify a suite of coding genes under positive selection that are putatively associated with pestiferousness in the Colorado potato beetle lineage. They are involved in the biological processes of xenobiotic detoxification, chemosensation and hormone function.
Affiliation(s)
- Zachary P Cohen
- Department of Entomology, University of Wisconsin-Madison, Madison, WI, USA
| | - Kristian Brevik
- Department of Plant and Soil Sciences, University of Vermont, Burlington, VT, USA
| | - Yolanda H Chen
- Department of Plant and Soil Sciences, University of Vermont, Burlington, VT, USA
| | - David J Hawthorne
- Department of Entomology, University of Maryland, College Park, MD, USA
| | - Benjamin D Weibel
- Department of Entomology, University of Wisconsin-Madison, Madison, WI, USA
| | - Sean D Schoville
- Department of Entomology, University of Wisconsin-Madison, Madison, WI, USA
|
27
|
Sackton TB. Studying Natural Selection in the Era of Ubiquitous Genomes. Trends Genet 2020; 36:792-803. [DOI: 10.1016/j.tig.2020.07.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/24/2020] [Revised: 07/10/2020] [Accepted: 07/13/2020] [Indexed: 01/15/2023]
|
28
|
Comparative Analysis of Sequence Polymorphism in Complete Organelle Genomes of the ‘Golden Tide’ Seaweed Sargassum horneri between Korean and Chinese Forms. Sustainability 2020. [DOI: 10.3390/su12187280] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
Abstract
Drifting and inundating biomass of the brown seaweed Sargassum horneri is called a “golden tide”, as it forms massive golden algal blooms analogous to green tides. This phenomenon occurs globally, and its serious ecological impacts on coastal ecosystems have recently begun to attract attention. In the present study, by sequencing whole organelle genomes of Korean indigenous S. horneri, we aimed to develop novel molecular markers that can be used for differentiating indigenous from nonindigenous individuals. To this end, we analyzed sequence polymorphisms in mitochondrial (mt) and chloroplast (cp) genomes of two Korean benthic samples in comparison to Chinese ones as a reference. We mapped mt genomes of 34,620–34,628 bp and cp genomes of 123,982–124,053 bp for the Korean samples. In comparative analyses, the mtDNA cytochrome c oxidase subunit II (cox2) gene showed the highest number of single nucleotide polymorphisms (SNPs) between Korean and Chinese individuals. The NADH dehydrogenase subunit 7 (Nad7)-proline tRNA (trnP) intergenic spacer (IGS) in the mt genome showed a 14 bp insertion or deletion (indel) mutation. For the cp genome, we found a total of 54 SNPs, but its overall evolution rate was approximately four-fold lower than that of the mt genome. Interestingly, analysis of the Ka/Ks ratio in the cp genome revealed a signature of positive selection on several genes, whereas only negative selection was prevalent in the mt genome. The ‘candidate’ genetic markers that we found can be applied to discriminate between Korean indigenous and nonindigenous individuals. This study will assist in developing a molecular-based early detection method for effectively managing nonindigenous S. horneri in Korean waters.
|
29
|
Low Base-Substitution Mutation Rate but High Rate of Slippage Mutations in the Sequence Repeat-Rich Genome of Dictyostelium discoideum. G3: Genes, Genomes, Genetics 2020; 10:3445-3452. [PMID: 32732307 PMCID: PMC7466956 DOI: 10.1534/g3.120.401578] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/23/2022]
Abstract
We describe the rate and spectrum of spontaneous mutations for the social amoeba Dictyostelium discoideum, a key model organism in molecular, cellular, evolutionary and developmental biology. Whole-genome sequencing of 37 mutation accumulation lines of D. discoideum after an average of 1,500 cell divisions yields a base-substitution mutation rate of 2.47 × 10⁻¹¹ per site per generation, substantially lower than that of most eukaryotic and prokaryotic organisms, and of the same order of magnitude as in the ciliates Paramecium tetraurelia and Tetrahymena thermophila. D. discoideum is known for its high genomic AT content and abundance of simple sequence repeats, and we observe that its base-substitution mutations are highly A/T biased. This bias likely contributes both to the high genomic AT content and to the formation of simple sequence repeats in the AT-rich genome of Dictyostelium discoideum. In contrast to the situation in other surveyed unicellular eukaryotes, indel rates far exceed the base-substitution mutation rate in this organism, with a high proportion of 3n indels, particularly in regions without simple sequence repeats. Like ciliates, D. discoideum has a large effective population size, which reduces the power of random genetic drift and magnifies the effect of selection on replication fidelity, in principle allowing D. discoideum to evolve an extremely low base-substitution mutation rate.
|
30
|
Estimation of the Genome-Wide Mutation Rate and Spectrum in the Archaeal Species Haloferax volcanii. Genetics 2020; 215:1107-1116. [PMID: 32513815 DOI: 10.1534/genetics.120.303299] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/08/2018] [Accepted: 05/26/2020] [Indexed: 12/26/2022] Open
Abstract
Organisms adapted to life in extreme habitats (extremophiles) can further our understanding of the mechanisms of genetic stability, particularly replication and repair. Despite the harsh environmental conditions they endure, these extremophiles represent a great deal of the Earth's biodiversity. Here, for the first time in a member of the archaeal domain, we report a genome-wide assay of spontaneous mutations in the halophilic species Haloferax volcanii using a direct and unbiased method: mutation accumulation experiments combined with deep whole-genome sequencing. H. volcanii is a key model organism not only for the study of halophilicity, but also for archaeal biology in general. Our methods measure the genome-wide rate, spectrum, and spatial distribution of spontaneous mutations. The estimated base substitution rate of 3.15 × 10⁻¹⁰ per site per generation, or 0.0012 per genome per generation, is similar to the value found in mesophilic prokaryotes (optimal growth at ∼20–45 °C). This study contributes to a comprehensive phylogenetic view of how evolutionary forces and molecular mechanisms shape the rate and molecular spectrum of mutations across the tree of life.
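The two rates quoted above are linked by the number of sites analyzed: the per-genome rate is the per-site rate multiplied by the number of sites. As a quick consistency check (a back-calculation that the abstract does not report, with variable names chosen here for illustration), the implied number of sites can be recovered as follows.

```python
# Rates as quoted in the abstract.
per_site_rate = 3.15e-10    # base substitutions per site per generation
per_genome_rate = 0.0012    # base substitutions per genome per generation

# per_genome_rate = per_site_rate * n_sites, so solve for n_sites.
implied_sites = per_genome_rate / per_site_rate

print(f"implied sites: {implied_sites / 1e6:.2f} Mb")  # about 3.81 Mb
```

An implied size of roughly 3.8 Mb is the order of magnitude expected for a prokaryotic genome, so the two reported figures are mutually consistent.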
|
31
|
Wang Q, Pierce-Hoffman E, Cummings BB, Alföldi J, Francioli LC, Gauthier LD, Hill AJ, O'Donnell-Luria AH, Karczewski KJ, MacArthur DG. Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes. Nat Commun 2020; 11:2539. [PMID: 32461613 PMCID: PMC7253413 DOI: 10.1038/s41467-019-12438-5] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 04/02/2019] [Accepted: 09/09/2019] [Indexed: 12/31/2022] Open
Abstract
Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp distance of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms - CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions - on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs.
Affiliation(s)
- Qingbo Wang
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, 02115, USA
| | - Emma Pierce-Hoffman
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Beryl B Cummings
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Program in Biomedical and Biological Sciences, Harvard Medical School, Boston, MA, 02115, USA
| | - Jessica Alföldi
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Laurent C Francioli
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Laura D Gauthier
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Andrew J Hill
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
| | - Anne H O'Donnell-Luria
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Konrad J Karczewski
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Daniel G MacArthur
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA.
- Centre for Population Genomics, Garvan Institute of Medical Research, and UNSW Sydney, Sydney, Australia.
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Australia.
|
32
|
The Tempo and Mode of Angiosperm Mitochondrial Genome Divergence Inferred from Intraspecific Variation in Arabidopsis thaliana. G3: Genes, Genomes, Genetics 2020; 10:1077-1086. [PMID: 31964685 PMCID: PMC7056966 DOI: 10.1534/g3.119.401023] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/23/2022]
Abstract
The mechanisms of sequence divergence in angiosperm mitochondrial genomes have long been enigmatic. In particular, it is difficult to reconcile the rapid divergence of intergenic regions that can make non-coding sequences almost unrecognizable even among close relatives with the unusually high levels of sequence conservation found in genic regions. It has been hypothesized that different mutation and repair mechanisms act on genic and intergenic sequences or alternatively that mutational input is relatively constant but that selection has strikingly different effects on these respective regions. To test these alternative possibilities, we analyzed mtDNA divergence within Arabidopsis thaliana, including variants from the 1001 Genomes Project and changes accrued in published mutation accumulation (MA) lines. We found that base-substitution frequencies are relatively similar for intergenic regions and synonymous sites in coding regions, whereas indel and nonsynonymous substitutions rates are greatly depressed in coding regions, supporting a conventional model in which mutation/repair mechanisms are consistent throughout the genome but differentially filtered by selection. Most types of sequence and structural changes were undetectable in 10-generation MA lines, but we found significant shifts in relative copy number across mtDNA regions for lines grown under stressed vs. benign conditions. We confirmed quantitative variation in copy number across the A. thaliana mitogenome using both whole-genome sequencing and droplet digital PCR, further undermining the classic but oversimplified model of a circular angiosperm mtDNA structure. Our results suggest that copy number variation is one of the most fluid features of angiosperm mitochondrial genomes.
|
33
|
Mugal CF, Kutschera VE, Botero-Castro F, Wolf JBW, Kaj I. Polymorphism Data Assist Estimation of the Nonsynonymous over Synonymous Fixation Rate Ratio ω for Closely Related Species. Mol Biol Evol 2020; 37:260-279. [PMID: 31504782 PMCID: PMC6984366 DOI: 10.1093/molbev/msz203] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/24/2022] Open
Abstract
The ratio of nonsynonymous over synonymous sequence divergence, dN/dS, is a widely used estimate of the nonsynonymous over synonymous fixation rate ratio ω, which measures the extent to which natural selection modulates protein sequence evolution. Its computation is based on a phylogenetic approach and computes sequence divergence of protein-coding DNA between species, traditionally using a single representative DNA sequence per species. This approach ignores the presence of polymorphisms and relies on the indirect assumption that new mutations fix instantaneously, an assumption which is generally violated and reasonable only for distantly related species. The violation of the underlying assumption leads to a time-dependence of sequence divergence, and biased estimates of ω in particular for closely related species, where the contribution of ancestral and lineage-specific polymorphisms to sequence divergence is substantial. We here use a time-dependent Poisson random field model to derive an analytical expression of dN/dS as a function of divergence time and sample size. We then extend our framework to the estimation of the proportion of adaptive protein evolution α. This mathematical treatment enables us to show that the joint usage of polymorphism and divergence data can assist the inference of selection for closely related species. Moreover, our analytical results provide the basis for a protocol for the estimation of ω and α for closely related species. We illustrate the performance of this protocol by studying a population data set of four corvid species, which involves the estimation of ω and α at different time-scales and for several choices of sample sizes.
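To make the quantities concrete, the sketch below computes the plain divergence ratio dN/dS and a widely used McDonald-Kreitman-style estimate of α from divergence and polymorphism counts. This is the classical static estimator, shown only for orientation; it is not the time-dependent Poisson random field expression derived in the paper, and the function names and example counts are hypothetical.

```python
def omega_from_divergence(dn, ds):
    """Plain ratio of nonsynonymous to synonymous divergence (dN/dS)."""
    if ds <= 0:
        raise ValueError("synonymous divergence must be positive")
    return dn / ds


def alpha_mk(Dn, Ds, Pn, Ps):
    """McDonald-Kreitman-style estimate: alpha = 1 - (Ds * Pn) / (Dn * Ps).

    Dn, Ds: counts of nonsynonymous / synonymous fixed differences.
    Pn, Ps: counts of nonsynonymous / synonymous polymorphisms.
    """
    if Dn <= 0 or Ps <= 0:
        raise ValueError("Dn and Ps must be positive")
    return 1.0 - (Ds * Pn) / (Dn * Ps)


# Hypothetical per-site divergences and raw counts, for illustration only.
print(omega_from_divergence(0.02, 0.10))
print(alpha_mk(Dn=40, Ds=100, Pn=20, Ps=100))
```

For closely related species, segregating polymorphisms inflate the apparent fixed differences, which is the bias that motivates a time-dependent treatment.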
Affiliation(s)
- Carina F Mugal
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
| | - Verena E Kutschera
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden; Science for Life Laboratory, Stockholm University, Stockholm, Sweden; Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
| | - Fidel Botero-Castro
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
| | - Jochen B W Wolf
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden; Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
| | - Ingemar Kaj
- Department of Mathematics, Uppsala University, Uppsala, Sweden
|
34
|
Belinky F, Sela I, Rogozin IB, Koonin EV. Crossing fitness valleys via double substitutions within codons. BMC Biol 2019; 17:105. [PMID: 31842858 PMCID: PMC6916188 DOI: 10.1186/s12915-019-0727-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/26/2019] [Accepted: 11/20/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Single nucleotide substitutions in protein-coding genes can be divided into synonymous (S), with little fitness effect, and non-synonymous (N) ones that alter amino acids and thus generally have a greater effect. Most of the N substitutions are affected by purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases potentially could alleviate the deleterious effect of single substitutions, making them subject to positive selection. To elucidate the effects of selection on double substitutions in all codons, it is critical to differentiate selection from mutational biases. RESULTS We addressed the evolutionary regimes of within-codon double substitutions in 37 groups of closely related prokaryotic genomes from diverse phyla by comparing the fractions of double substitutions within codons to those of the equivalent double S substitutions in adjacent codons. Under the assumption that substitutions occur one at a time, all within-codon double substitutions can be represented as "ancestral-intermediate-final" sequences (where "intermediate" refers to the first single substitution and "final" refers to the second substitution) and can be partitioned into four classes: (1) SS, S intermediate-S final; (2) SN, S intermediate-N final; (3) NS, N intermediate-S final; and (4) NN, N intermediate-N final. We found that the selective pressure on the second substitution markedly differs among these classes of double substitutions. Analogous to single S (synonymous) substitutions, SS double substitutions evolve neutrally, whereas analogous to single N (non-synonymous) substitutions, SN double substitutions are subject to purifying selection. In contrast, NS show positive selection on the second step because the original amino acid is recovered. 
The NN double substitutions are heterogeneous and can be subject to either purifying or positive selection, or evolve neutrally, depending on the amino acid similarity between the final or intermediate and the ancestral states. CONCLUSIONS The results of the present, comprehensive analysis of the evolutionary landscape of within-codon double substitutions reaffirm the largely conservative regime of protein evolution. However, the second step of a double substitution can be subject to positive selection when the first step is deleterious. Such positive selection can result in frequent crossing of valleys on the fitness landscape.
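The four classes can be made concrete with a small classifier over "ancestral-intermediate-final" codon triplets. The sketch assumes the standard genetic code and reads both labels relative to the ancestral amino acid, which is how the abstract's remark that NS paths recover the original amino acid is interpreted here. The function names are hypothetical and this is not the authors' pipeline.

```python
from itertools import product

# Standard genetic code in TCAG order (third position varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def _n_diffs(a, b):
    """Number of nucleotide positions at which codons a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def classify_double(ancestral, intermediate, final):
    """Label a within-codon double substitution as SS, SN, NS or NN.

    First letter: is the intermediate codon synonymous (S) or not (N)
    with the ancestral one? Second letter: the same for the final codon.
    """
    if _n_diffs(ancestral, intermediate) != 1 or _n_diffs(intermediate, final) != 1:
        raise ValueError("each step must change exactly one nucleotide")
    if _n_diffs(ancestral, final) != 2:
        raise ValueError("ancestral and final codons must differ at two positions")
    anc = GENETIC_CODE[ancestral]
    first = "S" if GENETIC_CODE[intermediate] == anc else "N"
    second = "S" if GENETIC_CODE[final] == anc else "N"
    return first + second


print(classify_double("CTA", "TTA", "TTG"))  # Leu -> Leu -> Leu
print(classify_double("CTA", "CTG", "CCG"))  # Leu -> Leu -> Pro
print(classify_double("AGC", "TGC", "TCC"))  # Ser -> Cys -> Ser
print(classify_double("AAA", "GAA", "GAT"))  # Lys -> Glu -> Asp
```

The serine example is an NS path: the intermediate codon encodes cysteine, and the second substitution restores serine, which is the scenario in which the second step can be favored by selection.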
Affiliation(s)
- Frida Belinky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Itamar Sela
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
| |
|
35
|
Beichman AC, Koepfli KP, Li G, Murphy W, Dobrynin P, Kliver S, Tinker MT, Murray MJ, Johnson J, Lindblad-Toh K, Karlsson EK, Lohmueller KE, Wayne RK. Aquatic Adaptation and Depleted Diversity: A Deep Dive into the Genomes of the Sea Otter and Giant Otter. Mol Biol Evol 2019; 36:2631-2655. [PMID: 31212313 PMCID: PMC7967881 DOI: 10.1093/molbev/msz101] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 12/22/2022] Open
Abstract
Despite its recent invasion into the marine realm, the sea otter (Enhydra lutris) has evolved a suite of adaptations for life in cold coastal waters, including limb modifications and dense insulating fur. This uniquely dense coat led to the near-extinction of sea otters during the 18th-20th century fur trade and an extreme population bottleneck. We used the de novo genome of the southern sea otter (E. l. nereis) to reconstruct its evolutionary history, identify genes influencing aquatic adaptation, and detect signals of population bottlenecks. We compared the genome of the southern sea otter with the tropical freshwater-living giant otter (Pteronura brasiliensis) to assess common and divergent genomic trends between otter species, and with the closely related northern sea otter (E. l. kenyoni) to uncover population-level trends. We found signals of positive selection in genes related to aquatic adaptations, particularly limb development and polygenic selection on genes related to hair follicle development. We found extensive pseudogenization of olfactory receptor genes in both the sea otter and giant otter lineages, consistent with patterns of sensory gene loss in other aquatic mammals. At the population level, the southern sea otter and the northern sea otter showed extremely low genomic diversity, signals of recent inbreeding, and demographic histories marked by population declines. These declines may predate the fur trade and appear to have resulted in an increase in putatively deleterious variants that could impact the future recovery of the sea otter.
Affiliation(s)
- Annabel C Beichman
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
| | - Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC
- Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
| | - Gang Li
- College of Life Science, Shaanxi Normal University, Xi’an, Shaanxi, China
| | - William Murphy
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX
| | - Pasha Dobrynin
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC
- Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
| | - Sergei Kliver
- Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
| | - Martin T Tinker
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA
| | | | - Jeremy Johnson
- Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Kerstin Lindblad-Toh
- Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Elinor K Karlsson
- Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA
| | - Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA
| | - Robert K Wayne
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
| |
|
36
|
Dapper AL, Payseur BA. Molecular evolution of the meiotic recombination pathway in mammals. Evolution 2019; 73:2368-2389. [PMID: 31579931 DOI: 10.1111/evo.13850] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 02/13/2019] [Accepted: 09/07/2019] [Indexed: 02/06/2023]
Abstract
Meiotic recombination shapes evolution and helps to ensure proper chromosome segregation in most species that reproduce sexually. Recombination itself evolves, with species showing considerable divergence in the rate of crossing-over. However, the genetic basis of this divergence is poorly understood. Recombination events are produced via a complicated, but increasingly well-described, cellular pathway. We apply a phylogenetic comparative approach to a carefully selected panel of genes involved in the processes leading to crossovers (spanning double-strand break formation, strand invasion, the crossover/non-crossover decision, and resolution) to reconstruct the evolution of the recombination pathway in eutherian mammals and identify components of the pathway likely to contribute to divergence between species. Eleven recombination genes, predominantly involved in the stabilization of homologous pairing and the crossover/non-crossover decision, show evidence of rapid evolution and positive selection across mammals. We highlight TEX11 and associated genes involved in the synaptonemal complex and the early stages of the crossover/non-crossover decision as candidates for the evolution of recombination rate. Evolutionary comparisons to MLH1 count, a surrogate for the number of crossovers, reveal a positive correlation between genome-wide recombination rate and the rate of evolution at TEX11 across the mammalian phylogeny. Our results illustrate the power of viewing the evolution of recombination from a pathway perspective.
Affiliation(s)
- Amy L Dapper
- Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin, 53706; Department of Biological Sciences, Mississippi State University, Mississippi, 39762
| | - Bret A Payseur
- Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin, 53706
| |
|
37
|
Goldmann JM, Veltman JA, Gilissen C. De Novo Mutations Reflect Development and Aging of the Human Germline. Trends Genet 2019; 35:828-839. [PMID: 31610893 DOI: 10.1016/j.tig.2019.08.005] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 06/21/2019] [Revised: 08/15/2019] [Accepted: 08/28/2019] [Indexed: 01/19/2023]
Abstract
Human germline de novo mutations (DNMs) are both a driver of evolution and an important cause of genetic diseases. In the past few years, whole-genome sequencing (WGS) of parent-offspring trios has facilitated the large-scale detection and study of human DNMs, which has led to exciting discoveries. The overarching theme of all of these studies is that the DNMs of an individual are a complex mixture of mutations that arise through different biological processes acting at different times during human development and life.
Affiliation(s)
- J M Goldmann
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein 10, 6525 GA Nijmegen, The Netherlands
| | - J A Veltman
- Institute of Genetic Medicine, International Centre for Life, Newcastle University, Newcastle upon Tyne, UK; Department of Human Genetics, Donders Centre for Neuroscience, Radboud University Medical Center, Geert Grooteplein 10, 6525 GA Nijmegen, The Netherlands
| | - C Gilissen
- Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein 10, 6525 GA Nijmegen, The Netherlands.
| |
|
38
|
Richards JK, Stukenbrock EH, Carpenter J, Liu Z, Cowger C, Faris JD, Friesen TL. Local adaptation drives the diversification of effectors in the fungal wheat pathogen Parastagonospora nodorum in the United States. PLoS Genet 2019; 15:e1008223. [PMID: 31626626 PMCID: PMC6821140 DOI: 10.1371/journal.pgen.1008223] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 05/28/2019] [Revised: 10/30/2019] [Accepted: 08/25/2019] [Indexed: 12/22/2022] Open
Abstract
Filamentous fungi rapidly evolve in response to environmental selection pressures in part due to their genomic plasticity. Parastagonospora nodorum, a fungal pathogen of wheat and causal agent of septoria nodorum blotch, responds to selection pressure exerted by its host, influencing the gain, loss, or functional diversification of virulence determinants, known as effector genes. Whole genome resequencing of 197 P. nodorum isolates collected from spring, durum, and winter wheat production regions of the United States enabled the examination of effector diversity and genomic regions under selection specific to geographically discrete populations. 1,026,859 SNPs/InDels were used to identify novel loci, as well as SnToxA and SnTox3 as factors in disease. Genes displaying presence/absence variation, predicted effector genes, and genes localized on an accessory chromosome had significantly higher pN/pS ratios, indicating a higher rate of sequence evolution. Population structure analyses indicated two P. nodorum populations corresponding to the Upper Midwest (Population 1) and Southern/Eastern United States (Population 2). Prevalence of SnToxA varied greatly between the two populations which correlated with presence of the host sensitivity gene Tsn1 in the most prevalent cultivars in the corresponding regions. Additionally, 12 and 5 candidate effector genes were observed to be under diversifying selection among isolates from Population 1 and 2, respectively, but under purifying selection or neutrally evolving in the opposite population. Selective sweep analysis revealed 10 and 19 regions that had recently undergone positive selection in Population 1 and 2, respectively, involving 92 genes in total. When comparing genes with and without presence/absence variation, those genes exhibiting this variation were significantly closer to transposable elements. Taken together, these results indicate that P. nodorum is rapidly adapting to distinct selection pressures unique to spring and winter wheat production regions by rapid adaptive evolution and various routes of genomic diversification, potentially facilitated through transposable element activity.
Affiliation(s)
- Jonathan K. Richards
- Department of Plant Pathology and Crop Physiology, Louisiana State University Agricultural Center, Baton Rouge, Louisiana, United States of America
| | - Eva H. Stukenbrock
- Department of Environmental Genomics, Christian-Albrechts University of Kiel, Kiel, Germany
- Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Jessica Carpenter
- Department of Plant Pathology, North Dakota State University, Fargo, North Dakota, United States of America
| | - Zhaohui Liu
- Department of Plant Pathology, North Dakota State University, Fargo, North Dakota, United States of America
| | - Christina Cowger
- Plant Science Research Unit, USDA-ARS, Raleigh, North Carolina, United States of America
| | - Justin D. Faris
- Cereal Crops Research Unit, Edward T. Schaefer Agricultural Research Center, USDA-ARS, Fargo, North Dakota, United States of America
| | - Timothy L. Friesen
- Department of Plant Pathology, North Dakota State University, Fargo, North Dakota, United States of America
- Cereal Crops Research Unit, Edward T. Schaefer Agricultural Research Center, USDA-ARS, Fargo, North Dakota, United States of America
| |
|
39
|
Delmont TO, Kiefl E, Kilinc O, Esen OC, Uysal I, Rappé MS, Giovannoni S, Eren AM. Single-amino acid variants reveal evolutionary processes that shape the biogeography of a global SAR11 subclade. eLife 2019; 8:46497. [PMID: 31478833 PMCID: PMC6721796 DOI: 10.7554/elife.46497] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 03/01/2019] [Accepted: 08/13/2019] [Indexed: 12/14/2022] Open
Abstract
Members of the SAR11 order Pelagibacterales dominate the surface oceans. Their extensive diversity challenges emerging operational boundaries defined for microbial 'species' and complicates efforts of population genetics to study their evolution. Here, we employed single-amino acid variants (SAAVs) to investigate ecological and evolutionary forces that maintain the genomic heterogeneity within ubiquitous SAR11 populations we accessed through metagenomic read recruitment using a single isolate genome. Integrating amino acid and protein biochemistry with metagenomics revealed that systematic purifying selection against deleterious variants governs non-synonymous variation among very closely related populations of SAR11. SAAVs partitioned metagenomes into two main groups matching large-scale oceanic current temperatures, and six finer proteotypes that connect distant oceanic regions. These findings suggest that environmentally-mediated selection plays a critical role in the journey of cosmopolitan surface ocean microbial populations, and the idea 'everything is everywhere but the environment selects' has credence even at the finest resolutions.
Affiliation(s)
- Tom O Delmont
- Department of Medicine, The University of Chicago, Chicago, United States
| | - Evan Kiefl
- Department of Medicine, The University of Chicago, Chicago, United States; Graduate Program in Biophysical Sciences, University of Chicago, Chicago, United States
| | - Ozsel Kilinc
- Department of Electrical Engineering, University of South Florida, Tampa, United States
| | - Ozcan C Esen
- Department of Medicine, The University of Chicago, Chicago, United States
| | - Ismail Uysal
- Department of Electrical Engineering, University of South Florida, Tampa, United States
| | - Michael S Rappé
- Hawaii Institute of Marine Biology, University of Hawaii at Manoa, Kaneohe, United States
| | - Steven Giovannoni
- Department of Microbiology, Oregon State University, Corvallis, United States
| | - A Murat Eren
- Department of Medicine, The University of Chicago, Chicago, United States; Marine Biological Laboratory, Woods Hole, United States
| |
|
40
|
Amos W. Flanking heterozygosity influences the relative probability of different base substitutions in humans. Royal Society Open Science 2019; 6:191018. [PMID: 31598319 PMCID: PMC6774961 DOI: 10.1098/rsos.191018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/05/2019] [Accepted: 08/30/2019] [Indexed: 06/10/2023]
Abstract
Understanding when, where and which mutations are most likely to occur impacts many areas of evolutionary biology, from genetic diseases to phylogenetic reconstruction. Africans and non-African humans differ in the mutability of different triplet base combinations. Africans and non-Africans also differ in mutation rate, possibly because heterozygosity is mutagenic, such that diversity lost when humans expanded out of Africa also lowered the mutation rate. I show that these phenomena are linked: as flanking heterozygosity increases, some triplets become progressively more mutable while others become less so. Africans and non-Africans show near-identical patterns of dependence on heterozygosity. Thus, the striking differences in triplet mutation frequency between Africans and non-Africans, at least in part, seem to be an emergent property, driven by the way changes in heterozygosity 'out of Africa' have differentially impacted the mutability of different triplets. As heterozygosity decreased, the mutation spectrum outside Africa became enriched for triplet mutations that are favoured by low heterozygosity while those favoured by high heterozygosity became relatively rarer.
|
41
|
Rana S, Valentin K, Bartsch I, Glöckner G. Loss of a chloroplast encoded function could influence species range in kelp. Ecol Evol 2019; 9:8759-8770. [PMID: 31410278 PMCID: PMC6686309 DOI: 10.1002/ece3.5428] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/12/2018] [Revised: 05/16/2019] [Accepted: 06/15/2019] [Indexed: 12/25/2022] Open
Abstract
Kelps are important providers and constituents of marine ecological niches, the coastal kelp forests. Kelp species have differing distribution ranges, but mainly thrive in temperate and arctic regions. Although the principal factors determining biogeographic distribution ranges are known, genomics could provide additional answers to this question. We sequenced DNA from two Laminaria species with contrasting distribution ranges, Laminaria digitata and Laminaria solidungula. Laminaria digitata is found in the Northern Atlantic with a southern boundary in Brittany (France) or Massachusetts (USA) and a northern boundary in the Arctic, whereas L. solidungula is endemic to the Arctic only. From the raw reads of DNA, we reconstructed both chloroplast genomes and annotated them. A concatenated data set of all available brown algae chloroplast sequences was used for the calculation of a robust phylogeny, and sequence variations were analyzed. The two Laminaria chloroplast genomes are collinear to previously analyzed kelp chloroplast genomes with important exceptions. Rearrangements at the inverted repeat regions led to the pseudogenization of ycf37 in L. solidungula, a gene possibly required under high light conditions. This defunct gene might be one of the reasons why the habitat range of L. solidungula is restricted to low-light sublittoral sites in the Arctic. The inheritance pattern of single nucleotide polymorphisms suggests incomplete lineage sorting of chloroplast genomes in kelp species. Our analysis of kelp chloroplast genomes shows that more than evolutionary information can be gleaned from sequence data: those sequences can also tell us something about the ecological conditions which are required for species well-being.
Affiliation(s)
- Shivani Rana
- Medical Faculty, Institute of Biochemistry I, University of Cologne, Cologne, Germany
| | - Klaus Valentin
- Alfred‐Wegener‐Institute, Helmholtz Center for Marine and Polar Research, Bremerhaven, Germany
| | - Inka Bartsch
- Alfred‐Wegener‐Institute, Helmholtz Center for Marine and Polar Research, Bremerhaven, Germany
| | - Gernot Glöckner
- Medical Faculty, Institute of Biochemistry I, University of Cologne, Cologne, Germany
| |
|
42
|
Kaplanis J, Akawi N, Gallone G, McRae JF, Prigmore E, Wright CF, Fitzpatrick DR, Firth HV, Barrett JC, Hurles ME. Exome-wide assessment of the functional impact and pathogenicity of multinucleotide mutations. Genome Res 2019; 29:1047-1056. [PMID: 31227601 PMCID: PMC6633265 DOI: 10.1101/gr.239756.118] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/22/2018] [Accepted: 05/24/2019] [Indexed: 01/25/2023]
Abstract
Approximately 2% of de novo single-nucleotide variants (SNVs) appear as part of clustered mutations that create multinucleotide variants (MNVs). MNVs are an important source of genomic variability as they are more likely to alter an encoded protein than a SNV, which has important implications in disease as well as evolution. Previous studies of MNVs have focused on their mutational origins and have not systematically evaluated their functional impact and contribution to disease. We identified 69,940 MNVs and 91 de novo MNVs in 6688 exome-sequenced parent–offspring trios from the Deciphering Developmental Disorders Study comprising families with severe developmental disorders. We replicated the previously described MNV mutational signatures associated with DNA polymerase zeta, an error-prone translesion polymerase, and the APOBEC family of DNA deaminases. We estimate the simultaneous MNV germline mutation rate to be 1.78 × 10⁻¹⁰ mutations per base pair per generation. We found that most MNVs within a single codon create a missense change that could not have been created by a SNV. MNV-induced missense changes were, on average, more physicochemically divergent, were more depleted in highly constrained genes (pLI ≥ 0.9), and were under stronger purifying selection compared with SNV-induced missense changes. We found that de novo MNVs were significantly enriched in genes previously associated with developmental disorders in affected children. This shows that MNVs can be more damaging than SNVs even when both induce missense changes, and are an important variant type to consider in relation to human disease.
Affiliation(s)
- Joanna Kaplanis
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Nadia Akawi
- Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 9DU, United Kingdom
| | - Giuseppe Gallone
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Jeremy F McRae
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Elena Prigmore
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Caroline F Wright
- Institute of Biomedical and Clinical Science, University of Exeter Medical School, Exeter, EX2 5DW, United Kingdom
| | - David R Fitzpatrick
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
| | - Helen V Firth
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom; Department of Clinical Genetics, Cambridge University Hospitals NHS Foundation Trust, Cambridge, CB2 0QQ, United Kingdom
| | - Jeffrey C Barrett
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | - Matthew E Hurles
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
| | | |
|
43
|
Prendergast JGD, Pugh C, Harris SE, Hume DA, Deary IJ, Beveridge A. Linked Mutations at Adjacent Nucleotides Have Shaped Human Population Differentiation and Protein Evolution. Genome Biol Evol 2019; 11:759-775. [PMID: 30689878 PMCID: PMC6424222 DOI: 10.1093/gbe/evz014] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Accepted: 01/18/2019] [Indexed: 02/06/2023] Open
Abstract
Despite the fundamental importance of single nucleotide polymorphisms (SNPs) to human evolution, there are still large gaps in our understanding of the forces that shape their distribution across the genome. SNPs have been shown to not be distributed evenly, with directly adjacent SNPs found unusually frequently. Why this is the case is unclear. We illustrate how neighboring SNPs that cannot be explained by a single mutation event (that we term here sequential dinucleotide mutations [SDMs]) are driven by distinct processes to SNPs and multinucleotide polymorphisms (MNPs). By studying variation across populations, including a novel cohort of 1,358 Scottish genomes, we show that SDMs are over twice as common as MNPs and, like SNPs, display distinct mutational spectra across populations. These biases are not only different to those observed among SNPs and MNPs but are also more divergent between human population groups. We show that the changes that make up SDMs are not independent and identify a distinct mutational profile, CA → CG → TG, that is observed an order of magnitude more often than expected from background SNP rates and the numbers of other SDMs involving the gain and deamination of CpG sites. Intriguingly, particular pathways through the amino acid code appear to have been favored relative to that expected from intergenic SDM rates and the occurrences of coding SNPs, and in particular those that lead to the creation of single codon amino acids. We finally present evidence that epistatic selection has potentially disfavored sequential nonsynonymous changes in the human genome.
Affiliation(s)
| | - Carys Pugh
- The Roslin Institute, The University of Edinburgh, Midlothian, United Kingdom; Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, The University of Edinburgh, United Kingdom
| | - Sarah E Harris
- Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, The University of Edinburgh, United Kingdom; Centre for Genomic and Experimental Medicine, MRC Institute of Genetics and Molecular Medicine, The University of Edinburgh, United Kingdom
| | - David A Hume
- Mater Research Institute-University of Queensland, Woolloongabba, Queensland, Australia
| | - Ian J Deary
- Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, The University of Edinburgh, United Kingdom
| | - Allan Beveridge
- Glasgow Polyomics, College of Medical, Veterinary and Life Science, University of Glasgow, United Kingdom
| |
|
44
|
Ghafari M, Weissman DB. The expected time to cross extended fitness plateaus. Theor Popul Biol 2019; 129:54-67. [PMID: 31054850 DOI: 10.1016/j.tpb.2019.03.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/27/2018] [Revised: 12/28/2018] [Accepted: 03/05/2019] [Indexed: 10/25/2022]
Abstract
For a population to acquire a complex adaptation requiring multiple individually neutral mutations, it must cross a plateau in the fitness landscape. We consider plateaus involving three mutations, and show that large populations can cross them rapidly via lineages that acquire multiple mutations while remaining at low frequency, much faster than the ∝ μ³ rate for simultaneous triple mutations. Plateau-crossing is fastest for very large populations. At intermediate population sizes, recombination can greatly accelerate adaptation by combining independent mutant lineages to form triple-mutants. For more frequent recombination, such that the population is kept near linkage equilibrium, we extend our analysis to find simple expressions for the expected time to cross plateaus of arbitrary width.
Affiliation(s)
- Mahan Ghafari
- Department of Physics, Emory University, Atlanta, GA 30322, USA; Department of Genetics, University of Cambridge, UK
| | | |
|
45
|
Bowden R, Davies RW, Heger A, Pagnamenta AT, de Cesare M, Oikkonen LE, Parkes D, Freeman C, Dhalla F, Patel SY, Popitsch N, Ip CLC, Roberts HE, Salatino S, Lockstone H, Lunter G, Taylor JC, Buck D, Simpson MA, Donnelly P. Sequencing of human genomes with nanopore technology. Nat Commun 2019; 10:1869. [PMID: 31015479 PMCID: PMC6478738 DOI: 10.1038/s41467-019-09637-5] [Citation(s) in RCA: 102] [Impact Index Per Article: 20.4] [Received: 08/30/2018] [Accepted: 03/19/2019] [Indexed: 12/17/2022] Open
Abstract
Whole-genome sequencing (WGS) is becoming widely used in clinical medicine in diagnostic contexts and to inform treatment choice. Here we evaluate the potential of the Oxford Nanopore Technologies (ONT) MinION long-read sequencer for routine WGS by sequencing the reference sample NA12878 and the genome of an individual with ataxia-pancytopenia syndrome and severe immune dysregulation. We develop and apply a novel reference panel-free analytical method to infer and then exploit phase information which improves single-nucleotide variant (SNV) calling performance from otherwise modest levels. In the clinical sample, we identify and directly phase two non-synonymous de novo variants in SAMD9L (OMIM #159550), inferring that they lie on the same paternal haplotype. Whilst consensus SNV-calling error rates from ONT data remain substantially higher than those from short-read methods, we demonstrate the substantial benefits of analytical innovation. Ongoing improvements to base-calling and SNV-calling methodology must continue for nanopore sequencing to establish itself as a primary method for clinical WGS.
Affiliation(s)
- Rory Bowden
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Robert W Davies
- Genomics plc, Oxford, OX1 1JD, UK
- Program in Genetics and Genomic Biology and The Centre for Applied Genomics, Hospital for Sick Children, Toronto, M5G 0A4, Canada
- Alistair T Pagnamenta
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- National Institute for Health Research Oxford Biomedical Research Centre, Oxford, OX4 2PG, UK
- Laura E Oikkonen
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Duncan Parkes
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Colin Freeman
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Fatima Dhalla
- Department of Clinical Immunology, Oxford University Hospitals, Oxford, OX3 9DU, UK
- Developmental Immunology Group, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, OX3 9DS, UK
- Smita Y Patel
- Department of Clinical Immunology, Oxford University Hospitals, Oxford, OX3 9DU, UK
- Clinical Immunology Group, National Institute for Health Research Oxford Biomedical Research Centre, Oxford, OX4 2PG, UK
- Niko Popitsch
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- National Institute for Health Research Oxford Biomedical Research Centre, Oxford, OX4 2PG, UK
- Children's Cancer Research Institute, St. Anna Kinderkrebsforschung, 1090, Vienna, Austria
- Camilla L C Ip
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Hannah E Roberts
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Silvia Salatino
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Helen Lockstone
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Gerton Lunter
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Genomics plc, Oxford, OX1 1JD, UK
- Jenny C Taylor
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- National Institute for Health Research Oxford Biomedical Research Centre, Oxford, OX4 2PG, UK
- David Buck
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
- Peter Donnelly
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK.
- Genomics plc, Oxford, OX1 1JD, UK.
- Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK.
|
46
|
Dunn KA, Kenney T, Gu H, Bielawski JP. Improved inference of site-specific positive selection under a generalized parametric codon model when there are multinucleotide mutations and multiple nonsynonymous rates. BMC Evol Biol 2019; 19:22. [PMID: 30642241 PMCID: PMC6332903 DOI: 10.1186/s12862-018-1326-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/08/2018] [Accepted: 12/11/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND An excess of nonsynonymous substitutions, over neutrality, is considered evidence of positive Darwinian selection. Inference for proteins often relies on estimation of the nonsynonymous to synonymous ratio (ω = dN/dS) within a codon model. However, to ease computational difficulties, ω is typically estimated assuming an idealized substitution process where (i) all nonsynonymous substitutions have the same rate (regardless of impact on organism fitness) and (ii) instantaneous double and triple (DT) nucleotide mutations have zero probability (despite evidence that they can occur). It follows that estimates of ω represent an imperfect summary of the intensity of selection, and that tests based on the ω > 1 threshold could be negatively impacted. RESULTS We developed a general-purpose parametric (GPP) modelling framework for codons. This novel approach allows specification of all possible instantaneous codon substitutions, including multiple nonsynonymous rates (MNRs) and instantaneous DT nucleotide changes. Existing codon models are specified as special cases of the GPP model. We use GPP models to implement likelihood ratio tests for ω > 1 that accommodate MNRs and DT mutations. Through both simulation and real data analysis, we find that failure to model MNRs and DT mutations reduces power in some cases and inflates false positives in others. False positives under traditional M2a and M8 models were very sensitive to DT changes. This was exacerbated by the choice of frequency parameterization (GY vs. MG), with rates sometimes > 90% under MG. By including MNRs and DT mutations, accuracy and power were greatly improved under the GPP framework. However, we also find that over-parameterized models can perform less well, and this can contribute to degraded performance of LRTs. CONCLUSIONS We suggest GPP models should be used alongside traditional codon models. Further, all codon models should be deployed within an experimental design that includes (i) assessing robustness to model assumptions, and (ii) investigation of non-standard behaviour of MLEs. As the goal of every analysis is to avoid false conclusions, more work is needed on model selection methods that consider both the increase in fit engendered by a model parameter and the degree to which that parameter is affected by un-modelled evolutionary processes.
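The likelihood ratio tests (LRTs) mentioned in this abstract compare a null codon model against an alternative that allows ω > 1. A minimal Python sketch of that comparison follows. The log-likelihood values are hypothetical placeholders, not results from the paper, and the use of two degrees of freedom is an assumption (a conventional choice for comparisons such as M1a versus M2a), not something stated in the abstract.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Return 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf_df2(x):
    """Survival function of a chi-square distribution with 2 degrees of freedom.

    For df = 2 the chi-square distribution is exponential with mean 2,
    so P(X > x) = exp(-x / 2).
    """
    return math.exp(-x / 2.0)


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """Approximate p-value of the LRT, assuming a chi-square null with df = 2."""
    return chi2_sf_df2(lrt_statistic(lnl_null, lnl_alt))


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a null and an alternative model.
    lnl_null, lnl_alt = -1000.0, -995.0
    print(lrt_statistic(lnl_null, lnl_alt))
    print(lrt_pvalue_df2(lnl_null, lnl_alt))
```

The abstract's caution applies to this calculation as well: a small p-value only supports ω > 1 to the extent that the fitted models describe the substitution process adequately.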
Affiliation(s)
- Katherine A. Dunn
- Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Toby Kenney
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Hong Gu
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Joseph P. Bielawski
- Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4J1 Canada
- Centre Comparative Genomics and Evolutionary Bioinformatics (CGEB) at Dalhousie University, Halifax, Canada
|
47
|
Looking for Darwin in Genomic Sequences: Validity and Success Depends on the Relationship Between Model and Data. Methods Mol Biol 2019; 1910:399-426. [PMID: 31278672 DOI: 10.1007/978-1-4939-9074-0_13] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/01/2023]
Abstract
Codon substitution models (CSMs) are commonly used to infer the history of natural selection for a set of protein-coding sequences, often with the explicit goal of detecting the signature of positive Darwinian selection. However, the validity and success of CSMs used in conjunction with the maximum likelihood (ML) framework is sometimes challenged with claims that the approach might too often support false conclusions. In this chapter, we use a case study approach to identify four legitimate statistical difficulties associated with inference of evolutionary events using CSMs. These include: (1) model misspecification, (2) low information content, (3) the confounding of processes, and (4) phenomenological load, or PL. While past criticisms of CSMs can be connected to these issues, the historical critiques were often misdirected, or overstated, because they failed to recognize that the success of any model-based approach depends on the relationship between model and data. Here, we explore this relationship and provide a candid assessment of the limitations of CSMs to extract historical information from extant sequences. To aid in this assessment, we provide a brief overview of: (1) a more realistic way of thinking about the process of codon evolution framed in terms of population genetic parameters, and (2) a novel presentation of the ML statistical framework. We then divide the development of CSMs into two broad phases of scientific activity and show that the latter phase is characterized by increases in model complexity that can sometimes negatively impact inference of evolutionary mechanisms. Such problems are not yet widely appreciated by the users of CSMs. These problems can be avoided by using a model that is appropriate for the data; but, understanding the relationship between the data and a fitted model is a difficult task. We argue that the only way to properly understand that relationship is to perform in silico experiments using a generating process that can mimic the data as closely as possible. The mutation-selection modeling framework (MutSel) is presented as the basis of such a generating process. We contend that if complex CSMs continue to be developed for testing explicit mechanistic hypotheses, then additional analyses such as those described here (e.g., penalized LRTs and estimation of PL) will need to be applied alongside the more traditional inferential methods.
|
48
|
Fine-Grained Analysis of Spontaneous Mutation Spectrum and Frequency in Arabidopsis thaliana. Genetics 2018; 211:703-714. [PMID: 30514707 PMCID: PMC6366913 DOI: 10.1534/genetics.118.301721] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 10/22/2018] [Accepted: 11/29/2018] [Indexed: 01/17/2023] Open
Abstract
Mutations are the ultimate source of all genetic variation. However, few direct estimates of the contribution of mutation to molecular genetic variation are available. To address this issue, we first analyzed the rate and spectrum of mutations in the Arabidopsis thaliana reference accession after 25 generations of single-seed descent. We then compared the mutation profile in these mutation accumulation (MA) lines against genetic variation observed in the 1001 Genomes Project. The estimated haploid single nucleotide mutation (SNM) rate for A. thaliana is 6.95 × 10⁻⁹ (SE ± 2.68 × 10⁻¹⁰) per site per generation, with SNMs having higher frequency in transposable elements (TEs) and centromeric regions. The estimated indel mutation rate is 1.30 × 10⁻⁹ (±1.07 × 10⁻¹⁰) per site per generation, with deletions being more frequent and larger than insertions. Among the 1694 unique SNMs identified in the MA lines, the positions of 389 SNMs (23%) coincide with biallelic SNPs from the 1001 Genomes population, and in 289 (17%) cases the changes are identical. Of the 329 unique indels identified in the MA lines, 96 (29%) overlap with indels from the 1001 Genomes dataset, and 16 indels (5% of the total) are identical. These overlap frequencies are significantly higher than expected, suggesting that de novo mutations are not uniformly distributed and arise at polymorphic sites more frequently than assumed. These results suggest that a high mutation rate potentially contributes to high polymorphism and a low mutation rate to reduced polymorphism in natural populations, providing insight into the mutational inputs that generate natural genetic diversity.
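The overlap percentages quoted in this abstract follow directly from the reported counts. The short Python check below recomputes them; the helper name `percent` is introduced here for illustration only, and rounding to the nearest whole percent is an assumption about how the abstract's figures were rounded.

```python
def percent(part, total):
    """Return part/total as a percentage rounded to the nearest integer."""
    return round(100.0 * part / total)


if __name__ == "__main__":
    total_snms = 1694    # unique SNMs in the MA lines (from the abstract)
    total_indels = 329   # unique indels in the MA lines (from the abstract)

    print(percent(389, total_snms))    # SNM positions coinciding with biallelic SNPs
    print(percent(289, total_snms))    # SNMs with identical changes
    print(percent(96, total_indels))   # indels overlapping 1001 Genomes indels
    print(percent(16, total_indels))   # identical indels
```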
|
49
|
Senra MVX, Sung W, Ackerman M, Miller SF, Lynch M, Soares CAG. An Unbiased Genome-Wide View of the Mutation Rate and Spectrum of the Endosymbiotic Bacterium Teredinibacter turnerae. Genome Biol Evol 2018; 10:723-730. [PMID: 29415256 PMCID: PMC5833318 DOI: 10.1093/gbe/evy027] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Accepted: 02/02/2018] [Indexed: 12/14/2022] Open
Abstract
Mutations contribute to genetic variation in all living systems. Thus, precise estimates of mutation rates and spectra across a diversity of organisms are required for a full comprehension of evolution. Here, a mutation-accumulation (MA) assay was carried out on the endosymbiotic bacterium Teredinibacter turnerae. After ∼3,025 generations, base-pair substitutions (BPSs) and insertion–deletion (indel) events were characterized by whole-genome sequencing analysis of 47 independent MA lines, yielding a BPS rate of 1.14 × 10⁻⁹ per site per generation and indel rate of 1.55 × 10⁻¹⁰ events per site per generation, which are among the highest within free-living and facultative intracellular bacteria. As in other endosymbionts, a significant bias of BPSs toward A/T and an excess of deletion mutations over insertion mutations are observed for these MA lines. However, even with a deletion bias, the genome remains relatively large (∼5.2 Mb) for an endosymbiotic bacterium. The estimate of the effective population size (Ne) in T. turnerae is quite high and comparable to free-living bacteria (∼4.5 × 10⁷), suggesting that the heavy bottlenecking associated with many endosymbiotic relationships is not prevalent during the life of this endosymbiont. The efficiency of selection scales with increasing Ne and such strong selection may have been operating against the deletion bias, preventing genome erosion. The observed mutation rate in this endosymbiont is of the same order of magnitude as that of organisms with similar Ne, consistent with the idea that population size is a primary determinant of mutation-rate evolution within endosymbionts, and that not all endosymbionts have low Ne.
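A per-site, per-generation rate from a mutation-accumulation experiment is commonly computed as the number of observed events divided by the product of lines, generations, and analyzed sites. The Python sketch below illustrates that arithmetic with the figures quoted in the abstract. It assumes every line ran for the same number of generations and that the whole ∼5.2 Mb genome was analyzable; the function names are hypothetical, and the implied event count is a back-of-envelope figure, not a number reported in the abstract.

```python
def mutation_rate(events, lines, generations, sites):
    """Events per site per generation, pooled across all MA lines."""
    return events / (lines * generations * sites)


def implied_events(rate, lines, generations, sites):
    """Inverse of mutation_rate: the event count a given rate would imply."""
    return rate * lines * generations * sites


if __name__ == "__main__":
    lines = 47            # independent MA lines (from the abstract)
    generations = 3025    # approximate generations per line (from the abstract)
    sites = 5.2e6         # approximate genome size in bp (assumed fully analyzable)
    bps_rate = 1.14e-9    # reported BPS rate per site per generation

    n_events = implied_events(bps_rate, lines, generations, sites)
    print(round(n_events))
    # Round trip: feeding the implied count back recovers the reported rate.
    print(mutation_rate(n_events, lines, generations, sites))
```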
Affiliation(s)
- Marcus V X Senra
- Departamento de Zoologia, Universidade Federal de Juiz de Fora, Brazil
- Way Sung
- Department of Bioinformatics and Genomics, University of North Carolina, Charlotte
- Matthew Ackerman
- Biodesign Center for Mechanisms of Evolution, Arizona State University
- Samuel F Miller
- Biodesign Center for Mechanisms of Evolution, Arizona State University
- Michael Lynch
- Biodesign Center for Mechanisms of Evolution, Arizona State University
- Carlos Augusto G Soares
- Departamento de Genética, Universidade Federal do Rio de Janeiro, Brazil
|
50
|
Thomas GWC, Wang RJ, Puri A, Harris RA, Raveendran M, Hughes DST, Murali SC, Williams LE, Doddapaneni H, Muzny DM, Gibbs RA, Abee CR, Galinski MR, Worley KC, Rogers J, Radivojac P, Hahn MW. Reproductive Longevity Predicts Mutation Rates in Primates. Curr Biol 2018; 28:3193-3197.e5. [PMID: 30270182 DOI: 10.1016/j.cub.2018.08.050] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 05/22/2018] [Revised: 07/26/2018] [Accepted: 08/22/2018] [Indexed: 12/30/2022]
Abstract
Mutation rates vary between species across several orders of magnitude, with larger organisms having the highest per-generation mutation rates. Hypotheses for this pattern typically invoke physiological or population-genetic constraints imposed on the molecular machinery preventing mutations [1]. However, continuing germline cell division in multicellular eukaryotes means that organisms with longer generation times and of larger size will leave more mutations to their offspring simply as a byproduct of their increased lifespan [2, 3]. Here, we deeply sequence the genomes of 30 owl monkeys (Aotus nancymaae) from six multi-generation pedigrees to demonstrate that paternal age is the major factor determining the number of de novo mutations in this species. We find that owl monkeys have an average mutation rate of 0.81 × 10⁻⁸ per site per generation, roughly 32% lower than the estimate in humans. Based on a simple model of reproductive longevity that does not require any changes to the mutational machinery, we show that this is the expected mutation rate in owl monkeys. We further demonstrate that our model predicts species-specific mutation rates in other primates, including study-specific mutation rates in humans based on the average paternal age. Our results suggest that variation in life history traits alone can explain variation in the per-generation mutation rate among primates, and perhaps among a wide range of multicellular organisms.
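The comparison "roughly 32% lower than the estimate in humans" can be inverted to see which human rate it implies. The Python sketch below does only that arithmetic; the function name is hypothetical, the result is an implied value rather than one reported in the abstract, and it does not attempt to reproduce the paper's reproductive-longevity model.

```python
def implied_reference_rate(rate, fraction_lower):
    """Reference rate implied when `rate` is `fraction_lower` below it.

    Solves rate = reference * (1 - fraction_lower) for reference.
    """
    if not 0.0 <= fraction_lower < 1.0:
        raise ValueError("fraction_lower must be in [0, 1)")
    return rate / (1.0 - fraction_lower)


if __name__ == "__main__":
    owl_monkey_rate = 0.81e-8   # per site per generation (from the abstract)
    fraction_lower = 0.32       # "roughly 32% lower" (from the abstract)
    # Implied human estimate, on the order of 1.2e-8 per site per generation.
    print(implied_reference_rate(owl_monkey_rate, fraction_lower))
```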
Affiliation(s)
- Gregg W C Thomas
- Department of Biology, Indiana University, 107 S. Indiana Avenue, Bloomington, IN 47405, USA; Department of Computer Science, Indiana University, 107 S. Indiana Avenue, Bloomington, IN 47405, USA.
- Richard J Wang
- Department of Biology, Indiana University, 107 S. Indiana Avenue, Bloomington, IN 47405, USA
- Arthi Puri
- Department of Computer Science, Indiana University, 107 S. Indiana Avenue, Bloomington, IN 47405, USA
- R Alan Harris
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Muthuswamy Raveendran
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Daniel S T Hughes
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Shwetha C Murali
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Lawrence E Williams
- Keeling Center for Comparative Medicine and Research, University of Texas, MD Anderson Cancer Center, 650 Cool Water Drive, Bastrop, TX 78602, USA
- Harsha Doddapaneni
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Donna M Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Christian R Abee
- Keeling Center for Comparative Medicine and Research, University of Texas, MD Anderson Cancer Center, 650 Cool Water Drive, Bastrop, TX 78602, USA
- Mary R Galinski
- Emory Vaccine Center, Yerkes National Primate Research Center, Emory University, 201 Dowman Drive, Atlanta, GA, USA; Division of Infectious Diseases, Department of Medicine, Emory University, 201 Dowman Drive, Atlanta, GA, USA
- Kim C Worley
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Jeffrey Rogers
- Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Predrag Radivojac
- Department of Computer Science, Indiana University, 107 S. Indiana Avenue, Bloomington, IN 47405, USA
- Matthew W Hahn
- Department of Biology, Indiana University, 107 S. Indiana Avenue, Bloomington, IN 47405, USA; Department of Computer Science, Indiana University, 107 S. Indiana Avenue, Bloomington, IN 47405, USA.
|