1
Wade EE, Kyriazis CC, Cavassim MIA, Lohmueller KE. Quantifying the fraction of new mutations that are recessive lethal. Evolution 2023; 77:1539-1549. [PMID: 37074880 PMCID: PMC10309970 DOI: 10.1093/evolut/qpad061] [Received: 09/23/2022] [Revised: 03/21/2023] [Accepted: 04/14/2023] [Indexed: 04/20/2023]
Abstract
The presence and impact of recessive lethal mutations have been widely documented in diploid outcrossing species. However, precise estimates of the proportion of new mutations that are recessive lethal remain limited. Here, we evaluate the performance of Fit∂a∂i, a commonly used method for inferring the distribution of fitness effects (DFE), in the presence of lethal mutations. Using simulations, we demonstrate that in both additive and recessive cases, inference of the deleterious nonlethal portion of the DFE is minimally affected by a small proportion (<10%) of lethal mutations. Additionally, we demonstrate that while Fit∂a∂i cannot estimate the fraction of recessive lethal mutations, Fit∂a∂i can accurately infer the fraction of additive lethal mutations. Finally, as an alternative approach to estimate the proportion of mutations that are recessive lethal, we employ models of mutation-selection-drift balance using existing genomic parameters and estimates of segregating recessive lethals for humans and Drosophila melanogaster. In both species, the segregating recessive lethal load can be explained by a very small fraction (<1%) of new nonsynonymous mutations being recessive lethal. Our results refute recent assertions of a much higher proportion of mutations being recessive lethal (4%-5%), while highlighting the need for additional information on the joint distribution of selection and dominance coefficients.
Affiliation(s)
- Emma E Wade
  - Department of Ecology and Evolutionary Biology, University of California–Los Angeles, Los Angeles, CA, United States
  - Department of Computer Science and Engineering, Mississippi State University, Starkville, MS, United States
- Christopher C Kyriazis
  - Department of Ecology and Evolutionary Biology, University of California–Los Angeles, Los Angeles, CA, United States
- Maria Izabel A Cavassim
  - Department of Ecology and Evolutionary Biology, University of California–Los Angeles, Los Angeles, CA, United States
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California–Los Angeles, Los Angeles, CA, United States
  - Interdepartmental Program in Bioinformatics, University of California–Los Angeles, Los Angeles, CA, United States
  - Department of Human Genetics, David Geffen School of Medicine, University of California–Los Angeles, Los Angeles, CA, United States
2
Lauterbur ME, Cavassim MIA, Gladstein AL, Gower G, Pope NS, Tsambos G, Adrion J, Belsare S, Biddanda A, Caudill V, Cury J, Echevarria I, Haller BC, Hasan AR, Huang X, Iasi LNM, Noskova E, Obsteter J, Pavinato VAC, Pearson A, Peede D, Perez MF, Rodrigues MF, Smith CCR, Spence JP, Teterina A, Tittes S, Unneberg P, Vazquez JM, Waples RK, Wohns AW, Wong Y, Baumdicker F, Cartwright RA, Gorjanc G, Gutenkunst RN, Kelleher J, Kern AD, Ragsdale AP, Ralph PL, Schrider DR, Gronau I. Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations. eLife 2023; 12:RP84874. [PMID: 37342968 DOI: 10.7554/elife.84874] [Indexed: 06/23/2023]
Abstract
Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. 
These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.
Affiliation(s)
- M Elise Lauterbur
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
- Maria Izabel A Cavassim
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, United States
- Graham Gower
  - Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Nathaniel S Pope
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Georgia Tsambos
  - School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
- Jeffrey Adrion
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
  - Ancestry DNA, San Francisco, United States
- Saurabh Belsare
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Victoria Caudill
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Jean Cury
  - Universite Paris-Saclay, CNRS, INRIA, Laboratoire Interdisciplinaire des Sciences du Numerique, Orsay, France
- Benjamin C Haller
  - Department of Computational Biology, Cornell University, Ithaca, United States
- Ahmed R Hasan
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
  - Department of Biology, University of Toronto Mississauga, Mississauga, Canada
- Xin Huang
  - Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
  - Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria
- Ekaterina Noskova
  - Computer Technologies Laboratory, ITMO University, St Petersburg, Russian Federation
- Jana Obsteter
  - Agricultural Institute of Slovenia, Department of Animal Science, Ljubljana, Slovenia
- Alice Pearson
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
  - Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- David Peede
  - Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, United States
  - Center for Computational Molecular Biology, Brown University, Providence, United States
- Manolo F Perez
  - Department of Genetics and Evolution, Federal University of Sao Carlos, Sao Carlos, Brazil
- Murillo F Rodrigues
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Chris C R Smith
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Jeffrey P Spence
  - Department of Genetics, Stanford University School of Medicine, Stanford, United States
- Anastasia Teterina
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Silas Tittes
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Per Unneberg
  - Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Juan Manuel Vazquez
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
- Ryan K Waples
  - Department of Biostatistics, University of Washington, Seattle, United States
- Yan Wong
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Franz Baumdicker
  - Cluster of Excellence - Controlling Microbes to Fight Infections, Eberhard Karls Universität Tubingen, Tubingen, Germany
- Reed A Cartwright
  - School of Life Sciences and The Biodesign Institute, Arizona State University, Tempe, United States
- Gregor Gorjanc
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
- Ryan N Gutenkunst
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, United States
- Jerome Kelleher
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Andrew D Kern
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Aaron P Ragsdale
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, United States
- Peter L Ralph
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
  - Department of Mathematics, University of Oregon, Eugene, United States
- Daniel R Schrider
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Ilan Gronau
  - Efi Arazi School of Computer Science, Reichman University, Herzliya, Israel
3
Abstract
Homologous recombination is expected to increase natural selection efficacy by decoupling the fate of beneficial and deleterious mutations and by readily creating new combinations of beneficial alleles. Here, we investigate how the proportion of amino acid substitutions fixed by adaptive evolution (α) depends on the recombination rate in bacteria. We analyze 3,086 core protein-coding sequences from 196 genomes belonging to five closely related species of the genus Rhizobium. These genes are found in all species and do not display any signs of introgression between species. We estimate α using the site frequency spectrum (SFS) and divergence data for all pairs of species. We evaluate the impact of recombination within each species by dividing genes into three equally sized recombination classes based on their average level of intragenic linkage disequilibrium. We find that α varies from 0.07 to 0.39 across species and is positively correlated with the level of recombination. This is both due to a higher estimated rate of adaptive evolution and a lower estimated rate of nonadaptive evolution, suggesting that recombination both increases the fixation probability of advantageous variants and decreases the probability of fixation of deleterious variants. Our results demonstrate that homologous recombination facilitates adaptive evolution measured by α in the core genome of prokaryote species in agreement with studies in eukaryotes.
Affiliation(s)
- Maria Izabel A Cavassim
  - Bioinformatics Research Centre, Aarhus University, Aarhus, 8000, Denmark
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, 8000, Denmark
- Stig U Andersen
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, 8000, Denmark
- Thomas Bataillon
  - Bioinformatics Research Centre, Aarhus University, Aarhus, 8000, Denmark
4
Young JPW, Moeskjær S, Afonin A, Rahi P, Maluk M, James EK, Cavassim MIA, Rashid MHO, Aserse AA, Perry BJ, Wang ET, Velázquez E, Andronov EE, Tampakaki A, Flores Félix JD, Rivas González R, Youseif SH, Lepetit M, Boivin S, Jorrin B, Kenicer GJ, Peix Á, Hynes MF, Ramírez-Bahena MH, Gulati A, Tian CF. Defining the Rhizobium leguminosarum Species Complex. Genes (Basel) 2021; 12:111. [PMID: 33477547 PMCID: PMC7831135 DOI: 10.3390/genes12010111] [Received: 12/11/2020] [Revised: 01/08/2021] [Accepted: 01/13/2021] [Indexed: 01/21/2023]
Abstract
Bacteria currently included in Rhizobium leguminosarum are too diverse to be considered a single species, so we can refer to this as a species complex (the Rlc). We have found 429 publicly available genome sequences that fall within the Rlc and these show that the Rlc is a distinct entity, well separated from other species in the genus. Its sister taxon is R. anhuiense. We constructed a phylogeny based on concatenated sequences of 120 universal (core) genes, and calculated pairwise average nucleotide identity (ANI) between all genomes. From these analyses, we concluded that the Rlc includes 18 distinct genospecies, plus 7 unique strains that are not placed in these genospecies. Each genospecies is separated by a distinct gap in ANI values, usually at approximately 96% ANI, implying that it is a 'natural' unit. Five of the genospecies include the type strains of named species: R. laguerreae, R. sophorae, R. ruizarguesonis, "R. indicum" and R. leguminosarum itself. The 16S ribosomal RNA sequence is remarkably diverse within the Rlc, but does not distinguish the genospecies. Partial sequences of housekeeping genes, which have frequently been used to characterize isolate collections, can mostly be assigned unambiguously to a genospecies, but alleles within a genospecies do not always form a clade, so single genes are not a reliable guide to the true phylogeny of the strains. We conclude that access to a large number of genome sequences is a powerful tool for characterizing the diversity of bacteria, and that taxonomic conclusions should be based on all available genome sequences, not just those of type strains.
Affiliation(s)
- Sara Moeskjær
  - Department of Molecular Biology and Genetics, Aarhus University, 8000 Aarhus, Denmark
- Alexey Afonin
  - Laboratory for Genetics of Plant-Microbe Interactions, ARRIAM, Pushkin, 196608 Saint-Petersburg, Russia
- Praveen Rahi
  - National Centre for Microbial Resource, National Centre for Cell Science, Pune 411007, India
- Marta Maluk
  - Ecological Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Euan K. James
  - Ecological Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Maria Izabel A. Cavassim
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
- M. Harun-or Rashid
  - Biotechnology Division, Bangladesh Institute of Nuclear Agriculture (BINA), Mymensingh 2202, Bangladesh
- Aregu Amsalu Aserse
  - Ecosystems and Environment Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, FI-00014 Helsinki, Finland
- Benjamin J. Perry
  - Department of Microbiology and Immunology, University of Otago, Dunedin 9016, New Zealand
- En Tao Wang
  - Departamento de Microbiología, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Ciudad De México 11340, Mexico
- Encarna Velázquez
  - Departamento de Microbiología y Genética, Universidad de Salamanca, Instituto Hispanoluso de Investigaciones Agrarias (CIALE), Unidad Asociada Grupo de Interacción planta-microorganismo (Universidad de Salamanca-IRNASA-CSIC), 37007 Salamanca, Spain
- Evgeny E. Andronov
  - Department of Microbial Monitoring, ARRIAM, Pushkin, 196608 Saint-Petersburg, Russia
- Anastasia Tampakaki
  - Department of Crop Science, Agricultural University of Athens, Iera Odos 75, Votanikos, 11855 Athens, Greece
- José David Flores Félix
  - CICS-UBI—Health Sciences Research Centre, University of Beira Interior, 6201-506 Covilhã, Portugal
- Raúl Rivas González
  - Departamento de Microbiología y Genética, Universidad de Salamanca, Instituto Hispanoluso de Investigaciones Agrarias (CIALE), Unidad Asociada Grupo de Interacción planta-microorganismo (Universidad de Salamanca-IRNASA-CSIC), 37007 Salamanca, Spain
- Sameh H. Youseif
  - Department of Microbial Genetic Resources, National Gene Bank (NGB), Agricultural Research Center (ARC), Giza 12619, Egypt
- Marc Lepetit
  - Institut Sophia Agrobiotech, UMR INRAE 1355, Université Côte d’Azur, CNRS, 06903 Sophia Antipolis, France
- Stéphane Boivin
  - Laboratoire des Symbioses Tropicales et Méditerranéennes, UMR INRAE-IRD-CIRAD-UM2-SupAgro, Campus International de Baillarguet, TA-A82/J, CEDEX 05, 34398 Montpellier, France
- Beatriz Jorrin
  - Department of Plant Sciences, University of Oxford, Oxford OX1 3RB, UK
- Gregory J. Kenicer
  - Royal Botanic Garden Edinburgh, 20A Inverleith Row, Edinburgh EH3 5LR, UK
- Álvaro Peix
  - Instituto de Recursos Naturales y Agrobiología de Salamanca (IRNASA-CSIC), Unidad Asociada Grupo de Interacción Planta-Microorganismo (Universidad de Salamanca-IRNASA-CSIC), 37008 Salamanca, Spain
- Michael F. Hynes
  - Department of Biological Sciences, University of Calgary, 2500 University Drive NW, Calgary, AB T2N 1N4, Canada
- Martha Helena Ramírez-Bahena
  - Departamento de Didáctica de las Matemáticas y de las Ciencias Experimentales. Universidad de Salamanca, 37008 Salamanca, Spain
- Arvind Gulati
  - Microbial Prospection, CSIR-Institute of Himalayan Bioresource Technology, Palampur (H.P.) 176 061, India
- Chang-Fu Tian
  - State Key Laboratory of Agrobiotechnology, Rhizobium Research Center, and College of Biological Sciences, China Agricultural University, Beijing 100193, China
5
Cavassim MIA, Moeskjær S, Moslemi C, Fields B, Bachmann A, Vilhjálmsson BJ, Schierup MH, Young JPW, Andersen SU. Symbiosis genes show a unique pattern of introgression and selection within a Rhizobium leguminosarum species complex. Microb Genom 2020; 6:e000351. [PMID: 32176601 PMCID: PMC7276703 DOI: 10.1099/mgen.0.000351] [Received: 11/29/2019] [Accepted: 02/17/2020] [Indexed: 12/22/2022]
Abstract
Rhizobia supply legumes with fixed nitrogen using a set of symbiosis genes. These can cross rhizobium species boundaries, but it is unclear how many other genes show similar mobility. Here, we investigate inter-species introgression using de novo assembly of 196 Rhizobium leguminosarum sv. trifolii genomes. The 196 strains constituted a five-species complex, and we calculated introgression scores based on gene-tree traversal to identify 171 genes that frequently cross species boundaries. Rather than relying on the gene order of a single reference strain, we clustered the introgressing genes into four blocks based on population structure-corrected linkage disequilibrium patterns. The two largest blocks comprised 125 genes and included the symbiosis genes, a smaller block contained 43 mainly chromosomal genes, and the last block consisted of three genes with variable genomic location. All introgression events were likely mediated by conjugation, but only the genes in the symbiosis linkage blocks displayed overrepresentation of distinct, high-frequency haplotypes. The three genes in the last block were core genes essential for symbiosis that had, in some cases, been mobilized on symbiosis plasmids. Inter-species introgression is thus not limited to symbiosis genes and plasmids, but other cases are infrequent and show distinct selection signatures.
Affiliation(s)
- Maria Izabel A. Cavassim
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Sara Moeskjær
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Camous Moslemi
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Asger Bachmann
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Stig U. Andersen
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark