1
McShea H, Weibel C, Wehbi S, Goodman P, James JE, Wheeler AL, Masel J. The effectiveness of selection in a species affects the direction of amino acid frequency evolution. bioRxiv 2024:2023.02.01.526552. [PMID: 38948853 PMCID: PMC11212923 DOI: 10.1101/2023.02.01.526552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Nearly neutral theory predicts that species with higher effective population size (Ne) are better able to purge slightly deleterious mutations. We compare evolution in high-Ne vs. low-Ne vertebrates to reveal which amino acid frequencies are subject to subtle selective preferences. We take three complementary approaches, two measuring flux and one measuring outcomes. First, we fit non-stationary substitution models of amino acid flux using maximum likelihood, comparing the high-Ne clade of rodents and lagomorphs to its low-Ne sister clade of primates and colugos. Second, we compare evolutionary outcomes across a wider range of vertebrates, via correlations between amino acid frequencies and Ne. Third, we dissect the details of flux in human, chimpanzee, mouse, and rat, as scored by parsimony; this also enables comparison to a historical paper. All three methods agree on which amino acids are preferred under more effective selection. Preferred amino acids tend to be smaller, less costly to synthesize, and to promote intrinsic structural disorder. Parsimony-induced bias in the historical study produces an apparent reduction in structural disorder, perhaps driven by slightly deleterious substitutions. Within highly exchangeable pairs of amino acids, arginine is strongly preferred over lysine, and valine over isoleucine, consistent with more effective selection preferring a marginally larger free energy of folding. These two preferences match differences between thermophiles and mesophilic relatives. These results reveal the biophysical consequences of mutation-selection-drift balance, and demonstrate the utility of nearly neutral theory for understanding protein evolution.
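As a rough illustration of the nearly neutral prediction stated in this abstract, the sketch below uses Kimura's diffusion approximation for the fixation probability of a new additive mutation in a diploid population. It assumes that census size equals effective size and that s is the selection coefficient of the heterozygote. The function name and the example values of s and population size are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def relative_fixation_rate(s, n_e):
    """Fixation probability of a new mutation divided by that of a neutral one.

    Assumes a diploid population with census size equal to n_e and
    additive selection (heterozygote fitness 1 + s). Both are
    simplifying assumptions made for this sketch.
    """
    if s == 0:
        return 1.0
    # Kimura's approximation: u = (1 - exp(-2s)) / (1 - exp(-4 * n_e * s)).
    # expm1 keeps precision when s is tiny; the two minus signs cancel.
    p_fix = math.expm1(-2.0 * s) / math.expm1(-4.0 * n_e * s)
    # A neutral mutation fixes with probability 1 / (2 * n_e).
    return p_fix * 2.0 * n_e


# Hypothetical example: the same slightly deleterious mutation in two species.
low_ne = relative_fixation_rate(-1e-5, 1e4)
high_ne = relative_fixation_rate(-1e-5, 1e6)
print(low_ne, high_ne)
```

Under these assumptions the mutation behaves almost neutrally in the smaller population and is fixed far less often than a neutral mutation in the larger one, which is the contrast the abstract relies on.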
Affiliation(s)
- Hanon McShea
- Department of Earth System Science, Stanford University
- Catherine Weibel
- Department of Ecology & Evolutionary Biology, University of Arizona
- Department of Applied Physics, Stanford University
- Sawsan Wehbi
- Graduate Interdisciplinary Program in Genetics, University of Arizona
- Jennifer E James
- Department of Ecology & Evolutionary Biology, University of Arizona
- Department of Ecology and Genetics, Uppsala University
- Andrew L Wheeler
- Graduate Interdisciplinary Program in Genetics, University of Arizona
- Joanna Masel
- Department of Ecology & Evolutionary Biology, University of Arizona
2
Radrizzani S, Kudla G, Izsvák Z, Hurst LD. Selection on synonymous sites: the unwanted transcript hypothesis. Nat Rev Genet 2024; 25:431-448. [PMID: 38297070 DOI: 10.1038/s41576-023-00686-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/04/2023] [Indexed: 02/02/2024]
Abstract
Although translational selection to favour codons that match the most abundant tRNAs is not readily observed in humans, there is nonetheless selection in humans on synonymous mutations. We hypothesize that much of this synonymous site selection can be explained in terms of protection against unwanted RNAs - spurious transcripts, mis-spliced forms or RNAs derived from transposable elements or viruses. We propose not only that selection on synonymous sites functions to reduce the rate of creation of unwanted transcripts (for example, through selection on exonic splice enhancers and cryptic splice sites) but also that high-GC content (but low-CpG content), together with intron presence and position, is both particular to functional native mRNAs and used to recognize transcripts as native. In support of this hypothesis, transcription, nuclear export, liquid phase condensation and RNA degradation have all recently been shown to promote GC-rich transcripts and suppress AU/CpG-rich ones. With such 'traps' being set against AU/CpG-rich transcripts, the codon usage of native genes has, in turn, evolved to avoid such suppression. That parallel filters against AU/CpG-rich transcripts also affect the endosomal import of RNAs further supports the unwanted transcript hypothesis of synonymous site selection and explains the similar design rules that have enabled the successful use of transgenes and RNA vaccines.
Affiliation(s)
- Sofia Radrizzani
- Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, UK
- Milner Therapeutics Institute, Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge, UK
- Grzegorz Kudla
- MRC Human Genetics Unit, Institute for Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
- Zsuzsanna Izsvák
- Max-Delbrück-Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
- Laurence D Hurst
- Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, UK
3
Christmas MJ, Kaplow IM, Genereux DP, Dong MX, Hughes GM, Li X, Sullivan PF, Hindle AG, Andrews G, Armstrong JC, Bianchi M, Breit AM, Diekhans M, Fanter C, Foley NM, Goodman DB, Goodman L, Keough KC, Kirilenko B, Kowalczyk A, Lawless C, Lind AL, Meadows JRS, Moreira LR, Redlich RW, Ryan L, Swofford R, Valenzuela A, Wagner F, Wallerman O, Brown AR, Damas J, Fan K, Gatesy J, Grimshaw J, Johnson J, Kozyrev SV, Lawler AJ, Marinescu VD, Morrill KM, Osmanski A, Paulat NS, Phan BN, Reilly SK, Schäffer DE, Steiner C, Supple MA, Wilder AP, Wirthlin ME, Xue JR, Birren BW, Gazal S, Hubley RM, Koepfli KP, Marques-Bonet T, Meyer WK, Nweeia M, Sabeti PC, Shapiro B, Smit AFA, Springer MS, Teeling EC, Weng Z, Hiller M, Levesque DL, Lewin HA, Murphy WJ, Navarro A, Paten B, Pollard KS, Ray DA, Ruf I, Ryder OA, Pfenning AR, Lindblad-Toh K, Karlsson EK. Evolutionary constraint and innovation across hundreds of placental mammals. Science 2023; 380:eabn3943. [PMID: 37104599 PMCID: PMC10250106 DOI: 10.1126/science.abn3943] [Citation(s) in RCA: 61] [Impact Index Per Article: 61.0] [Received: 11/23/2021] [Accepted: 12/16/2022] [Indexed: 04/29/2023]
Abstract
Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
Affiliation(s)
- Matthew J. Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Irene M. Kaplow
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Michael X. Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Graham M. Hughes
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Patrick F. Sullivan
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Allyson G. Hindle
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
- Gregory Andrews
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Joel C. Armstrong
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Ana M. Breit
- School of Biology and Ecology, University of Maine, Orono, ME 04469, USA
- Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Cornelia Fanter
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
- Nicole M. Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Daniel B. Goodman
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA 94143, USA
- Kathleen C. Keough
- Fauna Bio, Inc., Emeryville, CA 94608, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Bogdan Kirilenko
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Amanda Kowalczyk
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Colleen Lawless
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Abigail L. Lind
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Jennifer R. S. Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Lucas R. Moreira
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Ruby W. Redlich
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Louise Ryan
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Alejandro Valenzuela
- Department of Experimental and Health Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Franziska Wagner
- Museum of Zoology, Senckenberg Natural History Collections Dresden, 01109 Dresden, Germany
- Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Ashley R. Brown
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Joana Damas
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Jenna Grimshaw
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- Jeremy Johnson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Sergey V. Kozyrev
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Alyssa J. Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Voichita D. Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Kathleen M. Morrill
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Austin Osmanski
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
- Nicole S. Paulat
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- BaDoi N. Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
- Steven K. Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Daniel E. Schäffer
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Megan A. Supple
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Aryn P. Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Morgan E. Wirthlin
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Allen Institute for Brain Science, Seattle, WA 98109, USA
- James R. Xue
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Bruce W. Birren
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Steven Gazal
- Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian’s National Zoo and Conservation Biology Institute, Washington, DC 20008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
- Tomas Marques-Bonet
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Barcelona, Spain
- Wynn K. Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
- Martin Nweeia
- Department of Comprehensive Care, School of Dental Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Department of Vertebrate Zoology, Canadian Museum of Nature, Ottawa, Ontario K2P 2R1, Canada
- Department of Vertebrate Zoology, Smithsonian Institution, Washington, DC 20002, USA
- Narwhal Genome Initiative, Department of Restorative Dentistry and Biomaterials Sciences, Harvard School of Dental Medicine, Boston, MA 02115, USA
- Pardis C. Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA
- Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Mark S. Springer
- Department of Evolution, Ecology and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
- Emma C. Teeling
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Michael Hiller
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Harris A. Lewin
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA 95616, USA
- William J. Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Arcadi Navarro
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, 08005 Barcelona, Spain
- CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
- Benedict Paten
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Katherine S. Pollard
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- David A. Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- Irina Ruf
- Division of Messel Research and Mammalogy, Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt am Main, Germany
- Oliver A. Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92039, USA
- Andreas R. Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Elinor K. Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
4
Horvath R, Menon M, Stitzer M, Ross-Ibarra J. OUP accepted manuscript. Genome Biol Evol 2022; 14:6519160. [PMID: 35104327 PMCID: PMC8872973 DOI: 10.1093/gbe/evac016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/22/2022] [Indexed: 11/23/2022]
Abstract
Recognition of the important role of transposable elements (TEs) in eukaryotic genomes quickly led to a burgeoning literature modeling and estimating the effects of selection on TEs. Much of the empirical work on selection has focused on analyzing the site frequency spectrum (SFS) of TEs. But TE evolution differs from standard models in a number of ways that can impact the power and interpretation of the SFS. For example, rather than mutating under a clock-like model, transposition often occurs in bursts which can inflate particular frequency categories compared with expectations under a standard neutral model. If a TE burst has been recent, the excess of low-frequency polymorphisms can mimic the effect of purifying selection. Here, we investigate how transposition bursts affect the frequency distribution of TEs and the correlation between age and allele frequency. Using information on the TE age distribution, we propose an age-adjusted SFS to compare TEs and neutral polymorphisms to more effectively evaluate whether TEs are under selective constraints. We show that our approach can minimize instances of false inference of selective constraint, remains robust to simple demographic changes, and allows for a correct identification of even weak selection affecting TEs which experienced a transposition burst. The results presented here will help researchers working on TEs to more reliably identify the effects of selection on TEs without having to rely on the assumption of a constant transposition rate.
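For readers unfamiliar with the site frequency spectrum discussed in this abstract, the following is a minimal sketch of how an ordinary unfolded SFS is tabulated from derived-allele counts. It does not implement the age adjustment the authors propose, whose details are not given in the abstract. The function name and the example counts are hypothetical.

```python
from collections import Counter


def unfolded_sfs(derived_counts, n_chromosomes):
    """Tabulate the unfolded SFS for segregating variants.

    derived_counts: number of sampled chromosomes carrying the derived
    allele (for a TE, the insertion) at each variant.
    Returns a list whose element k-1 is the number of variants found on
    exactly k chromosomes, for k = 1 .. n_chromosomes - 1.
    """
    # Variants that are absent (0) or fixed (n) in the sample are not
    # polymorphic, so they are left out of the spectrum.
    counts = Counter(c for c in derived_counts if 0 < c < n_chromosomes)
    return [counts.get(k, 0) for k in range(1, n_chromosomes)]


# Hypothetical sample of 4 chromosomes and six variants.
print(unfolded_sfs([1, 1, 2, 3, 0, 4], 4))
```

An excess in the first entries of such a spectrum is the pattern that, according to the abstract, can arise either from purifying selection or from a recent transposition burst.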
Affiliation(s)
- Robert Horvath
- Department of Evolution and Ecology, University of California, Davis, USA
- Corresponding author
- Mitra Menon
- Department of Evolution and Ecology, University of California, Davis, USA
- Center for Population Biology, University of California, Davis, USA
- Michelle Stitzer
- Institute for Genomic Diversity and Department of Molecular Biology and Genetics, Cornell University, USA
- Jeffrey Ross-Ibarra
- Department of Evolution and Ecology, University of California, Davis, USA
- Center for Population Biology, University of California, Davis, USA
- Genome Center, University of California, Davis, USA
- Corresponding author
5
Horvath R, Josephs EB, Pesquet E, Stinchcombe JR, Wright SI, Scofield D, Slotte T. Selection on Accessible Chromatin Regions in Capsella grandiflora. Mol Biol Evol 2021; 38:5563-5575. [PMID: 34498072 PMCID: PMC8662636 DOI: 10.1093/molbev/msab270] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022]
Abstract
Accurate estimates of genome-wide rates and fitness effects of new mutations are essential for an improved understanding of molecular evolutionary processes. Although eukaryotic genomes generally contain a large noncoding fraction, functional noncoding regions and fitness effects of mutations in such regions are still incompletely characterized. A promising approach to characterize functional noncoding regions relies on identifying accessible chromatin regions (ACRs) tightly associated with regulatory DNA. Here, we applied this approach to identify and estimate selection on ACRs in Capsella grandiflora, a crucifer species ideal for population genomic quantification of selection due to its favorable population demography. We describe a population-wide ACR distribution based on ATAC-seq data for leaf samples of 16 individuals from a natural population. We use population genomic methods to estimate fitness effects and proportions of positively selected fixations (α) in ACRs and find that intergenic ACRs harbor a considerable fraction of weakly deleterious new mutations, as well as a significantly higher proportion of strongly deleterious mutations than comparable inaccessible intergenic regions. ACRs are enriched for expression quantitative trait loci (eQTL) and depleted of transposable element insertions, as expected if intergenic ACRs are under selection because they harbor regulatory regions. By integrating empirical identification of intergenic ACRs with analyses of eQTL and population genomic analyses of selection, we demonstrate that intergenic regulatory regions are an important source of nearly neutral mutations. These results improve our understanding of selection on noncoding regions and the role of nearly neutral mutations for evolutionary processes in outcrossing Brassicaceae species.
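The quantity α mentioned in this abstract, the proportion of fixations driven by positive selection, can be illustrated with the classic McDonald-Kreitman-style estimator. This is a simplified sketch only. The abstract does not say which estimator the authors used, and methods that model the distribution of fitness effects will give different values. The function name, argument names, and counts below are hypothetical.

```python
def mk_alpha(d_sel, d_neu, p_sel, p_neu):
    """Classic McDonald-Kreitman-style estimate of alpha.

    d_sel, d_neu: fixed differences at the focal (putatively selected)
                  sites and at neutral reference sites.
    p_sel, p_neu: polymorphisms at the same two site classes.
    alpha = 1 - (d_neu * p_sel) / (d_sel * p_neu)
    """
    if d_sel == 0 or p_neu == 0:
        raise ValueError("d_sel and p_neu must be non-zero")
    return 1.0 - (d_neu * p_sel) / (d_sel * p_neu)


# Hypothetical counts, not data from the study.
print(mk_alpha(d_sel=20, d_neu=40, p_sel=10, p_neu=40))
```

Weakly deleterious polymorphisms, which the abstract reports are common in intergenic ACRs, inflate the polymorphism count at focal sites and bias this simple estimator downward, which is one reason more elaborate methods exist.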
Affiliation(s)
- Robert Horvath
- Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
- Emily B Josephs
- Department of Plant Biology, Michigan State University, Lansing, MI, USA
- Edouard Pesquet
- Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- John R Stinchcombe
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Stephen I Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Douglas Scofield
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
- Tanja Slotte
- Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
6
Hayes K, Barton HJ, Zeng K. A Study of Faster-Z Evolution in the Great Tit (Parus major). Genome Biol Evol 2021; 12:210-222. [PMID: 32119100 PMCID: PMC7144363 DOI: 10.1093/gbe/evaa044] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 02/26/2020] [Indexed: 12/17/2022]
Abstract
Sex chromosomes contribute substantially to key evolutionary processes such as speciation and adaptation. Several theories suggest that evolution could occur more rapidly on sex chromosomes, but currently our understanding of whether and how this occurs is limited. Here, we present an analysis of the great tit (Parus major) genome, aiming to detect signals of faster-Z evolution. We find mixed evidence of faster divergence on the Z chromosome than autosomes, with significantly higher divergence being found in ancestral repeats, but not at 4- or 0-fold degenerate sites. Interestingly, some 4-fold sites appear to be selectively constrained, which may mislead analyses that use these sites as the neutral reference (e.g., dN/dS). Consistent with other studies in birds, the mutation rate is significantly higher in males than females, and the long-term Z-to-autosome effective population size ratio is only 0.5, significantly lower than the expected value of 0.75. These are indicative of male-driven evolution and high variance in male reproductive success, respectively. We find no evidence for an increased efficacy of positive selection on the Z chromosome. In contrast, the Z chromosome in great tits appears to be affected by increased genetic drift, which has led to detectable signals of weakened intensity of purifying selection. These results provide further evidence that the Z chromosome often has a low effective population size, and that this has important consequences for its evolution. They also highlight the importance of considering multiple factors that can affect the rate of evolution and effective population sizes of sex chromosomes.
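The expected Z-to-autosome ratio of 0.75 quoted in this abstract follows from standard formulas for effective population size with separate sexes. The sketch below assumes a ZW system with ZZ males, random mating, and Poisson-distributed offspring number within each sex. The function name and the example numbers of breeding males and females are hypothetical.

```python
def z_to_autosome_ne_ratio(n_males, n_females):
    """Ratio of Z-linked to autosomal effective population size.

    Assumes males are ZZ and females are ZW, with n_males and n_females
    the effective numbers of breeding individuals of each sex.
    """
    # Autosomes: Ne = 4 * Nm * Nf / (Nm + Nf)
    ne_autosome = 4.0 * n_males * n_females / (n_males + n_females)
    # Z chromosome (homogametic males): Ne = 9 * Nm * Nf / (4 * Nf + 2 * Nm)
    ne_z = 9.0 * n_males * n_females / (4.0 * n_females + 2.0 * n_males)
    return ne_z / ne_autosome


# Equal numbers of breeding males and females give the textbook value.
print(z_to_autosome_ne_ratio(500, 500))
# Few breeding males relative to females lowers the ratio.
print(z_to_autosome_ne_ratio(50, 500))
```

Under these assumptions the ratio falls from 0.75 toward 9/16 as the number of breeding males becomes small relative to females, which is why a low ratio is read as a sign of high variance in male reproductive success.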
Affiliation(s)
- Kai Hayes
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom
- Henry J Barton
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom
- Organismal and Evolutionary Biology Research Program, University of Helsinki, Finland
- Kai Zeng
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom
7
Inbred lab mice are not isogenic: genetic variation within inbred strains used to infer the mutation rate per nucleotide site. Heredity (Edinb) 2020; 126:107-116. [PMID: 32868871 PMCID: PMC7852876 DOI: 10.1038/s41437-020-00361-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/20/2020] [Revised: 08/20/2020] [Accepted: 08/21/2020] [Indexed: 12/14/2022]
Abstract
For over a century, inbred mice have been used in many areas of genetics research to gain insight into the genetic variation underlying traits of interest. The generalizability of any genetic research study in inbred mice is dependent upon all individual mice being genetically identical, which in turn is dependent on the breeding designs of companies that supply inbred mice to researchers. Here, we compare whole-genome sequences from individuals of four commonly used inbred strains that were procured from either the colony nucleus or from a production colony (which can be as many as ten generations removed from the nucleus) of a large commercial breeder, in order to investigate the extent and nature of genetic variation within and between individuals. We found that individuals within strains are not isogenic, and there are differences in the levels of genetic variation that are explained by differences in the genetic distance from the colony nucleus. In addition, we employ a novel approach to mutation rate estimation based on the observed genetic variation and the expected site frequency spectrum at equilibrium, given a fully inbred breeding design. We find that it provides a reasonable per nucleotide mutation rate estimate when mice come from the colony nucleus (~7.9 × 10⁻⁹ in C3H/HeN), but substantially inflated estimates when mice come from production colonies.
8
Integrated structural and evolutionary analysis reveals common mechanisms underlying adaptive evolution in mammals. Proc Natl Acad Sci U S A 2020; 117:5977-5986. [PMID: 32123117 DOI: 10.1073/pnas.1916786117] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 12/12/2022]
Abstract
Understanding the molecular basis of adaptation to the environment is a central question in evolutionary biology, yet linking detected signatures of positive selection to molecular mechanisms remains challenging. Here we demonstrate that combining sequence-based phylogenetic methods with structural information assists in making such mechanistic interpretations on a genomic scale. Our integrative analysis shows that positively selected sites tend to colocalize on protein structures and that positively selected clusters are found in functionally important regions of proteins, indicating that positive selection can contravene the well-known principle of evolutionary conservation of functionally important regions. This unexpected finding, along with our discovery that positive selection acts on structural clusters, opens previously unexplored strategies for the development of better models of protein evolution. Remarkably, proteins where we detect the strongest evidence of clustering belong to just two functional groups: Components of immune response and metabolic enzymes. This gives a coherent picture of pathogens and xenobiotics as important drivers of adaptive evolution of mammals.
|
9
|
Huang YF, Siepel A. Estimation of allele-specific fitness effects across human protein-coding sequences and implications for disease. Genome Res 2019; 29:1310-1321. [PMID: 31249063 PMCID: PMC6673719 DOI: 10.1101/gr.245522.118] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/23/2018] [Accepted: 06/20/2019] [Indexed: 12/16/2022]
Abstract
A central challenge in human genomics is to understand the cellular, evolutionary, and clinical significance of genetic variants. Here, we introduce a unified population-genetic and machine-learning model, called Linear Allele-Specific Selection InferencE (LASSIE), for estimating the fitness effects of all observed and potential single-nucleotide variants, based on polymorphism data and predictive genomic features. We applied LASSIE to 51 high-coverage genome sequences annotated with 33 genomic features and constructed a map of allele-specific selection coefficients across all protein-coding sequences in the human genome. This map is generally consistent with previous inferences of the bulk distribution of fitness effects but reveals pervasive weak negative selection against synonymous mutations. In addition, the estimated selection coefficients are highly predictive of inherited pathogenic variants and cancer driver mutations, outperforming state-of-the-art variant prioritization methods. By contrasting our estimated model with ultrahigh coverage ExAC exome-sequencing data, we identified 1118 genes under unusually strong negative selection, which tend to be exclusively expressed in the central nervous system or associated with autism spectrum disorder, as well as 773 genes under unusually weak selection, which tend to be associated with metabolism. This combination of classical population genetic theory with modern machine-learning and large-scale genomic data is a powerful paradigm for the study of both human evolution and disease.
Affiliation(s)
- Yi-Fei Huang
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Adam Siepel
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
10
|
Johnsson M, Gaynor RC, Jenko J, Gorjanc G, de Koning DJ, Hickey JM. Removal of alleles by genome editing (RAGE) against deleterious load. Genet Sel Evol 2019; 51:14. [PMID: 30995904 PMCID: PMC6472060 DOI: 10.1186/s12711-019-0456-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 05/31/2018] [Accepted: 04/01/2019] [Indexed: 12/27/2022]
Abstract
BACKGROUND: In this paper, we simulate deleterious load in an animal breeding program, and compare the efficiency of genome editing and selection for decreasing it. Deleterious variants can be identified by bioinformatics screening methods that use sequence conservation and biological prior information about protein function. However, once deleterious variants have been identified, how can they be used in breeding? RESULTS: We simulated a closed animal breeding population that is subject to both natural selection against deleterious load and artificial selection for a quantitative trait representing the breeding goal. Deleterious load was polygenic and was due to either codominant or recessive variants. We compared strategies for removal of deleterious alleles by genome editing (RAGE) to selection against carriers. When deleterious variants were codominant, the best strategy for prioritizing variants was to prioritize low-frequency variants. When deleterious variants were recessive, the best strategy was to prioritize variants with an intermediate frequency. Selection against carriers was inefficient when variants were codominant, but comparable to editing one variant per sire when variants were recessive. CONCLUSIONS: Genome editing of deleterious alleles reduces deleterious load, but requires the simultaneous editing of multiple deleterious variants in the same sire to be effective when deleterious variants are recessive. In the short term, selection against carriers is a possible alternative to genome editing when variants are recessive. Our results suggest that, in the future, there is the potential to use RAGE against deleterious load in animal breeding.
Affiliation(s)
- Martin Johnsson
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG Scotland UK
  - Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Box 7023, 750 07 Uppsala, Sweden
- R. Chris Gaynor
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG Scotland UK
- Janez Jenko
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG Scotland UK
- Gregor Gorjanc
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG Scotland UK
- Dirk-Jan de Koning
  - Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Box 7023, 750 07 Uppsala, Sweden
- John M. Hickey
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG Scotland UK
|
11
|
Rivas MJ, Saura M, Pérez-Figueroa A, Panova M, Johansson T, André C, Caballero A, Rolán-Alvarez E, Johannesson K, Quesada H. Population genomics of parallel evolution in gene expression and gene sequence during ecological adaptation. Sci Rep 2018; 8:16147. [PMID: 30385764 PMCID: PMC6212547 DOI: 10.1038/s41598-018-33897-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/30/2018] [Accepted: 10/08/2018] [Indexed: 11/17/2022]
Abstract
Natural selection often produces parallel phenotypic changes in response to a similar adaptive challenge. However, the extent to which parallel gene expression differences and genomic divergence underlie parallel phenotypic traits and whether they are decoupled or not remains largely unexplored. We performed a population genomic study of parallel ecological adaptation among replicate ecotype pairs of the rough periwinkle (Littorina saxatilis) at a regional geographical scale (NW Spain). We show that genomic changes underlying parallel phenotypic divergence followed a complex pattern of both repeatable differences and of differences unique to specific ecotype pairs, in which parallel changes in expression or sequence are restricted to a limited set of genes. Yet, the majority of divergent genes were divergent either for gene expression or coding sequence, but not for both simultaneously. Overall, our findings suggest that divergent selection significantly contributed to the process of parallel molecular differentiation among ecotype pairs, and that changes in expression and gene sequence underlying phenotypic divergence could, at least to a certain extent, be considered decoupled processes.
Affiliation(s)
- María José Rivas
  - Departamento de Bioquímica, Genética e Inmunología, Universidad de Vigo, 36310, Vigo, Spain
- María Saura
  - Departamento de Bioquímica, Genética e Inmunología, Universidad de Vigo, 36310, Vigo, Spain
- Andrés Pérez-Figueroa
  - Departamento de Bioquímica, Genética e Inmunología, Universidad de Vigo, 36310, Vigo, Spain
- Marina Panova
  - Department of Marine Sciences, Tjärnö, University of Gothenburg, SE-452 96, Strömstad, Sweden
- Tomas Johansson
  - Department of Biology, University of Lund, SE-223 62, Lund, Sweden
- Carl André
  - Department of Marine Sciences, Tjärnö, University of Gothenburg, SE-452 96, Strömstad, Sweden
- Armando Caballero
  - Departamento de Bioquímica, Genética e Inmunología, Universidad de Vigo, 36310, Vigo, Spain
- Emilio Rolán-Alvarez
  - Departamento de Bioquímica, Genética e Inmunología, Universidad de Vigo, 36310, Vigo, Spain
- Kerstin Johannesson
  - Department of Marine Sciences, Tjärnö, University of Gothenburg, SE-452 96, Strömstad, Sweden
- Humberto Quesada
  - Departamento de Bioquímica, Genética e Inmunología, Universidad de Vigo, 36310, Vigo, Spain
|
12
|
Gul IS, Staal J, Hulpiau P, De Keuckelaere E, Kamm K, Deroo T, Sanders E, Staes K, Driege Y, Saeys Y, Beyaert R, Technau U, Schierwater B, van Roy F. GC Content of Early Metazoan Genes and Its Impact on Gene Expression Levels in Mammalian Cell Lines. Genome Biol Evol 2018; 10:909-917. [PMID: 29608715 PMCID: PMC5952964 DOI: 10.1093/gbe/evy040] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Accepted: 02/20/2018] [Indexed: 01/20/2023]
Abstract
With the genomes available for many animal clades, including the early-branching metazoans, one can readily study the functional conservation of genes across a diversity of animal lineages. Ectopic expression of an animal protein in, for instance, a mammalian cell line is a commonly used strategy in structure–function analysis. However, this can be problematic in the case of distantly related species. Here we analyzed the GC content of the coding sequences of basal animals and show its impact on gene expression levels in human cell lines, and, importantly, how this expression efficiency can be improved. Optimization of the GC3 content in the coding sequences of cadherin, alpha-catenin, and paracaspase of Trichoplax adhaerens dramatically increased the expression of these basal animal genes in human cell lines.
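GC3 content, the quantity optimized in this abstract, is the fraction of G or C bases at third codon positions. A minimal sketch of how it could be computed is given below; the function name `gc3_content` and the toy sequence are illustrative assumptions and are not taken from the paper. The sketch assumes an in-frame coding sequence and, for simplicity, does not exclude the stop codon.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    third_positions = seq[2:usable:3]  # every third base, starting at index 2
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Hypothetical toy sequence: ATG GCC AAA TTT GGC
toy_cds = "ATGGCCAAATTTGGC"
print(gc3_content(toy_cds))
```

Comparing this value before and after synonymous recoding is one simple way to check that a redesigned sequence actually shifted third-position composition.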
Affiliation(s)
- Ismail Sahin Gul
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Jens Staal
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Paco Hulpiau
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Evi De Keuckelaere
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Kai Kamm
  - Institut für Tierökologie und Zellbiologie (ITZ), Division of Ecology and Evolution, Stiftung Tieraerztliche Hochschule Hannover, Hannover, Germany
- Tom Deroo
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Ellen Sanders
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Katrien Staes
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Yasmine Driege
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Yvan Saeys
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
- Rudi Beyaert
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
- Ulrich Technau
  - Department of Molecular Evolution and Development, Faculty of Life Sciences, University of Vienna, Austria
- Bernd Schierwater
  - Institut für Tierökologie und Zellbiologie (ITZ), Division of Ecology and Evolution, Stiftung Tieraerztliche Hochschule Hannover, Hannover, Germany
- Frans van Roy
  - Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Belgium
|
13
|
Savisaar R, Hurst LD. Exonic splice regulation imposes strong selection at synonymous sites. Genome Res 2018; 28:1442-1454. [PMID: 30143596 PMCID: PMC6169883 DOI: 10.1101/gr.233999.117] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 01/11/2018] [Accepted: 07/31/2018] [Indexed: 01/17/2023]
Abstract
What proportion of coding sequence nucleotides have roles in splicing, and how strong is the selection that maintains them? Despite a large body of research into exonic splice regulatory signals, these questions have not been answered. This is because, to our knowledge, previous investigations have not explicitly disentangled the frequency of splice regulatory elements from the strength of the evolutionary constraint under which they evolve. Current data are consistent both with a scenario of weak and diffuse constraint, enveloping large swaths of sequence, as well as with well-defined pockets of strong purifying selection. In the former case, natural selection on exonic splice enhancers (ESEs) might primarily act as a slight modifier of codon usage bias. In the latter, mutations that disrupt ESEs are likely to have large fitness and, potentially, clinical effects. To distinguish between these scenarios, we used several different methods to determine the distribution of selection coefficients for new mutations within ESEs. The analyses converged to suggest that ∼15%-20% of fourfold degenerate sites are part of functional ESEs. Most of these sites are under strong evolutionary constraint. Therefore, exonic splice regulation does not simply impose a weak bias that gently nudges coding sequence evolution in a particular direction. Rather, the selection to preserve these motifs is a strong force that severely constrains the evolution of a substantial proportion of coding nucleotides. Thus synonymous mutations that disrupt ESEs should be considered as a potentially common cause of single-locus genetic disorders.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
|
14
|
Churbanov A, Abrahamyan L. Preventing Common Hereditary Disorders through Time-Separated Twinning. BIONANOSCIENCE 2018. [DOI: 10.1007/s12668-017-0488-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
15
|
Cavoto E, Neuenschwander S, Goudet J, Perrin N. Sex-antagonistic genes, XY recombination and feminized Y chromosomes. J Evol Biol 2018; 31:416-427. [DOI: 10.1111/jeb.13235] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/03/2017] [Revised: 12/18/2017] [Accepted: 12/20/2017] [Indexed: 01/20/2023]
Affiliation(s)
- E. Cavoto
  - Department of Ecology and Evolution; University of Lausanne; Lausanne Switzerland
- S. Neuenschwander
  - Department of Ecology and Evolution; University of Lausanne; Lausanne Switzerland
  - Vital-IT; Swiss Institute of Bioinformatics; Lausanne Switzerland
- J. Goudet
  - Department of Ecology and Evolution; University of Lausanne; Lausanne Switzerland
  - Swiss Institute of Bioinformatics; Lausanne Switzerland
- N. Perrin
  - Department of Ecology and Evolution; University of Lausanne; Lausanne Switzerland
|
16
|
Graur D. An Upper Limit on the Functional Fraction of the Human Genome. Genome Biol Evol 2017; 9:1880-1885. [PMID: 28854598 PMCID: PMC5570035 DOI: 10.1093/gbe/evx121] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Accepted: 07/06/2017] [Indexed: 12/13/2022]
Abstract
For the human population to maintain a constant size from generation to generation, an increase in fertility must compensate for the reduction in the mean fitness of the population caused, among others, by deleterious mutations. The required increase in fertility due to this mutational load depends on the number of sites in the genome that are functional, the mutation rate, and the fraction of deleterious mutations among all mutations in functional regions. These dependencies and the fact that there exists a maximum tolerable replacement level fertility can be used to put an upper limit on the fraction of the human genome that can be functional. Mutational load considerations lead to the conclusion that the functional fraction within the human genome cannot exceed 15%.
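The dependence described in this abstract can be illustrated with a deliberately simplified calculation. The sketch below is not the model used in the paper; it assumes the textbook approximation that mean fitness under multiplicative deleterious mutations is exp(-U), where U is the diploid deleterious mutation rate per generation, and that replacement requires two surviving offspring per couple. The function name and all parameter values are hypothetical placeholders.

```python
import math


def required_fertility(mu, genome_size, functional_fraction, deleterious_fraction):
    """Offspring per couple needed for constant population size (simplified sketch).

    mu: mutation rate per site per generation (assumed value supplied by caller)
    genome_size: haploid genome size in sites
    functional_fraction: fraction of sites that are functional
    deleterious_fraction: fraction of mutations at functional sites that are deleterious
    """
    # Diploid deleterious mutation rate per generation (two genome copies).
    u_del = 2 * mu * genome_size * functional_fraction * deleterious_fraction
    # Assumed approximation: mean fitness declines as exp(-U).
    mean_fitness = math.exp(-u_del)
    # Two offspring replace the parents, so fertility must scale as 2 / mean fitness.
    return 2.0 / mean_fitness


# Hypothetical parameter values, chosen only to show the trend.
for fraction in (0.05, 0.15, 0.50, 0.80):
    print(fraction, round(required_fertility(1e-8, 3e9, fraction, 0.1), 2))
```

Because required fertility grows exponentially with the functional fraction in this approximation, a ceiling on tolerable fertility translates directly into a ceiling on the functional fraction, which is the logic the abstract describes.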
Affiliation(s)
- Dan Graur
  - Department of Biology and Biochemistry, University of Houston, TX
|
17
|
Abstract
The idea that much of our genome is irrelevant to fitness (is not the product of positive natural selection at the organismal level) remains viable. Claims to the contrary, and specifically that the notion of "junk DNA" should be abandoned, are based on conflating meanings of the word "function". Recent estimates suggest that perhaps 90% of our DNA, though biochemically active, does not contribute to fitness in any sequence-dependent way, and possibly in no way at all. Comparisons to vertebrates with much larger and smaller genomes (the lungfish and the pufferfish) strongly align with such a conclusion, as they have done for the last half-century.
Affiliation(s)
- W Ford Doolittle
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Tyler D P Brunet
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of History and Philosophy of Science, University of Cambridge, Cambridge, UK
|
18
|
Savisaar R, Hurst LD. Estimating the prevalence of functional exonic splice regulatory information. Hum Genet 2017; 136:1059-1078. [PMID: 28405812 PMCID: PMC5602102 DOI: 10.1007/s00439-017-1798-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/26/2017] [Accepted: 04/04/2017] [Indexed: 12/14/2022]
Abstract
In addition to coding information, human exons contain sequences necessary for correct splicing. These elements are known to be under purifying selection and their disruption can cause disease. However, the density of functional exonic splicing information remains profoundly uncertain. Several groups have experimentally investigated how mutations at different exonic positions affect splicing. They have found splice information to be distributed widely in exons, with one estimate putting the proportion of splicing-relevant nucleotides at >90%. These results suggest that splicing could place a major pressure on exon evolution. However, analyses of sequence conservation have concluded that the need to preserve splice regulatory signals only slightly constrains exon evolution, with a resulting decrease in the average human rate of synonymous evolution of only 1–4%. Why do these two lines of research come to such different conclusions? Among other reasons, we suggest that the methods are measuring different things: one assays the density of sites that affect splicing, the other the density of sites whose effects on splicing are visible to selection. In addition, the experimental methods typically consider short exons, thereby enriching for nucleotides close to the splice junction, such sites being enriched for splice-control elements. By contrast, in part owing to correction for nucleotide composition biases and to the assumption that constraint only operates on exon ends, the conservation-based methods can be overly conservative.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
|
19
|
Dutoit L, Burri R, Nater A, Mugal CF, Ellegren H. Genomic distribution and estimation of nucleotide diversity in natural populations: perspectives from the collared flycatcher (Ficedula albicollis) genome. Mol Ecol Resour 2016; 17:586-597. [DOI: 10.1111/1755-0998.12602] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 02/18/2016] [Revised: 09/02/2016] [Accepted: 09/19/2016] [Indexed: 12/30/2022]
Affiliation(s)
- Ludovic Dutoit
  - Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Norbyvägen 18D SE-752 36 Uppsala Sweden
- Reto Burri
  - Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Norbyvägen 18D SE-752 36 Uppsala Sweden
- Alexander Nater
  - Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Norbyvägen 18D SE-752 36 Uppsala Sweden
- Carina F. Mugal
  - Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Norbyvägen 18D SE-752 36 Uppsala Sweden
- Hans Ellegren
  - Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Norbyvägen 18D SE-752 36 Uppsala Sweden
|
20
|
Gotea V, Gartner JJ, Qutob N, Elnitski L, Samuels Y. The functional relevance of somatic synonymous mutations in melanoma and other cancers. Pigment Cell Melanoma Res 2016; 28:673-84. [PMID: 26300548 DOI: 10.1111/pcmr.12413] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 04/04/2015] [Accepted: 08/19/2015] [Indexed: 01/07/2023]
Abstract
Recent technological advances in sequencing have flooded the field of cancer research with knowledge about somatic mutations for many different cancer types. Most cancer genomics studies focus on mutations that alter the amino acid sequence, ignoring the potential impact of synonymous mutations. However, accumulating experimental evidence has demonstrated clear consequences for gene function, leading to a widespread recognition of the functional role of synonymous mutations and their causal connection to various diseases. Here, we review the evidence supporting the direct impact of synonymous mutations on gene function via gene splicing; mRNA stability, folding, and translation; protein folding; and miRNA-based regulation of expression. These results highlight the functional contribution of synonymous mutations to oncogenesis and the need to further investigate their detection and prioritization for experimental assessment.
Affiliation(s)
- Valer Gotea
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA
- Jared J Gartner
  - Surgery Branch, National Cancer Institute, NIH, Bethesda, MD, USA
- Nouar Qutob
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Laura Elnitski
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, USA
- Yardena Samuels
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
|
21
|
Figuet E, Nabholz B, Bonneau M, Mas Carrio E, Nadachowska-Brzyska K, Ellegren H, Galtier N. Life History Traits, Protein Evolution, and the Nearly Neutral Theory in Amniotes. Mol Biol Evol 2016; 33:1517-27. [DOI: 10.1093/molbev/msw033] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Indexed: 12/14/2022]
|
22
|
Price N, Graur D. Are Synonymous Sites in Primates and Rodents Functionally Constrained? J Mol Evol 2015; 82:51-64. [PMID: 26563252 DOI: 10.1007/s00239-015-9719-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/07/2015] [Accepted: 11/04/2015] [Indexed: 11/28/2022]
Abstract
It has been claimed that synonymous sites in mammals are under selective constraint. Furthermore, in many studies the selective constraint at such sites in primates was claimed to be more stringent than that in rodents. Given the larger effective population sizes in rodents than in primates, the theoretical expectation is that selection in rodents would be more effective than that in primates. To resolve this contradiction between expectations and observations, we used processed pseudogenes as a model for strict neutral evolution, and estimated selective constraint on synonymous sites using the rate of substitution at pseudosynonymous and pseudononsynonymous sites in pseudogenes as the neutral expectation. After controlling for the effects of GC content, our results were similar to those from previous studies, i.e., synonymous sites in primates exhibited evidence for higher selective constraint than those in rodents. Specifically, our results indicated that in primates up to 24% of synonymous sites could be under purifying selection, while in rodents synonymous sites evolved neutrally. To further control for shifts in GC content, we estimated selective constraint at fourfold degenerate sites using a maximum parsimony approach. This allowed us to estimate selective constraint using mutational patterns that cause a shift in GC content (GT ↔ TG, CT ↔ TC, GA ↔ AG, and CA ↔ AC) and ones that do not (AT ↔ TA and CG ↔ GC). Using this approach, we found that synonymous sites evolve neutrally in both primates and rodents. Apparent deviations from neutrality were caused by a higher rate of C → A and C → T mutations in pseudogenes. Such differences are most likely caused by the shift in GC content experienced by pseudogenes. We conclude that previous estimates according to which 20-40% of synonymous sites in primates were under selective constraint were most likely artifacts of the biased pattern of mutation.
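Fourfold degenerate sites, which this abstract uses to measure constraint, are third codon positions at which any of the four bases encodes the same amino acid. The sketch below shows one way to locate them under the standard genetic code; it is an illustration only, not the authors' pipeline, and the function names and toy sequence are assumptions introduced here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered T, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fourfold_prefixes():
    """Two-base codon prefixes whose third position is fourfold degenerate."""
    prefixes = []
    for first, second in product(BASES, repeat=2):
        encoded = {CODON_TABLE[first + second + third] for third in BASES}
        if len(encoded) == 1 and "*" not in encoded:
            prefixes.append(first + second)
    return prefixes


def fourfold_site_indices(cds):
    """0-based indices of fourfold degenerate third positions in an in-frame CDS."""
    seq = cds.upper()
    fourfold = set(fourfold_prefixes())
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [i + 2 for i in range(0, usable, 3) if seq[i:i + 2] in fourfold]


print(fourfold_prefixes())
print(fourfold_site_indices("ATGGCCAAATTTGGC"))  # hypothetical toy sequence
```

Restricting an analysis to these positions means every possible substitution is synonymous, which is why such sites are a common proxy for synonymous evolution.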
Affiliation(s)
- Nicholas Price
  - Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO, 80523, USA
- Dan Graur
  - Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
|
23
|
Weber CC, Nabholz B, Romiguier J, Ellegren H. Kr/Kc but not dN/dS correlates positively with body mass in birds, raising implications for inferring lineage-specific selection. Genome Biol 2015; 15:542. [PMID: 25607475 PMCID: PMC4264323 DOI: 10.1186/s13059-014-0542-8] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 05/30/2014] [Accepted: 11/13/2014] [Indexed: 02/02/2023]
Abstract
Background: The ratio of the rates of non-synonymous and synonymous substitution (dN/dS) is commonly used to estimate selection in coding sequences. It is often suggested that, all else being equal, dN/dS should be lower in populations with large effective size (Ne) due to increased efficacy of purifying selection. As Ne is difficult to measure directly, life history traits such as body mass, which is typically negatively associated with population size, have commonly been used as proxies in empirical tests of this hypothesis. However, evidence of whether the expected positive correlation between body mass and dN/dS is consistently observed is conflicting. Results: Employing whole genome sequence data from 48 avian species, we assess the relationship between rates of molecular evolution and life history in birds. We find a negative correlation between dN/dS and body mass, contrary to nearly neutral expectation. This raises the question whether the correlation might be a method artefact. We therefore in turn consider non-stationary base composition, divergence time and saturation as possible explanations, but find no clear patterns. However, in striking contrast to dN/dS, the ratio of radical to conservative amino acid substitutions (Kr/Kc) correlates positively with body mass. Conclusions: Our results in principle accord with the notion that non-synonymous substitutions causing radical amino acid changes are more efficiently removed by selection in large populations, consistent with nearly neutral theory. These findings have implications for the use of dN/dS and suggest that caution is warranted when drawing conclusions about lineage-specific modes of protein evolution using this metric.
|
24
|
Wiberg RAW, Halligan DL, Ness RW, Necsulea A, Kaessmann H, Keightley PD. Assessing Recent Selection and Functionality at Long Noncoding RNA Loci in the Mouse Genome. Genome Biol Evol 2015; 7:2432-44. [PMID: 26272717 PMCID: PMC4558870 DOI: 10.1093/gbe/evv155] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Accepted: 08/03/2015] [Indexed: 12/27/2022]
Abstract
Long noncoding RNAs (lncRNAs) are one of the most intensively studied groups of noncoding elements. Debate continues over what proportion of lncRNAs are functional or merely represent transcriptional noise. Although characterization of individual lncRNAs has identified approximately 200 functional loci across the Eukarya, general surveys have found only modest or no evidence of long-term evolutionary conservation. Although this lack of conservation suggests that most lncRNAs are nonfunctional, the possibility remains that some represent recent evolutionary innovations. We examine recent selection pressures acting on lncRNAs in mouse populations. We compare patterns of within-species nucleotide variation at approximately 10,000 lncRNA loci in a cohort of the wild house mouse, Mus musculus castaneus, with between-species nucleotide divergence from the rat (Rattus norvegicus). Loci under selective constraint are expected to show reduced nucleotide diversity and divergence. We find limited evidence of sequence conservation compared with putatively neutrally evolving ancestral repeats (ARs). Comparisons of sequence diversity and divergence between ARs, protein-coding (PC) exons and lncRNAs, and the associated flanking regions, show weak, but significantly lower levels of sequence diversity and divergence at lncRNAs compared with ARs. lncRNAs conserved deep in the vertebrate phylogeny show lower within-species sequence diversity than lncRNAs in general. A set of 74 functionally characterized lncRNAs show levels of diversity and divergence comparable to PC exons, suggesting that these lncRNAs are under substantial selective constraints. Our results suggest that, in mouse populations, most lncRNA loci evolve at rates similar to ARs, whereas older lncRNAs tend to show signals of selection similar to PC genes.
Affiliation(s)
- R Axel W Wiberg
  - Institute of Evolutionary Biology, University of Edinburgh, United Kingdom; Present address: Centre for Biological Diversity, School of Biology, University of St. Andrews, United Kingdom
- Daniel L Halligan
  - Institute of Evolutionary Biology, University of Edinburgh, United Kingdom
- Rob W Ness
  - Institute of Evolutionary Biology, University of Edinburgh, United Kingdom
- Anamaria Necsulea
  - School of Life Sciences, Ecole Polytechnique Fédérale Lausanne, Lausanne, Switzerland
- Henrik Kaessmann
  - Center for Integrative Genomics, University of Lausanne, Switzerland
- Peter D Keightley
  - Institute of Evolutionary Biology, University of Edinburgh, United Kingdom
25
Uchimura A, Higuchi M, Minakuchi Y, Ohno M, Toyoda A, Fujiyama A, Miura I, Wakana S, Nishino J, Yagi T. Germline mutation rates and the long-term phenotypic effects of mutation accumulation in wild-type laboratory mice and mutator mice. Genome Res 2015; 25:1125-34. [PMID: 26129709 PMCID: PMC4509997 DOI: 10.1101/gr.186148.114] [Citation(s) in RCA: 109] [Impact Index Per Article: 12.1] [Received: 10/21/2014] [Accepted: 05/30/2015] [Indexed: 12/19/2022]
Abstract
The germline mutation rate is an important parameter that affects the amount of genetic variation and the rate of evolution. However, neither the rate of germline mutations in laboratory mice nor the biological significance of the mutation rate in mammalian populations is clear. Here we studied genome-wide mutation rates and the long-term effects of mutation accumulation on phenotype in more than 20 generations of wild-type C57BL/6 mice and mutator mice, which have high DNA replication error rates. We estimated the base-substitution mutation rate to be $5.4 \times 10^{-9}$ (95% confidence interval = $4.6 \times 10^{-9}$ to $6.5 \times 10^{-9}$) per nucleotide per generation in C57BL/6 laboratory mice, about half the rate reported in humans. The mutation rate in mutator mice was 17 times that in wild-type mice. Abnormal phenotypes were 4.1-fold more frequent in the mutator lines than in the wild-type lines. After several generations, the mutator mice reproduced at substantially lower rates than the controls, exhibiting low pregnancy rates, lower survival rates, and smaller litter sizes, and many of the breeding lines died out. These results provide fundamental information about mouse genetics and reveal the impact of germline mutation rates on phenotypes in a mammalian population.
Affiliation(s)
- Arikuni Uchimura
  - KOKORO-Biology Group, Laboratories for Integrated Biology, Graduate School of Frontier Biosciences, Osaka University, Suita 565-0871, Japan
- Mayumi Higuchi
  - KOKORO-Biology Group, Laboratories for Integrated Biology, Graduate School of Frontier Biosciences, Osaka University, Suita 565-0871, Japan
- Yohei Minakuchi
  - Comparative Genomics Laboratory, National Institute of Genetics, Mishima 411-8540, Japan
- Mizuki Ohno
  - Department of Medical Biophysics and Radiation Biology, Faculty of Medical Sciences, Kyushu University, Fukuoka 812-8582, Japan
- Atsushi Toyoda
  - Comparative Genomics Laboratory, National Institute of Genetics, Mishima 411-8540, Japan
- Asao Fujiyama
  - Comparative Genomics Laboratory, National Institute of Genetics, Mishima 411-8540, Japan
- Ikuo Miura
  - Technology and Development Team for Mouse Phenotype Analysis, Japan Mouse Clinic, RIKEN BioResource Center, Tsukuba 305-0074, Japan
- Shigeharu Wakana
  - Technology and Development Team for Mouse Phenotype Analysis, Japan Mouse Clinic, RIKEN BioResource Center, Tsukuba 305-0074, Japan
- Jo Nishino
  - Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya 466-8550, Japan
- Takeshi Yagi
  - KOKORO-Biology Group, Laboratories for Integrated Biology, Graduate School of Frontier Biosciences, Osaka University, Suita 565-0871, Japan
26
Koufopanou V, Lomas S, Tsai IJ, Burt A. Estimating the Fitness Effects of New Mutations in the Wild Yeast Saccharomyces paradoxus. Genome Biol Evol 2015; 7:1887-95. [PMID: 26085542 PMCID: PMC4524479 DOI: 10.1093/gbe/evv112] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022] Open
Abstract
The nature of selection acting on a population is in large measure determined by the distribution of fitness effects of new mutations. In this study, we use DNA sequences from four closely related clades of Saccharomyces paradoxus and Saccharomyces cerevisiae to identify and polarize new mutations and estimate their fitness effects. By progressively restricting the analyses to narrower categories of sites, we further seek to characterize sites with predictable mutational effects, that is, unconditionally deleterious, neutral or beneficial. Consistent with previous studies on S. paradoxus, we have failed to find evidence for mutations with beneficial effects, even in regions that were divergent in two outgroup clades, perhaps a consequence of the relatively unchallenged, predominantly asexual and highly inbred lifestyle of this species. On the other hand, there is abundant evidence of deleterious mutations, varying in severity of effect from strongly deleterious to very mild, particularly in regions conserved in the outgroup taxa, indicating a history of persistent purifying selection. Narrowing the analysis down to individual amino acids reduces further the range of effects: for example, mutations changing cysteine are predicted to be nearly always strongly deleterious, whereas those changing arginine, serine, and tyrosine are expected to be nearly neutral. The proportion of mutations with deleterious effects for a particular amino acid is correlated with long-term stasis of that amino acid among highly divergent sequences from a variety of organisms, showing that functionality of sites tends to persist through the diversification of clades and that our findings are also relevant to longer evolutionary times and other taxa.
Affiliation(s)
- Vassiliki Koufopanou
  - Department of Life Sciences, Imperial College London, Silwood Park, Ascot, Berks, United Kingdom
- Susan Lomas
  - Department of Life Sciences, Imperial College London, Silwood Park, Ascot, Berks, United Kingdom
- Isheng J Tsai
  - Department of Life Sciences, Imperial College London, Silwood Park, Ascot, Berks, United Kingdom; Present address: Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Austin Burt
  - Department of Life Sciences, Imperial College London, Silwood Park, Ascot, Berks, United Kingdom
27
Hayward AD, Lummaa V, Bazykin GA. Fitness Consequences of Advanced Ancestral Age over Three Generations in Humans. PLoS One 2015; 10:e0128197. [PMID: 26030274 PMCID: PMC4451146 DOI: 10.1371/journal.pone.0128197] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/13/2014] [Accepted: 04/24/2015] [Indexed: 11/18/2022] Open
Abstract
A rapid rise in age at parenthood in contemporary societies has increased interest in reports of higher prevalence of de novo mutations and health problems in individuals with older fathers, but the fitness consequences of such age effects over several generations remain untested. Here, we use extensive pedigree data on seven pre-industrial Finnish populations to show how the ages of ancestors for up to three generations are associated with fitness traits. Individuals whose fathers, grandfathers and great-grandfathers fathered their lineage on average under age 30 were ~13% more likely to survive to adulthood than those whose ancestors fathered their lineage at over 40 years. In addition, females had a lower probability of marriage if their male ancestors were older. These findings are consistent with an increase of the number of accumulated de novo mutations with male age, suggesting that deleterious mutations acquired from recent ancestors may be a substantial burden to fitness in humans. However, possible non-mutational explanations for the observed associations are also discussed.
Affiliation(s)
- Adam D Hayward
  - Department of Animal and Plant Sciences, Alfred Denny Building, University of Sheffield, Western Bank, Sheffield, S10 2TN, United Kingdom; Institute of Evolutionary Biology, University of Edinburgh, Charlotte Auerbach Road, Edinburgh, EH9 3FL, United Kingdom
- Virpi Lummaa
  - Department of Animal and Plant Sciences, Alfred Denny Building, University of Sheffield, Western Bank, Sheffield, S10 2TN, United Kingdom
- Georgii A Bazykin
  - Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), Bolshoy Karetny pereulok 19, Moscow, 127994, Russia; Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Vorbyevy Gory 1-73, Moscow, 119992, Russia; Belozersky Institute for Physical and Chemical Biology, Lomonosov Moscow State University, Vorbyevy Gory 1-40, Moscow, 119992, Russia; Pirogov Russian National Research Medical University, Ul. Ostrovityanova 1, Moscow, 117997, Russia
28
Wu X, Hurst LD. Why Selection Might Be Stronger When Populations Are Small: Intron Size and Density Predict within and between-Species Usage of Exonic Splice Associated cis-Motifs. Mol Biol Evol 2015; 32:1847-61. [PMID: 25771198 PMCID: PMC4476162 DOI: 10.1093/molbev/msv069] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 01/26/2023] Open
Abstract
The nearly neutral theory predicts that small effective population size provides the conditions for weakened selection. This is postulated to explain why our genome is more “bloated” than that of, for example, yeast, ours having large introns and large intergene spacer. If a bloated genome is also an error prone genome might it, however, be the case that selection for error-mitigating properties is stronger in our genome? We examine this notion using splicing as an exemplar, not least because large introns can predispose to noisy splicing. We thus ask whether, owing to genomic decay, selection for splice error-control mechanisms is stronger, not weaker, in species with large introns and small populations. In humans much information defining splice sites is in cis-exonic motifs, most notably exonic splice enhancers (ESEs). These act as splice-error control elements. Here then we ask whether within and between-species intron size is a predictor of the commonality of exonic cis-splicing motifs. We show that, as predicted, the proportion of synonymous sites that are ESE-associated and under selection in humans is weakly positively correlated with the size of the flanking intron. In a phylogenetically controlled framework, we observe, also as expected, that mean intron size is both predicted by Ne.μ and is a good predictor of cis-motif usage across species, this usage coevolving with splice site definition. Unexpectedly, however, across taxa intron density is a better predictor of cis-motif usage than intron size. We propose that selection for splice-related motifs is driven by a need to avoid decoy splice sites that will be more common in genes with many and large introns. That intron number and density predict ESE usage within human genes is consistent with this, as is the finding of intragenic heterogeneity in ESE density. 
As intronic content and splice site usage across species is also well predicted by Ne.μ, the result also suggests an unusual circumstance in which selection (for cis-modifiers of splicing) might be stronger when population sizes are smaller, as here splicing is noisier, resulting in a greater need to control error-prone splicing.
Affiliation(s)
- XianMing Wu
  - Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
- Laurence D Hurst
  - Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
29
Abstract
Under the traditional mutation load model based on multiplicative fitness effects, the load in a population is $1 - e^{-U}$, where U is the genomic deleterious mutation rate. Because this load becomes high under large U, synergistic epistasis has been proposed as one possible means of reducing the load. However, experiments on model organisms attempting to detect synergistic epistasis have often focused on a quadratic fitness model, with the resulting general conclusion being that epistasis is neither common nor strong. Here, I present a model of additive fitness effects and show that, unlike multiplicative effects, the equilibrium frequency of an allele under additivity is dependent on the average absolute fitness of the population. The additive model then results in a load of $U/(U+1)$, which is much lower than $1 - e^{-U}$ for large U. Numerical iterations demonstrate that this analytic derivation holds as a good approximation under biologically relevant values of selection coefficients and U. Additionally, regressions onto Drosophila mutation accumulation data suggest that the common method of inferring epistasis by detecting large quadratic terms from regressions is not always necessary, as the additive model fits the data well and results in synergistic epistasis. Furthermore, the additive model gives a much larger reduction in load than the quadratic model when predicted from the same data, indicating that it is important to consider this additive model in addition to the quadratic model when inferring epistasis from mutation accumulation data.
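To make the two load expressions in this abstract concrete, the following minimal Python sketch evaluates both for a few values of U. The function names and the chosen values of U are illustrative assumptions, not taken from the paper; only the two closed-form expressions come from the abstract.

```python
import math


def multiplicative_load(U: float) -> float:
    """Load under multiplicative fitness effects: 1 - exp(-U)."""
    return 1.0 - math.exp(-U)


def additive_load(U: float) -> float:
    """Load under the additive model stated in the abstract: U / (U + 1)."""
    return U / (U + 1.0)


if __name__ == "__main__":
    # Illustrative values of the genomic deleterious mutation rate U (assumed).
    for U in (0.1, 1.0, 2.0, 5.0, 10.0):
        mult = multiplicative_load(U)
        add = additive_load(U)
        # Mean fitness relative to a mutation-free genotype is 1 - load.
        print(f"U={U:5.1f}  multiplicative load={mult:.4f}  additive load={add:.4f}  "
              f"mean fitness: {1 - mult:.4f} vs {1 - add:.4f}")
```

Because exp(U) > 1 + U for every U > 0, the additive expression is always the smaller of the two. The gap is easiest to see in mean fitness (one minus load), which decays exponentially in U under the multiplicative model but only as 1/(U + 1) under the additive one.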
30
Causes of natural variation in fitness: evidence from studies of Drosophila populations. Proc Natl Acad Sci U S A 2015; 112:1662-9. [PMID: 25572964 DOI: 10.1073/pnas.1423275112] [Citation(s) in RCA: 133] [Impact Index Per Article: 14.8] [Indexed: 11/18/2022] Open
Abstract
DNA sequencing has revealed high levels of variability within most species. Statistical methods based on population genetics theory have been applied to the resulting data and suggest that most mutations affecting functionally important sequences are deleterious but subject to very weak selection. Quantitative genetic studies have provided information on the extent of genetic variation within populations in traits related to fitness and the rate at which variability in these traits arises by mutation. This paper attempts to combine the available information from applications of the two approaches to populations of the fruitfly Drosophila in order to estimate some important parameters of genetic variation, using a simple population genetics model of mutational effects on fitness components. Analyses based on this model suggest the existence of a class of mutations with much larger fitness effects than those inferred from sequence variability and that contribute most of the standing variation in fitness within a population caused by the input of mildly deleterious mutations. However, deleterious mutations explain only part of this standing variation, and other processes such as balancing selection appear to make a large contribution to genetic variation in fitness components in Drosophila.
31
Furusawa M. The disparity mutagenesis model predicts rescue of living things from catastrophic errors. Front Genet 2014; 5:421. [PMID: 25538731 PMCID: PMC4255596 DOI: 10.3389/fgene.2014.00421] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/20/2014] [Accepted: 11/17/2014] [Indexed: 01/24/2023] Open
Abstract
In animals including humans, mutation rates per generation exceed a perceived threshold, and excess mutations increase genetic load. Despite this, animals have survived without extinction. This is a perplexing problem for animal and human genetics, arising at the end of the last century, and to date still does not have a fully satisfactory explanation. Shortly after we proposed the disparity theory of evolution in 1992, the disparity mutagenesis model was proposed, which forms the basis for an explanation for an acceleration of evolution and species survival. This model predicts a significant increase of the mutation threshold values if the fidelity difference in replication between the lagging and leading strands is high enough. When applied to biological evolution, the model predicts that living things, including humans, might overcome the lethal effect of accumulated deleterious mutations and be able to survive. Artificially derived mutator strains of microorganisms, in which an enhanced lagging-strand-biased mutagenesis was introduced, showed unexpectedly high adaptability to severe environments. The implications of the striking behaviors shown by these disparity mutators will be discussed in relation to how living things with high mutation rates can avoid the self-defeating risk of excess mutations.
32
Siepel A, Arbiza L. Cis-regulatory elements and human evolution. Curr Opin Genet Dev 2014; 29:81-9. [PMID: 25218861 PMCID: PMC4258466 DOI: 10.1016/j.gde.2014.08.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 05/28/2014] [Revised: 08/17/2014] [Accepted: 08/23/2014] [Indexed: 11/20/2022]
Abstract
Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider 'new frontiers' in this field stemming from recent research on transcriptional regulation.
Affiliation(s)
- Adam Siepel
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Leonardo Arbiza
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
33
Kessler MD, Dean MD. Effective population size does not predict codon usage bias in mammals. Ecol Evol 2014; 4:3887-900. [PMID: 25505518 PMCID: PMC4242573 DOI: 10.1002/ece3.1249] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 07/22/2014] [Revised: 08/04/2014] [Accepted: 08/07/2014] [Indexed: 12/20/2022] Open
Abstract
Synonymous codons are not used at equal frequency throughout the genome, a phenomenon termed codon usage bias (CUB). It is often assumed that interspecific variation in the intensity of CUB is related to species differences in effective population sizes (Ne), with selection on CUB operating less efficiently in species with small Ne. Here, we specifically ask whether variation in Ne predicts differences in CUB in mammals and report two main findings. First, across 41 mammalian genomes, CUB was not correlated with two indirect proxies of Ne (body mass and generation time), even though there was statistically significant evidence of selection shaping CUB across all species. Interestingly, autosomal genes showed higher codon usage bias compared to X-linked genes, and high-recombination genes showed higher codon usage bias compared to low recombination genes, suggesting intraspecific variation in Ne predicts variation in CUB. Second, across six mammalian species with genetic estimates of Ne (human, chimpanzee, rabbit, and three mouse species: Mus musculus, M. domesticus, and M. castaneus), Ne and CUB were weakly and inconsistently correlated. At least in mammals, interspecific divergence in Ne does not strongly predict variation in CUB. One hypothesis is that each species responds to a unique distribution of selection coefficients, confounding any straightforward link between Ne and CUB.
Affiliation(s)
- Michael D Kessler
  - Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, California, 90089
- Matthew D Dean
  - Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, California, 90089
34
Abstract
Some species exhibit very high levels of DNA sequence variability; there is also evidence for the existence of heritable epigenetic variants that experience state changes at a much higher rate than sequence variants. In both cases, the resulting high diversity levels within a population (hyperdiversity) mean that standard population genetics methods are not trustworthy. We analyze a population genetics model that incorporates purifying selection, reversible mutations, and genetic drift, assuming a stationary population size. We derive analytical results for both population parameters and sample statistics and discuss their implications for studies of natural genetic and epigenetic variation. In particular, we find that (1) many more intermediate-frequency variants are expected than under standard models, even with moderately strong purifying selection, and (2) rates of evolution under purifying selection may be close to, or even exceed, neutral rates. These findings are related to empirical studies of sequence and epigenetic variation.
35
How much of the variation in the mutation rate along the human genome can be explained? G3-GENES GENOMES GENETICS 2014; 4:1667-70. [PMID: 24996580 PMCID: PMC4169158 DOI: 10.1534/g3.114.012849] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Abstract
It has been claimed recently that it may be possible to predict the rate of de novo mutation of each site in the human genome with a high degree of accuracy [Michaelson et al. (2012), Cell 151: 1431−1442]. We show that this claim is unwarranted. By considering the correlation between the rate of de novo mutation and the predictions from the model of Michaelson et al., we show there could be substantial unexplained variance in the mutation rate. We investigate whether the model of Michaelson et al. captures variation at the single nucleotide level that is not due to simple context. We show that the model captures a substantial fraction of this variation at CpG dinucleotides but fails to explain much of the variation at non-CpG sites.
36
Abstract
A major biomedical advance from recent years was the finding that gene expression and phenotypic traits may be shaped by potentially reversible and heritable modifications that occur without altering the sequence of the nucleotides, and became known as epigenetic changes. The term 'epigenetics' dates back to the 1940s, when it was first used in context of cellular differentiation decisions that are made during development. Since then, our understanding of epigenetic modifications that govern development and disease expanded considerably. The contribution of epigenetic changes to shaping phenotypes brings at least two major clinically relevant benefits. One of these, stemming from the reversibility of epigenetic changes, involves the possibility to therapeutically revert epigenetic marks to re-establish prior gene expression patterns. The strength and the potential of this strategy are illustrated by the first four epigenetic drugs that were approved in recent years and by the additional candidates that are at various stages in preclinical studies and clinical trials. The second particularity is the finding that epigenetic changes precede the appearance of histopathological modifications. This has the potential to facilitate the emergence of epigenetic biomarkers, some of which already entered the clinical arena, catalysing a major shift in prophylactic and therapeutic strategies, and promising to fill a decades-old gap in preventive medicine.
Affiliation(s)
- R A Stein
  - Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY, USA
37
Hunt RC, Simhadri VL, Iandoli M, Sauna ZE, Kimchi-Sarfaty C. Exposing synonymous mutations. Trends Genet 2014; 30:308-21. [PMID: 24954581 DOI: 10.1016/j.tig.2014.04.006] [Citation(s) in RCA: 231] [Impact Index Per Article: 23.1] [Received: 12/18/2013] [Revised: 04/16/2014] [Accepted: 04/17/2014] [Indexed: 12/12/2022]
Abstract
Synonymous codon changes, which do not alter protein sequence, were previously thought to have no functional consequence. Although this concept has been overturned in recent years, there is no unique mechanism by which these changes exert biological effects. A large repertoire of both experimental and bioinformatic methods has been developed to understand the effects of synonymous variants. Results from this body of work have provided global insights into how biological systems exploit the degeneracy of the genetic code to control gene expression, protein folding efficiency, and the coordinated expression of functionally related gene families. Although it is now clear that synonymous variants are important in a variety of contexts, from human disease to the safety and efficacy of therapeutic proteins, there is no clear consensus on the approaches to identify and validate these changes. Here, we review the diverse methods to understand the effects of synonymous mutations.
Affiliation(s)
- Ryan C Hunt
  - Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
- Vijaya L Simhadri
  - Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
- Matthew Iandoli
  - Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
- Zuben E Sauna
  - Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
- Chava Kimchi-Sarfaty
  - Division of Hematology, Center for Biologics Evaluation and Research, Food and Drug Administration, Bethesda, MD, USA
38
Affiliation(s)
- Alexander F. Palazzo
  - University of Toronto, Department of Biochemistry, Toronto, Ontario, Canada
- T. Ryan Gregory
  - University of Guelph, Department of Integrative Biology, Guelph, Ontario, Canada
39
Abstract
Evolutionary conservation has been an accurate predictor of functional elements across the first decade of metazoan genomics. More recently, there has been a move to define functional elements instead from biochemical annotations. Evolutionary methods are, however, more comprehensive than biochemical approaches can be and can assess quantitatively, especially for subtle effects, how biologically important--how injurious after mutation--different types of elements are. Evolutionary methods are thus critical for understanding the large fraction (up to 10%) of the human genome that does not encode proteins and yet might convey function. These methods can also capture the ephemeral nature of much noncoding functional sequence, with large numbers of functional elements having been gained and lost rapidly along each mammalian lineage. Here, we review how different strengths of purifying selection have impacted on protein-coding and non-protein-coding loci and on transcription factor binding sites in mammalian and fruit fly genomes.
Affiliation(s)
- Wilfried Haerty
  - MRC Functional Genomics Unit, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford OX1 3PT, United Kingdom
40
Implications of human genome structural heterogeneity: functionally related genes tend to reside in organizationally similar genomic regions. BMC Genomics 2014; 15:252. [PMID: 24684786 PMCID: PMC4234528 DOI: 10.1186/1471-2164-15-252] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/15/2012] [Accepted: 03/21/2014] [Indexed: 01/30/2023] Open
Abstract
BACKGROUND In an earlier study, we hypothesized that genomic segments with different sequence organization patterns (OPs) might display functional specificity despite their similar GC content. Here we tested this hypothesis by dividing the human genome into 100 kb segments, classifying these segments into five compositional groups according to GC content, and then characterizing each segment within the five groups by oligonucleotide counting (k-mer analysis; also referred to as compositional spectrum analysis, or CSA), to examine the distribution of sequence OPs in the segments. We performed the CSA on the entire DNA, i.e., its coding and non-coding parts, the latter being much more abundant in the genome than the former. RESULTS We identified 38 OP-type clusters of segments that differ in their compositional spectrum (CS) organization. Many of the segments that shared the same OP type were enriched with genes related to the same biological processes (developmental, signaling, etc.), components of biochemical complexes, or organelles. Thirteen OP-type clusters showed significant enrichment in genes connected to specific gene-ontology terms. Some of these clusters seemed to reflect certain events during periods of horizontal gene transfer and genome expansion, and subsequent evolution of genomic regions requiring coordinated regulation. CONCLUSIONS There may be a tendency for genes that are involved in the same biological process, complex or organelle to use the same OP, even at a distance of ~100 kb from the genes. Although the intergenic DNA is non-coding, the general pattern of sequence organization (e.g., reflected in over-represented oligonucleotide “words”) may be important and was protected, to some extent, in the course of evolution.
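The segmentation and oligonucleotide-counting steps described in this abstract can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the helper names, the toy sequence, the segment size of 4 and k = 2 are hypothetical choices for readability. The abstract specifies 100 kb segments and five GC-content groups but does not state the value of k or the clustering method, so neither is reproduced here.

```python
from collections import Counter


def split_segments(sequence: str, size: int) -> list[str]:
    """Cut a sequence into consecutive non-overlapping segments of a fixed size.

    A trailing remainder shorter than `size` is dropped.
    """
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, size)]


def gc_content(segment: str) -> float:
    """Fraction of G and C among the unambiguous (A, C, G, T) bases of a segment."""
    s = segment.upper()
    acgt = sum(s.count(base) for base in "ACGT")
    return (s.count("G") + s.count("C")) / acgt if acgt else 0.0


def kmer_spectrum(segment: str, k: int) -> Counter:
    """Count overlapping k-mers, skipping windows that contain non-ACGT characters."""
    s = segment.upper()
    counts: Counter = Counter()
    for i in range(len(s) - k + 1):
        word = s[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


if __name__ == "__main__":
    toy = "ACGTACGTGGCC"  # hypothetical toy sequence, not real genomic data
    for seg in split_segments(toy, size=4):
        print(seg, f"GC={gc_content(seg):.2f}", dict(kmer_spectrum(seg, k=2)))
```

In a fuller analysis, each segment's k-mer counts would form one row of a feature matrix, with segments first binned by GC content and then compared within each bin.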
41
Abstract
The causes of the large effect of the X chromosome in reproductive isolation and speciation have long been debated. The faster-X hypothesis predicts that X-linked loci are expected to have higher rates of adaptive evolution than autosomal loci if new beneficial mutations are on average recessive. Reproductive isolation should therefore evolve faster when contributing loci are located on the X chromosome. In this study, we have analyzed genome-wide nucleotide polymorphism data from the house mouse subspecies Mus musculus castaneus and nucleotide divergence from Mus famulus and Rattus norvegicus to compare rates of adaptive evolution for autosomal and X-linked protein-coding genes. We found significantly faster adaptive evolution for X-linked loci, particularly for genes with expression in male-specific tissues, but autosomal and X-linked genes with expression in female-specific tissues evolve at similar rates. We also estimated rates of adaptive evolution for genes expressed during spermatogenesis and found that X-linked genes that escape meiotic sex chromosome inactivation (MSCI) show rapid adaptive evolution. Our results suggest that faster-X adaptive evolution is either due to net recessivity of new advantageous mutations or due to a special gene content of the X chromosome, which regulates male function and spermatogenesis. We discuss how our results help to explain the large effect of the X chromosome in speciation.
|
42
|
Cáceres EF, Hurst LD. The evolution, impact and properties of exonic splice enhancers. Genome Biol 2013; 14:R143. [PMID: 24359918 PMCID: PMC4054783 DOI: 10.1186/gb-2013-14-12-r143] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 08/05/2013] [Accepted: 12/20/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: In humans, much of the information specifying splice sites is not at the splice site. Exonic splice enhancers are one of the principal non-splice site motifs. Four high-throughput studies have provided a compendium of motifs that function as exonic splice enhancers, but only one, RESCUE-ESE, has been generally employed to examine the properties of enhancers. Here we consider these four datasets to ask whether there is any consensus on the properties and impacts of exonic splice enhancers. RESULTS: While only about 1% of all the identified hexamer motifs are common to all analyses, we can define reasonably sized sets that are found in most datasets. We presume that these consensus intersection datasets reflect the true properties of exonic splice enhancers. Given prior evidence for the properties of enhancers and splice-associated mutations, we ask for all datasets whether the exonic splice enhancers considered are purine enriched; enriched near exon boundaries; able to predict trends in relative codon usage; slow evolving at synonymous sites; rare in SNPs; associated with weak splice sites; and enriched near longer introns. While the intersect datasets match expectations, only one original dataset, RESCUE-ESE, does. Unexpectedly, a fully experimental dataset identifies motifs that commonly behave opposite to the consensus, for example, being enriched in exon cores where splice-associated mutations are rare. CONCLUSIONS: Prior analyses that used the RESCUE-ESE set of hexamers captured the properties of consensus exonic splice enhancers. We estimate that at least 4% of synonymous mutations are deleterious owing to an effect on enhancer functioning.
|
43
|
Halligan DL, Kousathanas A, Ness RW, Harr B, Eöry L, Keane TM, Adams DJ, Keightley PD. Contributions of protein-coding and regulatory change to adaptive molecular evolution in murid rodents. PLoS Genet 2013; 9:e1003995. [PMID: 24339797 PMCID: PMC3854965 DOI: 10.1371/journal.pgen.1003995] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.1] [Received: 06/17/2013] [Accepted: 10/16/2013] [Indexed: 12/22/2022] Open
Abstract
The contribution of regulatory versus protein change to adaptive evolution has long been controversial. In principle, the rate and strength of adaptation within functional genetic elements can be quantified on the basis of an excess of nucleotide substitutions between species compared to the neutral expectation or from effects of recent substitutions on nucleotide diversity at linked sites. Here, we infer the nature of selective forces acting in proteins, their UTRs and conserved noncoding elements (CNEs) using genome-wide patterns of diversity in wild house mice and divergence to related species. By applying an extension of the McDonald-Kreitman test, we infer that adaptive substitutions are widespread in protein-coding genes, UTRs and CNEs, and we estimate that there are at least four times as many adaptive substitutions in CNEs and UTRs as in proteins. We observe pronounced reductions in mean diversity around nonsynonymous sites (whether or not they have experienced a recent substitution). This can be explained by selection on multiple, linked CNEs and exons. We also observe substantial dips in mean diversity (after controlling for divergence) around protein-coding exons and CNEs, which can also be explained by the combined effects of many linked exons and CNEs. A model of background selection (BGS) can adequately explain the reduction in mean diversity observed around CNEs. However, BGS fails to explain the wide reductions in mean diversity surrounding exons (encompassing ∼100 Kb, on average), implying that there is a substantial role for adaptation within exons or closely linked sites. The wide dips in diversity around exons, which are hard to explain by BGS, suggest that the fitness effects of adaptive amino acid substitutions could be substantially larger than substitutions in CNEs. We conclude that although there appear to be many more adaptive noncoding changes, substitutions in proteins may dominate phenotypic evolution. 
We present an analysis of the genome sequences of multiple wild house mice. Wild house mice are about ten times more genetically diverse than humans, reflecting the large effective population size of the species. This manifests itself as more effective natural selection acting against deleterious mutations and favouring advantageous mutations in mice than in humans. We show that there are strong signals of adaptive evolution at many sites in the genome. We estimate that 80% of adaptive changes in the genome are in gene regulatory elements and only 20% are in protein-coding genes. We find that nucleotide diversity is markedly reduced close to gene regulatory elements and protein-coding gene sequences. The reductions around regulatory elements can be explained by selection purging deleterious mutations that occur in the elements themselves, but this process only partially explains the diversity reductions around protein-coding genes. Recurrent adaptive evolution, which can also cause local reductions in diversity via selective sweeps, may be necessary to fully explain the patterns in diversity that we observe surrounding genes. Although most adaptive molecular evolution appears to be regulatory, adaptive phenotypic change may principally be driven by structural change in proteins.
Affiliation(s)
- Daniel L. Halligan
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
- Rob W. Ness
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
- Bettina Harr
- Max-Planck Institute for Evolutionary Biology, Plön, Germany
- Lél Eöry
- The Roslin Institute and R(D)SVS, University of Edinburgh, Midlothian, United Kingdom
- Thomas M. Keane
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom
- David J. Adams
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom
- Peter D. Keightley
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
|
44
|
Hough J, Williamson RJ, Wright SI. Patterns of Selection in Plant Genomes. ANNUAL REVIEW OF ECOLOGY EVOLUTION AND SYSTEMATICS 2013. [DOI: 10.1146/annurev-ecolsys-110512-135851] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 11/09/2022]
Abstract
Plants show a wide range of variation in mating system, ploidy level, and demographic history, allowing for unique opportunities to investigate the evolutionary and genetic factors affecting genome-wide patterns of positive and negative selection. In this review, we highlight recent progress in our understanding of the extent and nature of selection on plant genomes. We discuss differences in selection as they relate to variation in demography, recombination, mating system, and ploidy. We focus on the population genetic consequences of these factors and argue that, although variation in the magnitude of purifying selection is well documented, quantifying rates of positive selection and disentangling the relative importance of recombination, demography, and ploidy are ongoing challenges. Large-scale comparative studies that examine the relative and joint importance of these processes, combined with explicit models of population history and selection, are key and feasible goals for future work.
Affiliation(s)
- Josh Hough
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada, M5S 3B2
- Robert J. Williamson
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada, M5S 3B2
- Stephen I. Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada, M5S 3B2
|
45
|
Charlesworth B. Why we are not dead one hundred times over. Evolution 2013; 67:3354-61. [PMID: 24152012 DOI: 10.1111/evo.12195] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 04/02/2013] [Accepted: 06/14/2013] [Indexed: 12/25/2022]
Abstract
The possibility of pervasive weak selection at tens or hundreds of millions of sites across the genome, suggested by recent studies of silent site DNA sequence variation and divergence, raises the problem of the survival of the population in the face of the large genetic load that may result. Two alternative resolutions of this problem are presented for populations where recombination is sufficiently frequent that different sites under selection evolve independently. One invokes weak stabilizing selection, of the magnitude compatible with abundant silent site variability. This can be shown to produce only a modest genetic load, due to the effectiveness of even weak stabilizing selection in keeping the trait mean close to the optimum. The other invokes soft selection, whereby individuals compete for a limiting resource whose abundance determines the absolute fitness of the population. Weak purifying selection at a large number of sites produces only a small variance in fitness among individuals within the population, due to the fact that most sites are fixed rather than polymorphic. Even when it produces a large genetic load, it is compatible with the observations on fitness variance when selection is soft. It may be very difficult to distinguish between these two possibilities.
Affiliation(s)
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JT, United Kingdom.
|
46
|
Arbiza L, Gronau I, Aksoy BA, Hubisz MJ, Gulko B, Keinan A, Siepel A. Genome-wide inference of natural selection on human transcription factor binding sites. Nat Genet 2013; 45:723-9. [PMID: 23749186 DOI: 10.1038/ng.2658] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Received: 01/18/2013] [Accepted: 05/08/2013] [Indexed: 11/09/2022]
Abstract
For decades, it has been hypothesized that gene regulation has had a central role in human evolution, yet much remains unknown about the genome-wide impact of regulatory mutations. Here we use whole-genome sequences and genome-wide chromatin immunoprecipitation and sequencing data to demonstrate that natural selection has profoundly influenced human transcription factor binding sites since the divergence of humans from chimpanzees 4-6 million years ago. Our analysis uses a new probabilistic method, called INSIGHT, for measuring the influence of selection on collections of short, interspersed noncoding elements. We find that, on average, transcription factor binding sites have experienced somewhat weaker selection than protein-coding genes. However, the binding sites of several transcription factors show clear evidence of adaptation. Several measures of selection are strongly correlated with predicted binding affinity. Overall, regulatory elements seem to contribute substantially to both adaptive substitutions and deleterious polymorphisms with key implications for human evolution and disease.
Affiliation(s)
- Leonardo Arbiza
- Department of Biological Statistics & Computational Biology, Cornell University, Ithaca, NY, USA
|
47
|
Strong purifying selection at synonymous sites in D. melanogaster. PLoS Genet 2013; 9:e1003527. [PMID: 23737754 PMCID: PMC3667748 DOI: 10.1371/journal.pgen.1003527] [Citation(s) in RCA: 144] [Impact Index Per Article: 13.1] [Received: 01/24/2013] [Accepted: 04/08/2013] [Indexed: 11/19/2022] Open
Abstract
Synonymous sites are generally assumed to be subject to weak selective constraint. For this reason, they are often neglected as a possible source of important functional variation. We use site frequency spectra from deep population sequencing data to show that, contrary to this expectation, 22% of four-fold synonymous (4D) sites in Drosophila melanogaster evolve under very strong selective constraint while few, if any, appear to be under weak constraint. Linking polymorphism with divergence data, we further find that the fraction of synonymous sites exposed to strong purifying selection is higher for those positions that show slower evolution on the Drosophila phylogeny. The function underlying the inferred strong constraint appears to be separate from splicing enhancers, nucleosome positioning, and the translational optimization generating canonical codon bias. The fraction of synonymous sites under strong constraint within a gene correlates well with gene expression, particularly in the mid-late embryo, pupae, and adult developmental stages. Genes enriched in strongly constrained synonymous sites tend to be particularly functionally important and are often involved in key developmental pathways. Given that the observed widespread constraint acting on synonymous sites is likely not limited to Drosophila, the role of synonymous sites in genetic disease and adaptation should be reevaluated.
|
48
|
Abstract
The evolution of sex is one of the most important and controversial problems in evolutionary biology. Although sex is almost universal in higher animals and plants, its inherent costs have made its maintenance difficult to explain. The most famous of these is the twofold cost of males, which can greatly reduce the fecundity of a sexual population, compared to a population of asexual females. Over the past century, multiple hypotheses, along with experimental evidence to support these, have been put forward to explain widespread costly sex. In this review, we outline some of the most prominent theories, along with the experimental and observational evidence supporting these. Historically, there have been 4 classes of theories: the ability of sex to fix multiple novel advantageous mutants (Fisher-Muller hypothesis); sex as a mechanism to stop the build-up of deleterious mutations in finite populations (Muller's ratchet); recombination creating novel genotypes that can resist infection by parasites (Red Queen hypothesis); and the ability of sex to purge bad genomes if deleterious mutations act synergistically (mutational deterministic hypothesis). Current theoretical and experimental evidence seems to favor the hypothesis that sex breaks down selection interference between new mutants, or it acts as a mechanism to shuffle genotypes in order to repel parasitic invasion. However, there is still a need to collect more data from natural populations and experimental studies, which can be used to test different hypotheses.
Affiliation(s)
- Matthew Hartfield
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
|
49
|
Agrawal AF, Whitlock MC. Mutation Load: The Fitness of Individuals in Populations Where Deleterious Alleles Are Abundant. ANNUAL REVIEW OF ECOLOGY EVOLUTION AND SYSTEMATICS 2012. [DOI: 10.1146/annurev-ecolsys-110411-160257] [Citation(s) in RCA: 137] [Impact Index Per Article: 11.4] [Indexed: 11/09/2022]
Abstract
Many multicellular eukaryotes have reasonably high per-generation mutation rates. Consequently, most populations harbor an abundance of segregating deleterious alleles. These alleles, most of which are of small effect individually, collectively can reduce substantially the fitness of individuals relative to what it would be otherwise; this is mutation load. Mutation load can be lessened by any factor that causes more mutations to be removed per selective death, such as inbreeding, synergistic epistasis, population structure, or harsh environments. The ecological effects of load are not clear-cut because some conditions (such as selection early in life, sexual selection, reproductive compensation, and intraspecific competition) reduce the effects of load on population size and persistence, but other conditions (such as interspecific competition and load on resource use efficiency) can cause small amounts of load to have strong effects on the population, even extinction. We suggest a series of studies to improve our understanding of the effects of mutation load.
Affiliation(s)
- Aneil F. Agrawal
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada M5S 3B2
- Michael C. Whitlock
- Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4
|
50
|
Hartfield M, Otto SP, Keightley PD. THE MAINTENANCE OF OBLIGATE SEX IN FINITE, STRUCTURED POPULATIONS SUBJECT TO RECURRENT BENEFICIAL AND DELETERIOUS MUTATION. Evolution 2012. [DOI: 10.1111/j.1558-5646.2012.01733.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 11/26/2022]
|