1
Marsh JI, Johri P. Biases in ARG-Based Inference of Historical Population Size in Populations Experiencing Selection. Mol Biol Evol 2024; 41:msae118. [PMID: 38874402 PMCID: PMC11245712 DOI: 10.1093/molbev/msae118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/05/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024]
Abstract
Inferring the demographic history of populations provides fundamental insights into species dynamics and is essential for developing a null model to accurately study selective processes. However, background selection and selective sweeps can produce genomic signatures at linked sites that mimic or mask signals associated with historical population size change. While the theoretical biases introduced by the linked effects of selection have been well established, it is unclear whether ancestral recombination graph (ARG)-based approaches to demographic inference in typical empirical analyses are susceptible to misinference due to these effects. To address this, we developed highly realistic forward simulations of human and Drosophila melanogaster populations, including empirically estimated variability of gene density, mutation rates, recombination rates, purifying, and positive selection, across different historical demographic scenarios, to broadly assess the impact of selection on demographic inference using a genealogy-based approach. Our results indicate that the linked effects of selection minimally impact demographic inference for human populations, although it could cause misinference in populations with similar genome architecture and population parameters experiencing more frequent recurrent sweeps. We found that accurate demographic inference of D. melanogaster populations by ARG-based methods is compromised by the presence of pervasive background selection alone, leading to spurious inferences of recent population expansion, which may be further worsened by recurrent sweeps, depending on the proportion and strength of beneficial mutations. Caution and additional testing with species-specific simulations are needed when inferring population history with non-human populations using ARG-based approaches to avoid misinference due to the linked effects of selection.
Affiliation(s)
- Jacob I Marsh
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Parul Johri
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC 27599, USA
2
Charmouh AP, Bocedi G, Hartfield M. Inferring the distributions of fitness effects and proportions of strongly deleterious mutations. G3 (Bethesda) 2023; 13:jkad140. [PMID: 37337692 PMCID: PMC10468728 DOI: 10.1093/g3journal/jkad140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Revised: 06/05/2023] [Accepted: 06/05/2023] [Indexed: 06/21/2023]
Abstract
The distribution of fitness effects is a key property in evolutionary genetics as it has implications for several evolutionary phenomena including the evolution of sex and mating systems, the rate of adaptive evolution, and the prevalence of deleterious mutations. Despite the distribution of fitness effects being extensively studied, the effects of strongly deleterious mutations are difficult to infer since such mutations are unlikely to be present in a sample of haplotypes, so genetic data may contain very little information about them. Recent work has attempted to correct for this issue by expanding the classic gamma-distributed model to explicitly account for strongly deleterious mutations. Here, we use simulations to investigate one such method, adding a parameter (plth) to capture the proportion of strongly deleterious mutations. We show that plth can improve the model fit when applied to individual species but underestimates the true proportion of strongly deleterious mutations. The parameter can also artificially maximize the likelihood when used to jointly infer a distribution of fitness effects from multiple species. As plth and related parameters are used in current inference algorithms, our results are relevant with respect to avoiding model artifacts and improving future tools for inferring the distribution of fitness effects.
Affiliation(s)
- Anders P Charmouh
- School of Biological Sciences, University of Aberdeen, Aberdeen AB24 3FX, UK
- Bioinformatics Research Centre, Aarhus University, University City 81, Building 1872, 3rd floor, DK-8000 Aarhus C, Denmark
- Greta Bocedi
- School of Biological Sciences, University of Aberdeen, Aberdeen AB24 3FX, UK
- Matthew Hartfield
- Institute of Ecology and Evolution, The University of Edinburgh, Edinburgh EH9 3FL, UK
3
Johri P, Eyre-Walker A, Gutenkunst RN, Lohmueller KE, Jensen JD. On the prospect of achieving accurate joint estimation of selection with population history. Genome Biol Evol 2022; 14:evac088. [PMID: 35675379 PMCID: PMC9254643 DOI: 10.1093/gbe/evac088] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Accepted: 06/02/2022] [Indexed: 11/15/2022]
Abstract
As both natural selection and population history can affect genome-wide patterns of variation, disentangling the contributions of each has remained a major challenge in population genetics. Here we discuss historical and recent progress towards this goal, highlighting theoretical and computational challenges that remain to be addressed as well as inherent difficulties in dealing with model complexity and model violations, and offer thoughts on potentially fruitful next steps.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Ryan N Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
4
Böndel KB, Samuels T, Craig RJ, Ness RW, Colegrave N, Keightley PD. The distribution of fitness effects of spontaneous mutations in Chlamydomonas reinhardtii inferred using frequency changes under experimental evolution. PLoS Genet 2022; 18:e1009840. [PMID: 35704655 PMCID: PMC9239454 DOI: 10.1371/journal.pgen.1009840] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/24/2021] [Revised: 06/28/2022] [Accepted: 04/13/2022] [Indexed: 12/23/2022]
Abstract
The distribution of fitness effects (DFE) for new mutations is fundamental for many aspects of population and quantitative genetics. In this study, we have inferred the DFE in the single-celled alga Chlamydomonas reinhardtii by estimating changes in the frequencies of 254 spontaneous mutations under experimental evolution and equating the frequency changes of linked mutations with their selection coefficients. We generated seven populations of recombinant haplotypes by crossing seven independently derived mutation accumulation lines carrying an average of 36 mutations in the haploid state to a mutation-free strain of the same genotype. We then allowed the populations to evolve under natural selection in the laboratory by serial transfer in liquid culture. We observed substantial and repeatable changes in the frequencies of many groups of linked mutations, and, surprisingly, as many mutations were observed to increase as decrease in frequency. Mutation frequencies were highly repeatable among replicates, suggesting that selection was the cause of the observed allele frequency changes. We developed a Bayesian Monte Carlo Markov Chain method to infer the DFE. This computes the likelihood of the observed distribution of changes of frequency, and obtains the posterior distribution of the selective effects of individual mutations, while assuming a two-sided gamma distribution of effects. We infer that the DFE is a highly leptokurtic distribution, and that approximately equal proportions of mutations have positive and negative effects on fitness. This result is consistent with what we have observed in previous work on a different C. reinhardtii strain, and suggests that a high fraction of new spontaneously arisen mutations are advantageous in a simple laboratory environment.
Affiliation(s)
- Katharina B. Böndel
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Toby Samuels
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Rory J. Craig
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Rob W. Ness
- Department of Biology, William G. Davis Building, University of Toronto, Mississauga, Canada
- Nick Colegrave
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Peter D. Keightley
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
5
Abstract
African apes harbor at least twelve Plasmodium species, some of which have been a source of human infection. It is now well established that Plasmodium falciparum emerged following the transmission of a gorilla parasite, perhaps within the last 10,000 years, while Plasmodium vivax emerged earlier from a parasite lineage that infected humans and apes in Africa before the Duffy-negative mutation eliminated the parasite from humans there. Compared to their ape relatives, both human parasites have greatly reduced genetic diversity and an excess of nonsynonymous mutations, consistent with severe genetic bottlenecks followed by rapid population expansion. A putative new Plasmodium species widespread in chimpanzees, gorillas, and bonobos places the origin of Plasmodium malariae in Africa. Here, we review what is known about the origins and evolutionary history of all human-infective Plasmodium species, the time and circumstances of their emergence, and the diversity, host specificity, and zoonotic potential of their ape counterparts.
Affiliation(s)
- Paul M Sharp
- Institute of Evolutionary Biology and Centre for Immunity, Infection and Evolution, University of Edinburgh, EH9 3FL, United Kingdom
- Lindsey J Plenderleith
- Institute of Evolutionary Biology and Centre for Immunity, Infection and Evolution, University of Edinburgh, EH9 3FL, United Kingdom
- Beatrice H Hahn
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
6
Chen J, Bataillon T, Glémin S, Lascoux M. Hunting for beneficial mutations: conditioning on SIFT scores when estimating the distribution of fitness effect of new mutations. Genome Biol Evol 2021; 14:6310736. [PMID: 34180988 PMCID: PMC8743036 DOI: 10.1093/gbe/evab151] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 06/21/2021] [Indexed: 11/13/2022]
Abstract
The Distribution of Fitness Effects (DFE) of new mutations is a key parameter of molecular evolution. The DFE can in principle be estimated by comparing the Site Frequency Spectra (SFS) of putatively neutral and functional polymorphisms. Unfortunately, the DFE is intrinsically hard to estimate, especially for beneficial mutations, since these tend to be exceedingly rare. There is therefore a strong incentive to find out whether conditioning on properties of mutations that are independent of the SFS could provide additional information. In the present study, we developed a new measure based on SIFT scores. SIFT scores are assigned to nucleotide sites based on their level of conservation across a multispecies alignment: the more conserved a site, the more likely mutations occurring at this site are deleterious and the lower the SIFT score. If one knows the ancestral state at a given site, one can assign a value to new mutations occurring at the site based on the change of SIFT score associated with the mutation. We called this new measure δ. We show that properties of the DFE as well as the flux of beneficial mutations across classes covary with δ and, hence, that SIFT scores are informative when estimating the fitness effect of new mutations. In particular, conditioning on SIFT scores can help to characterize beneficial mutations.
Affiliation(s)
- J Chen
- College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- T Bataillon
- Bioinformatics Research Centre, Aarhus University, C.F. Møllers Allé 8, Aarhus C, DK-8000, Denmark
- S Glémin
- Université de Rennes, Centre National de la Recherche Scientifique (CNRS), ECOBIO (Ecosystèmes, Biodiversité, Evolution) - Unité Mixte de Recherche (UMR) 6553, Rennes, F-35000, France
- Program in Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, 75236, Sweden
- M Lascoux
- Program in Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, 75236, Sweden
7
Naji MM, Utsunomiya YT, Sölkner J, Rosen BD, Mészáros G. Investigation of ancestral alleles in the Bovinae subfamily. BMC Genomics 2021; 22:108. [PMID: 33557747 PMCID: PMC7871596 DOI: 10.1186/s12864-021-07412-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/25/2020] [Accepted: 01/27/2021] [Indexed: 12/30/2022]
Abstract
BACKGROUND: In evolutionary theory, divergence and speciation can arise from long periods of reproductive isolation, genetic mutation, selection and environmental adaptation. After divergence, alleles can either persist in their initial state (ancestral allele, AA), co-exist with or be replaced by a mutated state (derived allele, DA). In this study, we aligned whole genome sequences of individuals from the Bovinae subfamily to the cattle reference genome (ARS.UCD-1.2) to define the ancestral alleles needed for studies of selection signatures.
RESULTS: Accommodating the independent divergence of each lineage from the initial ancestral state, AA were defined based on alleles fixed in at least two of the yak, bison and gayal-gaur-banteng groups, resulting in ~32.4 million variants. Using non-overlapping scanning windows of 10 kb, we counted the AA observed within taurine and zebu cattle. We focused on the extremes: regions in the top 0.1% (high count) and regions without any occurrence of AA (null count). High count regions preserved gene functions from ancestral states that are still beneficial in the current condition, while null count regions were linked to mutated ones. For both cattle, high count regions were associated with basal lipid metabolism, essential for survival under various environmental pressures. Mutated regions were associated with productive traits in taurine (higher metabolism, cell development and behaviors) and with the immune response domain in zebu.
CONCLUSIONS: Our findings suggest that the retention and loss of AA vary among regions and are species-specific, with possible overlap, depending on the selective pressures each species has experienced.
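The window-based scan described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy positions, and the rule for handling ties at the cutoff are assumptions made for illustration. Only the 10 kb non-overlapping windows, the top 0.1% cutoff, and the null-count category come from the abstract.

```python
from collections import Counter

WINDOW_SIZE = 10_000  # 10 kb non-overlapping windows, as in the abstract


def count_per_window(positions, chrom_length, window_size=WINDOW_SIZE):
    """Count ancestral-allele (AA) sites in each non-overlapping window.

    positions: 0-based coordinates of sites carrying the AA (hypothetical input).
    Returns a list with one count per window, including empty windows.
    """
    n_windows = (chrom_length + window_size - 1) // window_size
    hits = Counter(pos // window_size for pos in positions)
    return [hits.get(i, 0) for i in range(n_windows)]


def extreme_windows(counts, top_fraction=0.001):
    """Return (high, null) lists of window indices.

    high: windows whose count is among the top `top_fraction` of all windows
          (at least one window is kept; ties at the cutoff are included).
    null: windows with no AA at all.
    """
    n_top = max(1, int(len(counts) * top_fraction))
    cutoff = sorted(counts, reverse=True)[n_top - 1]
    high = [i for i, c in enumerate(counts) if c >= cutoff and c > 0]
    null = [i for i, c in enumerate(counts) if c == 0]
    return high, null


if __name__ == "__main__":
    # Toy coordinates on a hypothetical 50 kb chromosome.
    toy_positions = [5, 120, 9_999, 10_000, 35_500, 35_501, 35_502, 35_900]
    counts = count_per_window(toy_positions, chrom_length=50_000)
    high, null = extreme_windows(counts)
    print(counts, high, null)
```

On real data the same two functions would be applied per chromosome and per cattle group, with the cutoff taken across all windows genome-wide rather than within a single toy chromosome.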
Affiliation(s)
- Maulana M. Naji
- University of Natural Resources and Life Sciences (BOKU), Vienna, Austria
- Yuri T. Utsunomiya
- São Paulo State University (Unesp), School of Veterinary Medicine, Department of Production and Animal Health, Araçatuba, São Paulo, Brazil
- International Atomic Energy Agency (IAEA) Collaborating Centre on Animal Genomics and Bioinformatics, Araçatuba, São Paulo, Brazil
- AgroPartners Consulting. R. Floriano Peixoto, 120-Sala 43A-Centro, Araçatuba, SP 16010-220, Brazil
- Personal-PEC. R. Sebastiao Lima, 1336-Centro, Campo Grande, MS 79004-600, Brazil
- Johann Sölkner
- University of Natural Resources and Life Sciences (BOKU), Vienna, Austria
- Gábor Mészáros
- University of Natural Resources and Life Sciences (BOKU), Vienna, Austria
8
Abstract
Drosophila melanogaster, a small dipteran of African origin, represents one of the best-studied model organisms. Early work in this system has uniquely shed light on the basic principles of genetics and resulted in a versatile collection of genetic tools that allow researchers to uncover mechanistic links between genotype and phenotype. Moreover, given its worldwide distribution in diverse habitats and its moderate genome size, Drosophila has proven very powerful for population genetic inference and was one of the first eukaryotes whose genome was fully sequenced. In this book chapter, we provide a brief historical overview of research in Drosophila and then focus on recent advances during the genomic era. After describing different types and sources of genomic data, we discuss mechanisms of neutral evolution, including the demographic history of Drosophila and the effects of recombination and biased gene conversion. Then, we review recent advances in detecting genome-wide signals of selection, such as soft and hard selective sweeps. We further provide a brief introduction to background selection, selection on noncoding DNA and codon usage, and focus on the role of structural variants, such as transposable elements and chromosomal inversions, during the adaptive process. Finally, we discuss how genomic data help to dissect the neutral and adaptive evolutionary mechanisms that shape genetic and phenotypic variation in natural populations along environmental gradients. In summary, this book chapter serves as a starting point for Drosophila population genomics and provides an introduction to the system and an overview of data sources, important population genetic concepts and recent advances in the field.
9
Charlesworth B. How Good Are Predictions of the Effects of Selective Sweeps on Levels of Neutral Diversity? Genetics 2020; 216:1217-1238. [PMID: 33106248 PMCID: PMC7768247 DOI: 10.1534/genetics.120.303734] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/28/2020] [Accepted: 10/22/2020] [Indexed: 11/18/2022]
Abstract
Selective sweeps are thought to play a significant role in shaping patterns of variability across genomes; accurate predictions of their effects are, therefore, important for understanding these patterns. A commonly used model of selective sweeps assumes that alleles sampled at the end of a sweep, and that fail to recombine with wild-type haplotypes during the sweep, coalesce instantaneously, leading to a simple expression for sweep effects on diversity. It is shown here that there can be a significant probability that a pair of alleles sampled at the end of a sweep coalesce during the sweep before a recombination event can occur, reducing their expected coalescent time below that given by the simple approximation. Expressions are derived for the expected reductions in pairwise neutral diversities caused by both single and recurrent sweeps in the presence of such within-sweep coalescence, although the effects of multiple recombination events during a sweep are only treated heuristically. The accuracies of the resulting expressions were checked against the results of simulations. For even moderate ratios of the recombination rate to the selection coefficient, the simple approximation can be substantially inaccurate. The selection model used here can be applied to favorable mutations with arbitrary dominance coefficients, to sex-linked loci with sex-specific selection coefficients, and to inbreeding populations. Using the results from this model, the expected differences between the levels of variability on X chromosomes and autosomes with selection at linked sites are discussed, and compared with data on a population of Drosophila melanogaster.
Affiliation(s)
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
10
Zhen Y, Huber CD, Davies RW, Lohmueller KE. Greater strength of selection and higher proportion of beneficial amino acid changing mutations in humans compared with mice and Drosophila melanogaster. Genome Res 2020; 31:110-120. [PMID: 33208456 PMCID: PMC7849390 DOI: 10.1101/gr.256636.119] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/31/2019] [Accepted: 11/10/2020] [Indexed: 12/19/2022]
Abstract
Quantifying and comparing the amount of adaptive evolution among different species is key to understanding how evolution works. Previous studies have shown differences in adaptive evolution across species; however, their specific causes remain elusive. Here, we use improved modeling of weakly deleterious mutations and the demographic history of the outgroup species and ancestral population and estimate that at least 20% of nonsynonymous substitutions between humans and an outgroup species were fixed by positive selection. This estimate is much higher than previous estimates, which did not correct for the sizes of the outgroup species and ancestral population. Next, we jointly estimate the proportion and selection coefficient (p+ and s+, respectively) of newly arising beneficial nonsynonymous mutations in humans, mice, and Drosophila melanogaster by examining patterns of polymorphism and divergence. We develop a novel composite likelihood framework to test whether these parameters differ across species. Overall, we reject a model with the same p+ and s+ of beneficial mutations across species and estimate that humans have a higher p+s+ compared with that of D. melanogaster and mice. We show that this result cannot be caused by biased gene conversion or hypermutable CpG sites. We discuss possible biological explanations that could generate the observed differences in the amount of adaptive evolution across species.
Affiliation(s)
- Ying Zhen
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095, USA
- Zhejiang Provincial Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, 310024, China
- Institute of Biology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, 310024, China
- Christian D Huber
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095, USA
- School of Biological Sciences, The University of Adelaide, Adelaide, South Australia 5005, Australia
- Robert W Davies
- Program in Genetics and Genome Biology and The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, M5G 0A4, Canada
- Department of Statistics, University of Oxford, Oxford, OX1 3LB, United Kingdom
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
11
Galtier N, Rousselle M. How Much Does Ne Vary Among Species? Genetics 2020; 216:559-572. [PMID: 32839240 PMCID: PMC7536855 DOI: 10.1534/genetics.120.303622] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 05/26/2020] [Accepted: 08/20/2020] [Indexed: 11/18/2022]
Abstract
Genetic drift is an important evolutionary force of strength inversely proportional to Ne, the effective population size. The impact of drift on genome diversity and evolution is known to vary among species, but quantifying this effect is a difficult task. Here we assess the magnitude of variation in drift power among species of animals via its effect on the mutation load, which also requires inferring the distribution of fitness effects of deleterious mutations. To this aim, we analyze the nonsynonymous (amino-acid changing) and synonymous (amino-acid conservative) allele frequency spectra in a large sample of metazoan species, with a focus on the primates vs. fruit flies contrast. We show that a Gamma model of the distribution of fitness effects is not suitable due to strong differences in estimated shape parameters among taxa, while adding a class of lethal mutations essentially solves the problem. Using the Gamma + lethal model and assuming that the mean deleterious effect of nonsynonymous mutations is shared among species, we estimate that the power of drift varies by a factor of at least 500 between large-Ne and small-Ne species of animals, i.e., an order of magnitude more than the among-species variation in genetic diversity. Our results are relevant to Lewontin's paradox while further questioning the meaning of the Ne parameter in population genomics.
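The "Gamma + lethal" model named in this abstract can be pictured as a two-component mixture: a fraction of mutations is lethal, and the remainder draw their deleterious effect from a Gamma distribution. The sketch below only illustrates that structure. The parametrization by mean and shape, the parameter values, the cap at 1.0, and the function name are all assumptions for illustration, not the authors' inference code.

```python
import random


def sample_dfe(n, mean_s, shape, p_lethal, rng=None):
    """Draw n deleterious effect sizes from a Gamma + lethal mixture.

    With probability p_lethal a mutation is lethal (s = 1.0); otherwise s is
    Gamma-distributed with the given shape and mean, capped at 1.0.
    Values are magnitudes of deleterious effects; all names are illustrative.
    """
    rng = rng or random.Random()
    scale = mean_s / shape  # Gamma mean = shape * scale
    draws = []
    for _ in range(n):
        if rng.random() < p_lethal:
            draws.append(1.0)
        else:
            draws.append(min(1.0, rng.gammavariate(shape, scale)))
    return draws


if __name__ == "__main__":
    effects = sample_dfe(10_000, mean_s=0.01, shape=0.3, p_lethal=0.1,
                         rng=random.Random(1))
    lethal_fraction = sum(s == 1.0 for s in effects) / len(effects)
    print(f"fraction with s = 1: {lethal_fraction:.3f}")
```

Because Gamma draws are capped at 1.0, a capped draw is indistinguishable from a lethal one in this sketch; with a small mean effect such draws are extremely rare.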
Affiliation(s)
- Nicolas Galtier
- Institute of Evolution Sciences of Montpellier (ISEM), CNRS, University of Montpellier, IRD, EPHE, 34095 Montpellier, France
- Marjolaine Rousselle
- Institute of Evolution Sciences of Montpellier (ISEM), CNRS, University of Montpellier, IRD, EPHE, 34095 Montpellier, France
- Bioinformatics Research Centre, Aarhus University, DK Aarhus, Denmark
12
Abstract
Natural highly fecund populations abound. These range from viruses to gadids. Many highly fecund populations are economically important. Highly fecund populations provide an important contrast to the low-fecundity organisms that have traditionally been applied in evolutionary studies. A key question regarding high fecundity is whether large numbers of offspring are produced on a regular basis, by few individuals each time, in a sweepstakes mode of reproduction. Such reproduction characteristics are not incorporated into the classical Wright-Fisher model, the standard reference model of population genetics, or similar types of models, in which each individual can produce only small numbers of offspring relative to the population size. The expected genomic footprints of population genetic models of sweepstakes reproduction are very different from those of the Wright-Fisher model. A key, immediate issue involves identifying the footprints of sweepstakes reproduction in genomic data. Whole-genome sequencing data can be used to distinguish the patterns made by sweepstakes reproduction from the patterns made by population growth in a population evolving according to the Wright-Fisher model (or similar models). If the hypothesis of sweepstakes reproduction cannot be rejected, then models of sweepstakes reproduction and associated multiple-merger coalescents will become at least as relevant as the Wright-Fisher model (or similar models) and the Kingman coalescent, the cornerstones of mathematical population genetics, in further discussions of evolutionary genomics of highly fecund populations.
Affiliation(s)
- Bjarki Eldon
- Leibniz Institute for Evolution and Biodiversity Science, Museum für Naturkunde, D-10115 Berlin, Germany
13
Booker TR. Inferring Parameters of the Distribution of Fitness Effects of New Mutations When Beneficial Mutations Are Strongly Advantageous and Rare. G3 (Bethesda) 2020; 10:2317-2326. [PMID: 32371451 PMCID: PMC7341129 DOI: 10.1534/g3.120.401052] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/07/2020] [Accepted: 05/01/2020] [Indexed: 12/13/2022]
Abstract
Characterizing the distribution of fitness effects (DFE) for new mutations is central in evolutionary genetics. Analysis of molecular data under the McDonald-Kreitman test has suggested that adaptive substitutions make a substantial contribution to between-species divergence. Methods have been proposed to estimate the parameters of the distribution of fitness effects for positively selected mutations from the unfolded site frequency spectrum (uSFS). Such methods perform well when beneficial mutations are mildly selected and frequent. However, when beneficial mutations are strongly selected and rare, they may make little contribution to standing variation and will thus be difficult to detect from the uSFS. In this study, I analyze uSFS data from simulated populations subject to advantageous mutations with effects on fitness ranging from mildly to strongly beneficial. As expected, frequent, mildly beneficial mutations contribute substantially to standing genetic variation and parameters are accurately recovered from the uSFS. However, when advantageous mutations are strongly selected and rare, there are very few segregating in populations at any one time. Fitting the uSFS in such cases leads to underestimates of the strength of positive selection and may lead researchers to false conclusions regarding the relative contribution adaptive mutations make to molecular evolution. Fortunately, the parameters for the distribution of fitness effects for harmful mutations are estimated with high accuracy and precision. The results from this study suggest that the parameters of positively selected mutations obtained by analysis of the uSFS should be treated with caution and that variability at linked sites should be used in conjunction with standing variability to estimate parameters of the distribution of fitness effects in the future.
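For context, the McDonald-Kreitman framework mentioned in this abstract is often summarized by the proportion of adaptive substitutions, α = 1 - (Ds × Pn) / (Dn × Ps), where D and P are counts of fixed differences and polymorphisms and the subscripts n and s denote nonsynonymous and synonymous sites. The sketch below implements only that textbook estimator. It is not the uSFS-based method analyzed in the study, and the counts are made up for illustration.

```python
def mk_alpha(dn, ds, pn, ps):
    """Proportion of adaptive nonsynonymous substitutions (textbook MK estimator).

    dn, ds: nonsynonymous / synonymous fixed differences between species.
    pn, ps: nonsynonymous / synonymous polymorphisms within the species.
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow:
    # 1 - (100 * 40) / (80 * 100) = 0.5
    print(mk_alpha(dn=80, ds=100, pn=40, ps=100))
```

As the abstract cautions for uSFS-based estimates, a point estimate of this kind says nothing about how strongly selected or how rare the underlying beneficial mutations are.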
Affiliation(s)
- Tom R Booker
- Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, Canada
- Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
14
Rousselle M, Simion P, Tilak MK, Figuet E, Nabholz B, Galtier N. Is adaptation limited by mutation? A timescale-dependent effect of genetic diversity on the adaptive substitution rate in animals. PLoS Genet 2020; 16:e1008668. [PMID: 32251427 PMCID: PMC7162527 DOI: 10.1371/journal.pgen.1008668] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 12/18/2019] [Revised: 04/16/2020] [Accepted: 02/14/2020] [Indexed: 12/16/2022]
Abstract
Whether adaptation is limited by the beneficial mutation supply is a long-standing question of evolutionary genetics, which is more generally related to the determination of the adaptive substitution rate and its relationship with species effective population size (Ne) and genetic diversity. Empirical evidence reported so far is equivocal, with some but not all studies supporting a higher adaptive substitution rate in large-Ne than in small-Ne species. We gathered coding sequence polymorphism data and estimated the adaptive amino-acid substitution rate ωa in 50 species from ten distant groups of animals with markedly different population mutation rates θ. We reveal the existence of a complex, timescale-dependent relationship between species adaptive substitution rate and genetic diversity. We find a positive relationship between ωa and θ among closely related species, indicating that adaptation is indeed limited by the mutation supply, but this was only true in relatively low-θ taxa. In contrast, we uncover no significant correlation between ωa and θ at a larger taxonomic scale, suggesting that the proportion of beneficial mutations scales negatively with species' long-term Ne.
Affiliation(s)
- Paul Simion
  - ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
  - LEGE, Department of Biology, University of Namur, Namur, Belgium
- Marie-Ka Tilak
  - ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Emeric Figuet
  - ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Benoit Nabholz
  - ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Nicolas Galtier
  - ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
|
15
|
Abstract
The time taken for a selectively favorable allele to spread through a single population was investigated early in the history of population genetics. The resulting formulas are based on deterministic dynamics, leading to inaccuracies at allele frequencies close to 0 or 1. To remedy this problem, the properties of the stochastic phases at either end point of allele frequency need to be analyzed. This article uses a heuristic approach to determining the expected times spent in the stochastic and deterministic phases of allele frequency trajectories, for a model of weak selection at a single locus that is valid for inbreeding populations and for autosomal and sex-linked inheritance. The net fixation time is surprisingly insensitive to the level of dominance of a favorable mutation, even with random mating. Approximate expressions for the variance of the net fixation time are also obtained, which imply that there can be substantial stochastic effects even in very large populations. The accuracy of the approximations was evaluated by comparisons with computer simulations. The results reveal some areas that need further investigation if a full understanding of selective sweeps is to be obtained, notably the possibility that fixations of slightly deleterious mutations may be affecting variability at closely linked sites.
|
16
|
Mugal CF, Kutschera VE, Botero-Castro F, Wolf JBW, Kaj I. Polymorphism Data Assist Estimation of the Nonsynonymous over Synonymous Fixation Rate Ratio ω for Closely Related Species. Mol Biol Evol 2020; 37:260-279. [PMID: 31504782 PMCID: PMC6984366 DOI: 10.1093/molbev/msz203] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/24/2022] Open
Abstract
The ratio of nonsynonymous over synonymous sequence divergence, dN/dS, is a widely used estimate of the nonsynonymous over synonymous fixation rate ratio ω, which measures the extent to which natural selection modulates protein sequence evolution. Its computation is based on a phylogenetic approach and computes sequence divergence of protein-coding DNA between species, traditionally using a single representative DNA sequence per species. This approach ignores the presence of polymorphisms and relies on the indirect assumption that new mutations fix instantaneously, an assumption which is generally violated and reasonable only for distantly related species. The violation of the underlying assumption leads to a time-dependence of sequence divergence, and biased estimates of ω in particular for closely related species, where the contribution of ancestral and lineage-specific polymorphisms to sequence divergence is substantial. We here use a time-dependent Poisson random field model to derive an analytical expression of dN/dS as a function of divergence time and sample size. We then extend our framework to the estimation of the proportion of adaptive protein evolution α. This mathematical treatment enables us to show that the joint usage of polymorphism and divergence data can assist the inference of selection for closely related species. Moreover, our analytical results provide the basis for a protocol for the estimation of ω and α for closely related species. We illustrate the performance of this protocol by studying a population data set of four corvid species, which involves the estimation of ω and α at different time-scales and for several choices of sample sizes.
Affiliation(s)
- Carina F Mugal
  - Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
- Verena E Kutschera
  - Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
  - Science for Life Laboratory, Stockholm University, Stockholm, Sweden
  - Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Fidel Botero-Castro
  - Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
- Jochen B W Wolf
  - Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
  - Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
- Ingemar Kaj
  - Department of Mathematics, Uppsala University, Uppsala, Sweden
|
17
|
Tataru P, Bataillon T. polyDFE: Inferring the Distribution of Fitness Effects and Properties of Beneficial Mutations from Polymorphism Data. Methods Mol Biol 2020; 2090:125-146. [PMID: 31975166 DOI: 10.1007/978-1-0716-0199-0_6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 02/02/2023]
Abstract
The possible evolutionary trajectories a population can follow are determined by the fitness effects of new mutations. Their relative frequencies are best specified through a distribution of fitness effects (DFE) that spans deleterious, neutral, and beneficial mutations. As such, the DFE is key to several aspects of the evolution of a population, and particularly the rate of adaptive molecular evolution (α). Inference of the DFE from patterns of polymorphism and divergence has been a longstanding goal of evolutionary genetics. polyDFE provides a flexible statistical framework to estimate the DFE and α from site frequency spectrum (SFS) data. Several probability distributions can be fitted to the data to model the DFE. The method also jointly estimates a series of nuisance parameters that model the effect of unknown demography as well as data imperfections, in particular possible errors in polarizing SNPs. This chapter is organized as a tutorial for polyDFE. We start by briefly reviewing the concept of the DFE, α, and the principles underlying the method, and then provide an example using central chimpanzee data (Tataru et al., Genetics 207(3):1103-1119, 2017; Bataillon et al., Genome Biol Evol 7(4):1122-1132, 2015) to guide the user through the different steps of an analysis: formatting the data as input to polyDFE, fitting different models, obtaining estimates of parameter uncertainty and performing statistical tests, as well as model averaging procedures to obtain robust estimates of model parameters.
Affiliation(s)
- Paula Tataru
  - Bioinformatics Research Center, Aarhus University, Aarhus, Denmark
- Thomas Bataillon
  - Bioinformatics Research Center, Aarhus University, Aarhus, Denmark.
|
18
|
Zhou Y, Minio A, Massonnet M, Solares E, Lv Y, Beridze T, Cantu D, Gaut BS. The population genetics of structural variants in grapevine domestication. Nat Plants 2019; 5:965-979. [PMID: 31506640 DOI: 10.1038/s41477-019-0507-8] [Citation(s) in RCA: 150] [Impact Index Per Article: 30.0] [Received: 05/13/2019] [Accepted: 07/26/2019] [Indexed: 05/20/2023]
Abstract
Structural variants (SVs) are a largely unexplored feature of plant genomes. Little is known about the type and size of SVs, their distribution among individuals and, especially, their population dynamics. Understanding these dynamics is critical for understanding both the contributions of SVs to phenotypes and the likelihood of identifying them as causal genetic variants in genome-wide associations. Here, we identify SVs and study their evolutionary genomics in clonally propagated grapevine cultivars and their outcrossing wild progenitors. To catalogue SVs, we assembled the highly heterozygous Chardonnay genome, for which one in seven genes is hemizygous based on SVs. Using an integrative comparison between Chardonnay and Cabernet Sauvignon genomes by whole-genome, long-read and short-read alignment, we extended SV detection to population samples. We found that strong purifying selection acts against SVs but particularly against inversion and translocation events. SVs nonetheless accrue as recessive heterozygotes in clonally propagated lineages. They also define outlier regions of genomic divergence between wild and cultivated grapevines, suggesting roles in domestication. Outlier regions include the sex-determination region and the berry colour locus, where independent large, complex inversions have driven convergent phenotypic evolution.
Affiliation(s)
- Yongfeng Zhou
  - Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA, USA
- Andrea Minio
  - Department of Viticulture and Enology, UC Davis, Davis, CA, USA
- Edwin Solares
  - Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA, USA
- Yuanda Lv
  - Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA, USA
- Tengiz Beridze
  - Institute of Molecular Genetics, Agricultural University of Georgia, Tbilisi, Georgia
- Dario Cantu
  - Department of Viticulture and Enology, UC Davis, Davis, CA, USA.
- Brandon S Gaut
  - Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA, USA.
|
19
|
Booker TR, Keightley PD. Understanding the Factors That Shape Patterns of Nucleotide Diversity in the House Mouse Genome. Mol Biol Evol 2019; 35:2971-2988. [PMID: 30295866 PMCID: PMC6278861 DOI: 10.1093/molbev/msy188] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/14/2022] Open
Abstract
A major goal of population genetics has been to determine the extent to which selection at linked sites influences patterns of neutral nucleotide diversity in the genome. Multiple lines of evidence suggest that diversity is influenced by both positive and negative selection. For example, in many species there are troughs in diversity surrounding functional genomic elements, consistent with the action of either background selection (BGS) or selective sweeps. In this study, we investigated the causes of the diversity troughs that are observed in the wild house mouse genome. Using the unfolded site frequency spectrum, we estimated the strength and frequencies of deleterious and advantageous mutations occurring in different functional elements in the genome. We then used these estimates to parameterize forward-in-time simulations of chromosomes, using realistic distributions of functional elements and recombination rate variation in order to determine whether selection at linked sites can explain the observed patterns of nucleotide diversity. The simulations suggest that BGS alone cannot explain the dips in diversity around either exons or conserved noncoding elements. A combination of BGS and selective sweeps produces deeper dips in diversity than BGS alone, but the inferred parameters of selection cannot fully explain the patterns observed in the genome. Our results provide evidence of sweeps shaping patterns of nucleotide diversity across the mouse genome and also suggest that infrequent, strongly advantageous mutations play an important role in this. The limitations of using the unfolded site frequency spectrum for inferring the frequency and effects of advantageous mutations are discussed.
Affiliation(s)
- Tom R Booker
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
  - Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, Canada
- Peter D Keightley
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
|
20
|
López-Cortegano E, Caballero A. Inferring the Nature of Missing Heritability in Human Traits Using Data from the GWAS Catalog. Genetics 2019; 212:891-904. [PMID: 31123044 PMCID: PMC6614893 DOI: 10.1534/genetics.119.302077] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 03/05/2019] [Accepted: 05/11/2019] [Indexed: 02/07/2023] Open
Abstract
Thousands of genes responsible for many diseases and other common traits in humans have been detected by Genome Wide Association Studies (GWAS) in the last decade. However, candidate causal variants found so far usually explain only a small fraction of the heritability estimated by family data. The most common explanation for this observation is that the missing heritability corresponds to variants, either rare or common, with very small effect, which pass undetected due to a lack of statistical power. We carried out a meta-analysis using data from the NHGRI-EBI GWAS Catalog in order to explore the observed distribution of locus effects for a set of 42 complex traits and to quantify their contribution to narrow-sense heritability. With the data at hand, we were able to predict the expected distribution of locus effects for 16 traits and diseases, their expected contribution to heritability, and the missing number of loci yet to be discovered to fully explain the familial heritability estimates. Our results indicate that, for 6 out of the 16 traits, the additive contribution of a great number of loci is unable to explain the familial (broad-sense) heritability, suggesting that the gap between GWAS and familial estimates of heritability may not ever be closed for these traits. In contrast, for the other 10 traits, the additive contribution of hundreds or thousands of loci yet to be found could potentially explain the familial heritability estimates, if this were the case. Computer simulations are used to illustrate the possible contribution from nonadditive genetic effects to the gap between GWAS and familial estimates of heritability.
Affiliation(s)
- Armando Caballero
  - Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310, Spain
|
21
|
Bergman J, Eyre-Walker A. Does Adaptive Protein Evolution Proceed by Large or Small Steps at the Amino Acid Level? Mol Biol Evol 2019; 36:990-998. [PMID: 30903659 DOI: 10.1093/molbev/msz033] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/29/2022] Open
Abstract
A long-standing question in evolutionary biology is the relative contribution of large and small effect mutations to the adaptive process. We have investigated this question in proteins by estimating the rate of adaptive evolution between all pairs of amino acids separated by one mutational step using a McDonald-Kreitman type approach and genome-wide data from several Drosophila species. We find that the rate of adaptive evolution is highest among amino acids that are more similar. This is partly due to the fact that the proportion of mutations that are adaptive is higher among more similar amino acids. We also find that the rate of neutral evolution between amino acids is higher among more similar amino acids. Overall our results suggest that both the adaptive and nonadaptive evolution of proteins are dominated by substitutions between similar amino acids.
Affiliation(s)
- Juraj Bergman
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
  - Vienna Graduate School of Population Genetics, Wien, Austria
- Adam Eyre-Walker
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
|
22
|
Guirao-Rico S, González J. Evolutionary insights from large scale resequencing datasets in Drosophila melanogaster. Curr Opin Insect Sci 2019; 31:70-76. [PMID: 31109676 DOI: 10.1016/j.cois.2018.11.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/28/2018] [Revised: 11/04/2018] [Accepted: 11/06/2018] [Indexed: 06/09/2023]
Abstract
Drosophila melanogaster has long been used as an evolutionary model system. Its small genome size, well-annotated genome, and ease of sampling also make it a choice species for genome resequencing studies. Hundreds of genomic samples from populations worldwide are available and are currently being used to tackle a wide range of evolutionary questions. In this review, we focus on three insights that have increased our understanding of the evolutionary history of this species, and that have implications for the study of evolutionary processes in other species as well. Because of technical limitations, most of the studies so far have focused on SNP variants. However, long-read sequencing techniques should allow us in the near future to include other types of genomic variants that also influence genome evolution.
Affiliation(s)
- Sara Guirao-Rico
  - Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
- Josefa González
  - Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Barcelona, Spain.
|
23
|
Sun Y, Abbott RJ, Lu Z, Mao K, Zhang L, Wang X, Ru D, Liu J. Reticulate evolution within a spruce (Picea) species complex revealed by population genomic analysis. Evolution 2018; 72:2669-2681. [DOI: 10.1111/evo.13624] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/09/2017] [Accepted: 10/05/2018] [Indexed: 12/24/2022]
Affiliation(s)
- Yongshuai Sun
  - Key Laboratory for Bio‐resource and Eco‐environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, P. R. China
  - CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, P. R. China
- Richard J. Abbott
  - School of Biology, University of St Andrews, St Andrews, Fife KY16 9TH, United Kingdom
- Zhiqiang Lu
  - CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, P. R. China
- Kangshan Mao
  - Key Laboratory for Bio‐resource and Eco‐environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, P. R. China
- Lei Zhang
  - Key Laboratory for Bio‐resource and Eco‐environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, P. R. China
- Xiaojuan Wang
  - Key Laboratory for Bio‐resource and Eco‐environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, P. R. China
- Dafu Ru
  - Key Laboratory for Bio‐resource and Eco‐environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, P. R. China
- Jianquan Liu
  - Key Laboratory for Bio‐resource and Eco‐environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, P. R. China
  - State Key Laboratory of Grassland Agro‐Ecosystem, Institute of Innovation Ecology & College of Life Science, Lanzhou University, Lanzhou 730000, Gansu, P. R. China
|
24
|
Savisaar R, Hurst LD. Exonic splice regulation imposes strong selection at synonymous sites. Genome Res 2018; 28:1442-1454. [PMID: 30143596 PMCID: PMC6169883 DOI: 10.1101/gr.233999.117] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 01/11/2018] [Accepted: 07/31/2018] [Indexed: 01/17/2023]
Abstract
What proportion of coding sequence nucleotides have roles in splicing, and how strong is the selection that maintains them? Despite a large body of research into exonic splice regulatory signals, these questions have not been answered. This is because, to our knowledge, previous investigations have not explicitly disentangled the frequency of splice regulatory elements from the strength of the evolutionary constraint under which they evolve. Current data are consistent both with a scenario of weak and diffuse constraint, enveloping large swaths of sequence, as well as with well-defined pockets of strong purifying selection. In the former case, natural selection on exonic splice enhancers (ESEs) might primarily act as a slight modifier of codon usage bias. In the latter, mutations that disrupt ESEs are likely to have large fitness and, potentially, clinical effects. To distinguish between these scenarios, we used several different methods to determine the distribution of selection coefficients for new mutations within ESEs. The analyses converged to suggest that ∼15%-20% of fourfold degenerate sites are part of functional ESEs. Most of these sites are under strong evolutionary constraint. Therefore, exonic splice regulation does not simply impose a weak bias that gently nudges coding sequence evolution in a particular direction. Rather, the selection to preserve these motifs is a strong force that severely constrains the evolution of a substantial proportion of coding nucleotides. Thus synonymous mutations that disrupt ESEs should be considered as a potentially common cause of single-locus genetic disorders.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
|
25
|
Loy DE, Plenderleith LJ, Sundararaman SA, Liu W, Gruszczyk J, Chen YJ, Trimboli S, Learn GH, MacLean OA, Morgan ALK, Li Y, Avitto AN, Giles J, Calvignac-Spencer S, Sachse A, Leendertz FH, Speede S, Ayouba A, Peeters M, Rayner JC, Tham WH, Sharp PM, Hahn BH. Evolutionary history of human Plasmodium vivax revealed by genome-wide analyses of related ape parasites. Proc Natl Acad Sci U S A 2018; 115:E8450-E8459. [PMID: 30127015 PMCID: PMC6130405 DOI: 10.1073/pnas.1810053115] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Indexed: 01/18/2023] Open
Abstract
Wild-living African apes are endemically infected with parasites that are closely related to human Plasmodium vivax, a leading cause of malaria outside Africa. This finding suggests that the origin of P. vivax was in Africa, even though the parasite is now rare in humans there. To elucidate the emergence of human P. vivax and its relationship to the ape parasites, we analyzed genome sequence data of P. vivax strains infecting six chimpanzees and one gorilla from Cameroon, Gabon, and Côte d'Ivoire. We found that ape and human parasites share nearly identical core genomes, differing by only 2% of coding sequences. However, compared with the ape parasites, human strains of P. vivax exhibit about 10-fold less diversity and have a relative excess of nonsynonymous nucleotide polymorphisms, with site-frequency spectra suggesting they are subject to greatly relaxed purifying selection. These data suggest that human P. vivax has undergone an extreme bottleneck, followed by rapid population expansion. Investigating potential host-specificity determinants, we found that ape P. vivax parasites encode intact orthologs of three reticulocyte-binding protein genes (rbp2d, rbp2e, and rbp3), which are pseudogenes in all human P. vivax strains. However, binding studies of recombinant RBP2e and RBP3 proteins to human, chimpanzee, and gorilla erythrocytes revealed no evidence of host-specific barriers to red blood cell invasion. These data suggest that, from an ancient stock of P. vivax parasites capable of infecting both humans and apes, a severely bottlenecked lineage emerged out of Africa and underwent rapid population growth as it spread globally.
Affiliation(s)
- Dorothy E Loy
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
  - Department of Microbiology, University of Pennsylvania, Philadelphia, PA 19104
- Lindsey J Plenderleith
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
  - Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Sesh A Sundararaman
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
  - Department of Microbiology, University of Pennsylvania, Philadelphia, PA 19104
- Weimin Liu
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Jakub Gruszczyk
  - Walter and Eliza Hall Institute of Medical Research, Parkville VIC 3052, Australia
- Yi-Jun Chen
  - Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
  - Department of Medical Biology, The University of Melbourne, Parkville VIC 3010, Australia
- Stephanie Trimboli
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Gerald H Learn
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Oscar A MacLean
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
  - Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Alex L K Morgan
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
  - Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Yingying Li
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Alexa N Avitto
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Jasmin Giles
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Sheri Speede
  - Sanaga-Yong Chimpanzee Rescue Center, International Development Association-Africa, Portland, OR 97208
- Ahidjo Ayouba
  - Recherche Translationnelle Appliquée au VIH et aux Maladies Infectieuses, Institut de Recherche pour le Développement, University of Montpellier, INSERM, 34090 Montpellier, France
- Martine Peeters
  - Recherche Translationnelle Appliquée au VIH et aux Maladies Infectieuses, Institut de Recherche pour le Développement, University of Montpellier, INSERM, 34090 Montpellier, France
- Julian C Rayner
  - Malaria Programme, Wellcome Trust Sanger Institute, Genome Campus, Hinxton Cambridgeshire CB10 1SA, United Kingdom
- Wai-Hong Tham
  - Walter and Eliza Hall Institute of Medical Research, Parkville VIC 3052, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville VIC 3010, Australia
- Paul M Sharp
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
  - Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Beatrice H Hahn
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA 19104
  - Department of Microbiology, University of Pennsylvania, Philadelphia, PA 19104
|
26
|
Patel R, Scheinfeldt LB, Sanderford MD, Lanham TR, Tamura K, Platt A, Glicksberg BS, Xu K, Dudley JT, Kumar S. Adaptive Landscape of Protein Variation in Human Exomes. Mol Biol Evol 2018; 35:2015-2025. [PMID: 29846678 PMCID: PMC6063297 DOI: 10.1093/molbev/msy107] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022] Open
Abstract
The human genome contains hundreds of thousands of missense mutations. However, only a handful of these variants are known to be adaptive, which implies that adaptation through protein sequence change is an extremely rare phenomenon in human evolution. Alternatively, existing methods may lack the power to pinpoint adaptive variation. We have developed and applied an Evolutionary Probability Approach (EPA) to discover candidate adaptive polymorphisms (CAPs) through the discordance between allelic evolutionary probabilities and their observed frequencies in human populations. EPA reveals thousands of missense CAPs, which suggest that a large number of previously optimal alleles experienced a reversal of fortune in the human lineage. We explored nonadaptive mechanisms to explain CAPs, including the effects of demography, mutation rate variability, and negative and positive selective pressures in modern humans. Many nonadaptive hypotheses were tested, but failed to explain the data, which suggests that a large proportion of CAP alleles have increased in frequency due to beneficial selection. This suggestion is supported by the fact that a vast majority of adaptive missense variants discovered previously in humans are CAPs, and hundreds of CAP alleles are protective in genotype-phenotype association data. Our integrated phylogenomic and population genetic EPA approach predicts the existence of thousands of nonneutral candidate variants in the human proteome. We expect this collection to be enriched in beneficial variation. The EPA approach can be applied to discover candidate adaptive variation in any protein, population, or species for which allele frequency data and reliable multispecies alignments are available.
Affiliation(s)
- Ravi Patel
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
  - Department of Biology, Temple University, Philadelphia, PA
- Laura B Scheinfeldt
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
  - Department of Biology, Temple University, Philadelphia, PA
  - Coriell Institute for Medical Research, Camden, NJ
- Maxwell D Sanderford
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Tamera R Lanham
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Koichiro Tamura
  - Department of Biology, Tokyo Metropolitan University, Tokyo, Japan
- Alexander Platt
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
  - Department of Biology, Temple University, Philadelphia, PA
  - Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA
- Benjamin S Glicksberg
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Ke Xu
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Joel T Dudley
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Sudhir Kumar
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
  - Department of Biology, Temple University, Philadelphia, PA
  - Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
|
27
|
Zhang H, Dou S, He F, Luo J, Wei L, Lu J. Genome-wide maps of ribosomal occupancy provide insights into adaptive evolution and regulatory roles of uORFs during Drosophila development. PLoS Biol 2018; 16:e2003903. [PMID: 30028832 PMCID: PMC6070289 DOI: 10.1371/journal.pbio.2003903] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 08/15/2017] [Revised: 08/01/2018] [Accepted: 07/03/2018] [Indexed: 11/19/2022] Open
Abstract
Upstream open reading frames (uORFs) play important roles in regulating the main coding DNA sequences (CDSs) via translational repression. Despite their prevalence in the genomes, uORFs are overall discriminated against by natural selection. However, it remains unclear why in the genomes there are so many uORFs more conserved than expected under the assumption of neutral evolution. Here, we generated genome-wide maps of translational efficiency (TE) at the codon level throughout the life cycle of Drosophila melanogaster. We identified 35,735 uORFs that were expressed, and 32,224 (90.2%) of them showed evidence of ribosome occupancy during Drosophila development. The ribosome occupancy of uORFs is determined by genomic features, such as optimized sequence contexts around their start codons, a shorter distance to CDSs, and higher coding potentials. Our population genomic analysis suggests the segregating mutations that create or disrupt uORFs are overall deleterious in D. melanogaster. However, we found for the first time that many (68.3% of) newly fixed uORFs that are associated with ribosomes in D. melanogaster are driven by positive Darwinian selection. Our findings also suggest that uORFs play a vital role in controlling the translational program in Drosophila. Moreover, we found that many uORFs are transcribed or translated in a developmental stage-, sex-, or tissue-specific manner, suggesting that selective transcription or translation of uORFs could potentially modulate the TE of the downstream CDSs during Drosophila development.
Upstream open reading frames (uORFs) in the 5′ untranslated regions (UTRs) of messenger RNAs can potentially inhibit translation of the downstream regions that encode proteins by sequestering the protein-making machinery, the ribosome. Moreover, mutations that destroy existing uORFs or create new ones are known to cause human disease. Although mutations that create new uORFs are generally deleterious and are selected against, many uORFs are evolutionarily conserved across eukaryotic species. To resolve this dilemma, we used extensive mRNA-Seq and ribosome profiling to generate high-resolution genome-wide maps of ribosome occupancy and translational efficiency (TE) during the life cycle of the fruit fly D. melanogaster. This allowed us to identify the sequence features of uORFs that influence their ability to associate with ribosomes. We demonstrate for the first time that the majority of the newly fixed uORFs in D. melanogaster, especially the translated ones, are under positive Darwinian selection. We also show that uORFs exert widespread repressive effects on the translation of the downstream protein-coding region. We find that many uORFs are transcribed or translated in a developmental stage-, sex-, or tissue-specific manner. Our results suggest that during Drosophila development, changes in the TE of uORFs, as well as the inclusion/exclusion of uORFs, are frequently exploited to inversely influence the translation of the downstream protein-coding regions. Our study provides novel insights into the molecular mechanisms and functional consequences of uORF-mediated regulation.
Affiliation(s)
- Hong Zhang
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Shengqian Dou
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Feng He
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Junjie Luo
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Liping Wei
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Jian Lu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
|
28
|
Inferring the Probability of the Derived vs. the Ancestral Allelic State at a Polymorphic Site. Genetics 2018; 209:897-906. [PMID: 29769282 PMCID: PMC6028244 DOI: 10.1534/genetics.118.301120] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 03/23/2018] [Accepted: 05/14/2018] [Indexed: 12/03/2022]
Abstract
It is known that the allele ancestral to the variation at a polymorphic site cannot be assigned with certainty, and that the most frequently used method to assign the ancestral state—maximum parsimony—is prone to misinference. Estimates of counts of sites that have a certain number of copies of the derived allele in a sample (the unfolded site frequency spectrum, uSFS) made by parsimony are therefore also biased. We previously developed a maximum likelihood method to estimate the uSFS for a focal species using information from two outgroups while assuming simple models of nucleotide substitution. Here, we extend this approach to allow multiple outgroups (implemented for three outgroups), potentially any phylogenetic tree topology, and more complex models of nucleotide substitution. We find, however, that two outgroups and the Kimura two-parameter model are adequate for uSFS inference in most cases. We show that using parsimony to infer the ancestral state at a specific site seriously breaks down in two situations. The first is where the outgroups provide no information about the ancestral state of variation in the focal species. In this case, nucleotide variation will be underestimated if such sites are excluded. The second is where the minor allele in the focal species agrees with the allelic state of the outgroups. In this situation, parsimony tends to overestimate the probability of the major allele being derived, because it fails to account for the fact that sites with a high frequency of the derived allele tend to be rare. We present a method that corrects this deficiency and is capable of providing nearly unbiased estimates of ancestral state probabilities on a site-by-site basis and the uSFS.
|
29
|
Distinguishing Among Evolutionary Forces Acting on Genome-Wide Base Composition: Computer Simulation Analysis of Approximate Methods for Inferring Site Frequency Spectra of Derived Mutations. G3-GENES GENOMES GENETICS 2018; 8:1755-1769. [PMID: 29588382 PMCID: PMC5940166 DOI: 10.1534/g3.117.300512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Inferred ancestral nucleotide states are increasingly employed in analyses of within- and between-species genome variation. Although numerous studies have focused on ancestral inference among distantly related lineages, approaches to infer ancestral states in polymorphism data have received less attention. Recently developed approaches that employ complex transition matrices allow us to infer ancestral nucleotide sequence in various evolutionary scenarios of base composition. However, the requirement of a single gene tree to calculate a likelihood is an important limitation for conducting ancestral inference using within-species variation in recombining genomes. To resolve this problem, and to extend the applicability of ancestral inference in studies of base composition evolution, we first evaluate three previously proposed methods to infer ancestral nucleotide sequences among within- and between-species sequence variation data. The methods employ a single allele, bifurcating tree, or a star tree for within-species variation data. Using simulated nucleotide sequences, we employ ancestral inference to infer fixations and polymorphisms. We find that all three methods show biased inference. We modify the bifurcating tree method to include weights to adjust for an expected site frequency spectrum, “bifurcating tree with weighting” (BTW). Our simulation analysis shows that the BTW method can substantially improve the reliability and robustness of ancestral inference in a range of scenarios that include non-neutral and/or non-stationary base composition evolution.
|
30
|
Warner MR, Mikheyev AS, Linksvayer TA. Genomic Signature of Kin Selection in an Ant with Obligately Sterile Workers. Mol Biol Evol 2017; 34:1780-1787. [PMID: 28419349 PMCID: PMC5455959 DOI: 10.1093/molbev/msx123] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Indexed: 01/17/2023]
Abstract
Kin selection is thought to drive the evolution of cooperation and conflict, but the specific genes and genome-wide patterns shaped by kin selection are unknown. We identified thousands of genes associated with the sterile ant worker caste, the archetype of an altruistic phenotype shaped by kin selection, and then used population and comparative genomic approaches to study patterns of molecular evolution at these genes. Consistent with population genetic theoretical predictions, worker-upregulated genes experienced reduced selection compared with genes upregulated in reproductive castes. Worker-upregulated genes included more taxonomically restricted genes, indicating that the worker caste has recruited more novel genes, yet these genes also experienced reduced selection. Our study identifies a putative genomic signature of kin selection and helps to integrate emerging sociogenomic data with longstanding social evolution theory.
Affiliation(s)
- Michael R Warner
- Department of Biology, University of Pennsylvania, Philadelphia, PA
- Alexander S Mikheyev
- Ecology and Evolution Unit, Okinawa Institute of Science and Technology, Onna-son, Okinawa, Japan
|
31
|
Abstract
Population geneticists have long sought to understand the contribution of natural selection to molecular evolution. A variety of approaches have been proposed that use population genetics theory to quantify the rate and strength of positive selection acting in a species’ genome. In this review we discuss methods that use patterns of between-species nucleotide divergence and within-species diversity to estimate positive selection parameters from population genomic data. We also discuss recently proposed methods to detect positive selection from a population’s haplotype structure. The application of these tests has resulted in the detection of pervasive adaptive molecular evolution in multiple species.
|
32
|
Inference of Distribution of Fitness Effects and Proportion of Adaptive Substitutions from Polymorphism Data. Genetics 2017; 207:1103-1119. [PMID: 28951530 PMCID: PMC5676230 DOI: 10.1534/genetics.117.300323] [Citation(s) in RCA: 87] [Impact Index Per Article: 12.4] [Received: 07/05/2017] [Accepted: 09/13/2017] [Indexed: 11/18/2022]
Abstract
The distribution of fitness effects (DFE) encompasses the fraction of deleterious, neutral, and beneficial mutations. It conditions the evolutionary trajectory of populations, as well as the rate of adaptive molecular evolution (α). Inferring DFE and α from patterns of polymorphism, as given through the site frequency spectrum (SFS) and divergence data, has been a longstanding goal of evolutionary genetics. A widespread assumption shared by previous inference methods is that beneficial mutations only contribute negligibly to the polymorphism data. Hence, a DFE comprising only deleterious mutations tends to be estimated from SFS data, and α is then predicted by contrasting the SFS with divergence data from an outgroup. We develop a hierarchical probabilistic framework that extends previous methods to infer DFE and α from polymorphism data alone. We use extensive simulations to examine the performance of our method. While an outgroup is still needed to obtain an unfolded SFS, we show that both a DFE, comprising both deleterious and beneficial mutations, and α can be inferred without using divergence data. We also show that not accounting for the contribution of beneficial mutations to polymorphism data leads to substantially biased estimates of the DFE and α. We compare our framework with one of the most widely used inference methods available and apply it on a recently published chimpanzee exome data set.
|
33
|
Estimating the parameters of background selection and selective sweeps in Drosophila in the presence of gene conversion. Proc Natl Acad Sci U S A 2017; 114:E4762-E4771. [PMID: 28559322 DOI: 10.1073/pnas.1619434114] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Indexed: 12/15/2022]
Abstract
We used whole-genome resequencing data from a population of Drosophila melanogaster to investigate the causes of the negative correlation between the within-population synonymous nucleotide site diversity (πS) of a gene and its degree of divergence from related species at nonsynonymous nucleotide sites (KA). By using the estimated distributions of mutational effects on fitness at nonsynonymous and UTR sites, we predicted the effects of background selection at sites within a gene on πS and found that these could account for only part of the observed correlation between πS and KA. We developed a model of the effects of selective sweeps that included gene conversion as well as crossing over. We used this model to estimate the average strength of selection on positively selected mutations in coding sequences and in UTRs, as well as the proportions of new mutations that are selectively advantageous. Genes with high levels of selective constraint on nonsynonymous sites were found to have lower strengths of positive selection and lower proportions of advantageous mutations than genes with low levels of constraint. Overall, background selection and selective sweeps within a typical gene reduce its synonymous diversity to ∼75% of its value in the absence of selection, with larger reductions for genes with high KA. Gene conversion has a major effect on the estimates of the parameters of positive selection, such that the estimated strength of selection on favorable mutations is greatly reduced if it is ignored.
|
34
|
Hubby and Lewontin on Protein Variation in Natural Populations: When Molecular Genetics Came to the Rescue of Population Genetics. Genetics 2017; 203:1497-503. [PMID: 27516612 DOI: 10.1534/genetics.115.185975] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
The 1966 GENETICS papers by John Hubby and Richard Lewontin were a landmark in the study of genome-wide levels of variability. They used the technique of gel electrophoresis of enzymes and proteins to study variation in natural populations of Drosophila pseudoobscura, at a set of loci that had been chosen purely for technical convenience, without prior knowledge of their levels of variability. Together with the independent study of human populations by Harry Harris, this seminal study provided the first relatively unbiased picture of the extent of genetic variability in protein sequences within populations, revealing that many genes had surprisingly high levels of diversity. These papers stimulated a large research program that found similarly high electrophoretic variability in many different species and led to statistical tools for interpreting the data in terms of population genetics processes such as genetic drift, balancing and purifying selection, and the effects of selection on linked variants. The current use of whole-genome sequences in studies of variation is the direct descendant of this pioneering work.
|
35
|
Abstract
Molecular population genetics aims to explain genetic variation and molecular evolution from population genetics principles. The field was born 50 years ago with the first measures of genetic variation in allozyme loci, continued with the nucleotide sequencing era, and is currently in the era of population genomics. During this period, molecular population genetics has been revolutionized by progress in data acquisition and theoretical developments. The conceptual elegance of the neutral theory of molecular evolution or the footprint carved by natural selection on the patterns of genetic variation are two examples of the vast number of inspiring findings of population genetics research. Since the inception of the field, Drosophila has been the prominent model species: molecular variation in populations was first described in Drosophila and most of the population genetics hypotheses were tested in Drosophila species. In this review, we describe the main concepts, methods, and landmarks of molecular population genetics, using the Drosophila model as a reference. We describe the different genetic data sets made available by advances in molecular technologies, and the theoretical developments fostered by these data. Finally, we review the results and new insights provided by the population genomics approach, and conclude by enumerating challenges and new lines of inquiry posed by increasingly large population scale sequence data.
|
36
|
Charlesworth et al. on Background Selection and Neutral Diversity. Genetics 2017; 204:829-832. [PMID: 28114095 DOI: 10.1534/genetics.116.196170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
37
|
Abstract
Using data from 83 isolates from a single population, the population genomics of the microcrustacean Daphnia pulex are described and compared to current knowledge for the only other well-studied invertebrate, Drosophila melanogaster. These two species are quite similar with respect to effective population sizes and mutation rates, although some features of recombination appear to be different, with linkage disequilibrium being elevated at short ([Formula: see text] bp) distances in D. melanogaster and at long distances in D. pulex. The study population adheres closely to the expectations under Hardy-Weinberg equilibrium, and reflects a past population history of no more than a twofold range of variation in effective population size. Fourfold redundant silent sites and a restricted region of intronic sites appear to evolve in a nearly neutral fashion, providing a powerful tool for population genetic analyses. Amino acid replacement sites are predominantly under strong purifying selection, as are a large fraction of sites in UTRs and intergenic regions, but the majority of SNPs at such sites that rise to frequencies [Formula: see text] appear to evolve in a nearly neutral fashion. All forms of genomic sites (including replacement sites within codons, and intergenic and UTR regions) appear to be experiencing an [Formula: see text] higher level of selection scaled to the power of drift in D. melanogaster, but this may in part be a consequence of recent demographic changes. These results establish D. pulex as an excellent system for future work on the evolutionary genomics of natural populations.
|