1
Marsh JI, Johri P. Biases in ARG-Based Inference of Historical Population Size in Populations Experiencing Selection. Mol Biol Evol 2024; 41:msae118. [PMID: 38874402 PMCID: PMC11245712 DOI: 10.1093/molbev/msae118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/05/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024] Open
Abstract
Inferring the demographic history of populations provides fundamental insights into species dynamics and is essential for developing a null model to accurately study selective processes. However, background selection and selective sweeps can produce genomic signatures at linked sites that mimic or mask signals associated with historical population size change. While the theoretical biases introduced by the linked effects of selection have been well established, it is unclear whether ancestral recombination graph (ARG)-based approaches to demographic inference in typical empirical analyses are susceptible to misinference due to these effects. To address this, we developed highly realistic forward simulations of human and Drosophila melanogaster populations, including empirically estimated variability of gene density, mutation rates, recombination rates, purifying, and positive selection, across different historical demographic scenarios, to broadly assess the impact of selection on demographic inference using a genealogy-based approach. Our results indicate that the linked effects of selection minimally impact demographic inference for human populations, although it could cause misinference in populations with similar genome architecture and population parameters experiencing more frequent recurrent sweeps. We found that accurate demographic inference of D. melanogaster populations by ARG-based methods is compromised by the presence of pervasive background selection alone, leading to spurious inferences of recent population expansion, which may be further worsened by recurrent sweeps, depending on the proportion and strength of beneficial mutations. Caution and additional testing with species-specific simulations are needed when inferring population history with non-human populations using ARG-based approaches to avoid misinference due to the linked effects of selection.
Affiliation(s)
- Jacob I Marsh
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Parul Johri
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC 27599, USA
2
On the effects of selection and mutation on species tree inference. Mol Phylogenet Evol 2023; 179:107650. [PMID: 36441104 DOI: 10.1016/j.ympev.2022.107650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 10/17/2022] [Accepted: 10/18/2022] [Indexed: 11/24/2022]
Abstract
The effect of selection acting on regions of the genome on the accuracy of species-level phylogenetic inference using methods that do not explicitly model selection is an open question that is relevant to most, if not all, phylogenomic studies. To address this, we derive a mathematical approximation to the Wright-Fisher model with mutation and selection in the limit as the population size becomes large. In contrast to previous approximations based on diffusion processes, our approximation can be used to study the distribution of coalescent times for an arbitrary number of lineages, allowing calculation of the probability distribution of gene genealogies under the coalescent model. We use these calculations to show that direct selection at strengths typically encountered in practice has only a small effect on the distribution of coalescent times, and hence on the distribution of gene trees. This implies that many coalescent-based methods for estimating the species tree topology will be robust to the presence of selection in a subset of the underlying genes. Selection will, however, bias the estimation of speciation times, causing them to underestimate the true speciation times. Our model captures the effects of selection on the genealogies that generate the observed sequence data, but does not model selective pressures that act only on the subsequent sequences or that negatively impact gene tree estimation.
3
Johri P, Eyre-Walker A, Gutenkunst RN, Lohmueller KE, Jensen JD. On the prospect of achieving accurate joint estimation of selection with population history. Genome Biol Evol 2022; 14:6604401. [PMID: 35675379 PMCID: PMC9254643 DOI: 10.1093/gbe/evac088] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Accepted: 06/02/2022] [Indexed: 11/15/2022] Open
Abstract
As both natural selection and population history can affect genome-wide patterns of variation, disentangling the contributions of each has remained as a major challenge in population genetics. We here discuss historical and recent progress towards this goal—highlighting theoretical and computational challenges that remain to be addressed, as well as inherent difficulties in dealing with model complexity and model violations—and offer thoughts on potentially fruitful next steps.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Ryan N Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA; Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
4
Boitard S, Arredondo A, Chikhi L, Mazet O. Heterogeneity in effective size across the genome: effects on the inverse instantaneous coalescence rate (IICR) and implications for demographic inference under linked selection. Genetics 2022; 220:6512058. [PMID: 35100421 PMCID: PMC8893248 DOI: 10.1093/genetics/iyac008] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/29/2021] [Accepted: 01/01/2022] [Indexed: 01/22/2023] Open
Abstract
The relative contribution of selection and neutrality in shaping species genetic diversity is one of the most central and controversial questions in evolutionary theory. Genomic data provide growing evidence that linked selection, i.e. the modification of genetic diversity at neutral sites through linkage with selected sites, might be pervasive over the genome. Several studies proposed that linked selection could be modeled as first approximation by a local reduction (e.g. purifying selection, selective sweeps) or increase (e.g. balancing selection) of effective population size (Ne). At the genome-wide scale, this leads to variations of Ne from one region to another, reflecting the heterogeneity of selective constraints and recombination rates between regions. We investigate here the consequences of such genomic variations of Ne on the genome-wide distribution of coalescence times. The underlying motivation concerns the impact of linked selection on demographic inference, because the distribution of coalescence times is at the heart of several important demographic inference approaches. Using the concept of inverse instantaneous coalescence rate, we demonstrate that in a panmictic population, linked selection always results in a spurious apparent decrease of Ne along time. Balancing selection has a particularly large effect, even when it concerns a very small part of the genome. We also study more general models including genuine population size changes, population structure or transient selection and find that the effect of linked selection can be significantly reduced by that of population structure. The models and conclusions presented here are also relevant to the study of other biological processes generating apparent variations of Ne along the genome.
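The IICR argument in this abstract can be illustrated with a small numerical sketch. It assumes, as the abstract suggests, that linked selection is approximated by splitting the genome into classes that each have their own constant relative Ne. The pairwise coalescence time is then a mixture of exponentials, and the IICR is the survival function of that mixture divided by its density. The class weights and sizes below are hypothetical values chosen for illustration, not figures from the paper, and the function name is ours.

```python
import math

def iicr(t, weights, sizes):
    """IICR at time t (backward in time, coalescent units) for a genome
    made of classes with proportions `weights` and relative Ne `sizes`.
    IICR(t) = P(T2 > t) / f_T2(t) for a mixture of exponentials."""
    survival = sum(a * math.exp(-t / lam) for a, lam in zip(weights, sizes))
    density = sum((a / lam) * math.exp(-t / lam) for a, lam in zip(weights, sizes))
    return survival / density

# Hypothetical genome: 80% of sites at the reference Ne, 20% at a reduced Ne.
weights = [0.8, 0.2]
sizes = [1.0, 0.3]

times = [0.0, 0.5, 1.0, 2.0, 4.0]
curve = [iicr(t, weights, sizes) for t in times]
for t, value in zip(times, curve):
    print(f"t = {t:3.1f}  IICR = {value:.4f}")
```

In this sketch the IICR starts at the harmonic mean of the class sizes at the present and rises toward the largest class size in the past, even though every class has a constant size. Read forward in time, a method that interprets the IICR as a population size history would report a decline.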
Affiliation(s)
- Simon Boitard
- CBGP, Université de Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montferrier-sur-Lez 34988, France
- Corresponding author: Université de Montpellier, CIRAD, INRAE, Institut Agro, IRD, 755 Avenue du Campus Agropolis, CS 30016, Montferrier-sur-Lez 34988, France.
- Armando Arredondo
- Institut National des Sciences Appliquées, Institut de Mathématiques de Toulouse, Université de Toulouse, Toulouse 31062, France
- Lounès Chikhi
- Instituto Gulbenkian de Ciência, Oeiras P-2780-156, Portugal
- Laboratoire Évolution & Diversité Biologique (EDB UMR 5174), CNRS, IRD, UPS, Université de Toulouse Midi-Pyrénées, Toulouse 31062, France
- Olivier Mazet
- Institut National des Sciences Appliquées, Institut de Mathématiques de Toulouse, Université de Toulouse, Toulouse 31062, France
5
Lucek K, Willi Y. Drivers of linkage disequilibrium across a species' geographic range. PLoS Genet 2021; 17:e1009477. [PMID: 33770075 PMCID: PMC8026057 DOI: 10.1371/journal.pgen.1009477] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/12/2020] [Revised: 04/07/2021] [Accepted: 03/09/2021] [Indexed: 11/25/2022] Open
Abstract
While linkage disequilibrium (LD) is an important parameter in genetics and evolutionary biology, the drivers of LD remain elusive. Using whole-genome sequences from across a species’ range, we assessed the impact of demographic history and mating system on LD. Both range expansion and a shift from outcrossing to selfing in North American Arabidopsis lyrata were associated with increased average genome-wide LD. Our results indicate that range expansion increases short-distance LD at the farthest range edges by about the same amount as a shift to selfing. However, the extent over which LD in genic regions unfolds was shorter for range expansion compared to selfing. Linkage among putatively neutral variants and between neutral and deleterious variants increased to a similar degree with range expansion, providing support that genome-wide LD was positively associated with mutational load. As a consequence, LD combined with mutational load may decelerate range expansions and set range limits. Finally, a small number of genes were identified as LD outliers, suggesting that they experience selection by either of the two demographic processes. These included genes involved in flowering and photoperiod for range expansion, and the self-incompatibility locus for mating system.
Nearby genomic variants are often co-inherited because of limited recombination. The extent of non-random association of alleles at different loci is called linkage disequilibrium (LD) and is commonly used in genomic analyses, for example to detect regions under selection or to determine effective population size. Here we reversed testing and addressed how demographic history may affect LD within a species. Using genomic data from more than a thousand individuals of North American Arabidopsis lyrata from across the entire species’ range, we quantified the effect of postglacial range expansion and a shift in mating system from outcrossing to selfing on LD. We show that both factors lead to increased LD, and that the maximal effect of range expansion is comparable with a shift in mating system to selfing. Heightened LD involves deleterious mutations, and therefore, LD can also serve as an indicator of mutation accumulation. Furthermore, we provide evidence that some genes experienced stronger increases in LD possibly due to selection associated with the two demographic changes. Our results provide a novel and broad view on the evolutionary factors shaping LD that may also apply to the very many species that underwent postglacial range expansion.
Affiliation(s)
- Kay Lucek
- Department of Environmental Sciences, University of Basel, Basel, Switzerland
- Yvonne Willi
- Department of Environmental Sciences, University of Basel, Basel, Switzerland
6
Hayes K, Barton HJ, Zeng K. A Study of Faster-Z Evolution in the Great Tit (Parus major). Genome Biol Evol 2021; 12:210-222. [PMID: 32119100 PMCID: PMC7144363 DOI: 10.1093/gbe/evaa044] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 02/26/2020] [Indexed: 12/17/2022] Open
Abstract
Sex chromosomes contribute substantially to key evolutionary processes such as speciation and adaptation. Several theories suggest that evolution could occur more rapidly on sex chromosomes, but currently our understanding of whether and how this occurs is limited. Here, we present an analysis of the great tit (Parus major) genome, aiming to detect signals of faster-Z evolution. We find mixed evidence of faster divergence on the Z chromosome than autosomes, with significantly higher divergence being found in ancestral repeats, but not at 4- or 0-fold degenerate sites. Interestingly, some 4-fold sites appear to be selectively constrained, which may mislead analyses that use these sites as the neutral reference (e.g., dN/dS). Consistent with other studies in birds, the mutation rate is significantly higher in males than females, and the long-term Z-to-autosome effective population size ratio is only 0.5, significantly lower than the expected value of 0.75. These are indicative of male-driven evolution and high variance in male reproductive success, respectively. We find no evidence for an increased efficacy of positive selection on the Z chromosome. In contrast, the Z chromosome in great tits appears to be affected by increased genetic drift, which has led to detectable signals of weakened intensity of purifying selection. These results provide further evidence that the Z chromosome often has a low effective population size, and that this has important consequences for its evolution. They also highlight the importance of considering multiple factors that can affect the rate of evolution and effective population sizes of sex chromosomes.
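The expected Z-to-autosome ratio of 0.75 quoted in this abstract follows from counting chromosome copies, and a short sketch shows how a smaller effective number of breeding males lowers it. The formulas are the standard textbook approximations for unequal effective numbers of males and females in a ZW system. They are an assumption made here for illustration, not the estimation method used in the paper, and the function names and population numbers are hypothetical.

```python
def ne_autosome(n_males, n_females):
    """Effective size for autosomes with unequal numbers of breeding sexes."""
    return 4.0 * n_males * n_females / (n_males + n_females)

def ne_z(n_males, n_females):
    """Effective size for the Z chromosome (males ZZ, females ZW)."""
    return 9.0 * n_males * n_females / (4.0 * n_females + 2.0 * n_males)

def z_to_a_ratio(n_males, n_females):
    """Ratio of Z-linked to autosomal effective population size."""
    return ne_z(n_males, n_females) / ne_autosome(n_males, n_females)

# Equal effective numbers of males and females give the neutral expectation.
print(z_to_a_ratio(500, 500))   # 0.75

# Fewer effective males (for example, high variance in male reproductive
# success) pulls the ratio below 0.75.
print(z_to_a_ratio(100, 500))
```

Under these particular formulas the ratio approaches 9/16 as the effective number of males becomes very small, so the sketch captures only the direction of the effect described in the abstract.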
Affiliation(s)
- Kai Hayes
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom
- Henry J Barton
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom; Organismal and Evolutionary Biology Research Program, University of Helsinki, Finland
- Kai Zeng
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom
7
Rennison DJ, Delmore KE, Samuk K, Owens GL, Miller SE. Shared Patterns of Genome-Wide Differentiation Are More Strongly Predicted by Geography Than by Ecology. Am Nat 2020; 195:192-200. [DOI: 10.1086/706476] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/03/2022]
8
Matthey‐Doret R, Whitlock MC. Background selection and FST: Consequences for detecting local adaptation. Mol Ecol 2019; 28:3902-3914. [DOI: 10.1111/mec.15197] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 05/20/2018] [Revised: 06/19/2019] [Accepted: 07/03/2019] [Indexed: 01/03/2023]
Affiliation(s)
- Remi Matthey‐Doret
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Michael C. Whitlock
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
9
Mattila TM, Laenen B, Horvath R, Hämälä T, Savolainen O, Slotte T. Impact of demography on linked selection in two outcrossing Brassicaceae species. Ecol Evol 2019; 9:9532-9545. [PMID: 31534673 PMCID: PMC6745670 DOI: 10.1002/ece3.5463] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/25/2019] [Revised: 06/28/2019] [Accepted: 07/02/2019] [Indexed: 12/13/2022] Open
Abstract
Genetic diversity is shaped by mutation, genetic drift, gene flow, recombination, and selection. The dynamics and interactions of these forces shape genetic diversity across different parts of the genome, between populations and species. Here, we have studied the effects of linked selection on nucleotide diversity in outcrossing populations of two Brassicaceae species, Arabidopsis lyrata and Capsella grandiflora, with contrasting demographic history. In agreement with previous estimates, we found evidence for a modest population size expansion thousands of generations ago, as well as efficient purifying selection in C. grandiflora. In contrast, the A. lyrata population exhibited evidence for very recent strong population size decline and weaker efficacy of purifying selection. Using multiple regression analyses with recombination rate and other genomic covariates as explanatory variables, we can explain 47% of the variance in neutral diversity in the C. grandiflora population, while in the A. lyrata population, only 11% of the variance was explained by the model. Recombination rate had a significant positive effect on neutral diversity in both species, suggesting that selection at linked sites has an effect on patterns of neutral variation. In line with this finding, we also found reduced neutral diversity in the vicinity of genes in the C. grandiflora population. However, in A. lyrata no such reduction in diversity was evident, a finding that is consistent with expectations of the impact of a recent bottleneck on patterns of neutral diversity near genes. This study thus empirically demonstrates how differences in demographic history modulate the impact of selection at linked sites in natural populations.
Affiliation(s)
- Tiina M. Mattila
- Department of Ecology and Genetics, University of Oulu, Oulu, Finland
- Present address: Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Benjamin Laenen
- Science for Life Laboratory, Department of Ecology, Environment, and Plant Sciences, Stockholm University, Stockholm, Sweden
- Robert Horvath
- Science for Life Laboratory, Department of Ecology, Environment, and Plant Sciences, Stockholm University, Stockholm, Sweden
- Tuomas Hämälä
- Department of Ecology and Genetics, University of Oulu, Oulu, Finland
- Biocenter Oulu, University of Oulu, Oulu, Finland
- Present address: Department of Plant and Microbial Biology, University of Minnesota Twin Cities, St. Paul, MN, USA
- Outi Savolainen
- Department of Ecology and Genetics, University of Oulu, Oulu, Finland
- Biocenter Oulu, University of Oulu, Oulu, Finland
- Tanja Slotte
- Science for Life Laboratory, Department of Ecology, Environment, and Plant Sciences, Stockholm University, Stockholm, Sweden
10
The Effects on Neutral Variability of Recurrent Selective Sweeps and Background Selection. Genetics 2019; 212:287-303. [PMID: 30923166 DOI: 10.1534/genetics.119.301951] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 01/18/2019] [Accepted: 03/19/2019] [Indexed: 12/11/2022] Open
Abstract
Levels of variability and rates of adaptive evolution may be affected by hitchhiking, the effect of selection on evolution at linked sites. Hitchhiking can be caused either by "selective sweeps" or by background selection, involving the spread of new favorable alleles or the elimination of deleterious mutations, respectively. Recent analyses of population genomic data have fitted models where both these processes act simultaneously, to infer the parameters of selection. Here, we investigate the consequences of relaxing a key assumption of some of these studies, that the time occupied by a selective sweep is negligible compared with the neutral coalescent time. We derive a new expression for the expected level of neutral variability in the presence of recurrent selective sweeps and background selection. We also derive approximate integral expressions for the effects of recurrent selective sweeps. The accuracy of the theoretical predictions was tested against multilocus simulations, with selection, recombination, and mutation parameters that are realistic for Drosophila melanogaster. In the presence of crossing over, there is approximate agreement between the theoretical and simulation results. We show that the observed relationships between the rate of crossing over, and the level of synonymous site diversity and rate of adaptive evolution in Drosophila are probably mainly caused by background selection, whereas selective sweeps and population size changes are needed to produce the observed distortions of the site frequency spectrum.
11
Shi CM, Yang Z. Coalescent-Based Analyses of Genomic Sequence Data Provide a Robust Resolution of Phylogenetic Relationships among Major Groups of Gibbons. Mol Biol Evol 2019; 35:159-179. [PMID: 29087487 PMCID: PMC5850733 DOI: 10.1093/molbev/msx277] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Indexed: 12/20/2022] Open
Abstract
The phylogenetic relationships among extant gibbon species remain unresolved despite numerous efforts using morphological, behavioral, and genetic data and the sequencing of whole genomes. A major challenge in reconstructing the gibbon phylogeny is the radiative speciation process, which resulted in extremely short internal branches in the species phylogeny and extensive incomplete lineage sorting with extensive gene-tree heterogeneity across the genome. Here, we analyze two genomic-scale data sets, with ∼10,000 putative noncoding and exonic loci, respectively, to estimate the species tree for the major groups of gibbons. We used the Bayesian full-likelihood method bpp under the multispecies coalescent model, which naturally accommodates incomplete lineage sorting and uncertainties in the gene trees. For comparison, we included three heuristic coalescent-based methods (mp-est, SVDQuartets, and astral) as well as concatenation. From both data sets, we infer the phylogeny for the four extant gibbon genera to be (Hylobates, (Nomascus, (Hoolock, Symphalangus))). We used simulation guided by the real data to evaluate the accuracy of the methods used. Astral, while not as efficient as bpp, performed well in estimation of the species tree even in presence of excessive incomplete lineage sorting. Concatenation, mp-est and SVDQuartets were unreliable when the species tree contains very short internal branches. Likelihood ratio test of gene flow suggests a small amount of migration from Hylobates moloch to H. pileatus, while cross-genera migration is absent or rare. Our results highlight the utility of coalescent-based methods in addressing challenging species tree problems characterized by short internal branches and rampant gene tree-species tree discordance.
Affiliation(s)
- Cheng-Min Shi
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom; Radcliffe Institute for Advanced Studies, Harvard University, Cambridge, MA 02138, USA
12
Irwin DE, Milá B, Toews DPL, Brelsford A, Kenyon HL, Porter AN, Grossen C, Delmore KE, Alcaide M, Irwin JH. A comparison of genomic islands of differentiation across three young avian species pairs. Mol Ecol 2018; 27:4839-4855. [PMID: 30187980 DOI: 10.1111/mec.14858] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 01/29/2018] [Revised: 07/10/2018] [Accepted: 07/14/2018] [Indexed: 02/06/2023]
Abstract
Detailed evaluations of genomic variation between sister species often reveal distinct chromosomal regions of high relative differentiation (i.e., "islands of differentiation" in FST), but there is much debate regarding the causes of this pattern. We briefly review the prominent models of genomic islands of differentiation and compare patterns of genomic differentiation in three closely related pairs of New World warblers with the goal of evaluating support for the four models. Each pair (MacGillivray's/mourning warblers; Townsend's/black-throated green warblers; and Audubon's/myrtle warblers) consists of forms that were likely separated in western and eastern North American refugia during cycles of Pleistocene glaciations and have now come into contact in western Canada, where each forms a narrow hybrid zone. We show strong differences between pairs in their patterns of genomic heterogeneity in FST, suggesting differing selective forces and/or differing genomic responses to similar selective forces among the three pairs. Across most of the genome, levels of within-group nucleotide diversity (πWithin) are almost as large as levels of between-group nucleotide distance (πBetween) within each pair, suggesting recent common ancestry and/or gene flow. In two pairs, a pattern of the FST peaks having low πBetween suggests that selective sweeps spread between geographically differentiated groups, followed by local differentiation. This "sweep-before-differentiation" model is consistent with signatures of gene flow within the yellow-rumped warbler species complex. These findings add to our growing understanding of speciation as a complex process that can involve phases of adaptive introgression among partially differentiated populations.
Affiliation(s)
- Darren E Irwin
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
- Borja Milá
- National Museum of Natural Sciences, Spanish National Research Council (CSIC), Madrid, Spain
- David P L Toews
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada; Cornell Lab of Ornithology & Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, New York
- Alan Brelsford
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada; Department of Evolution, Ecology and Organismal Biology, University of California, Riverside, California
- Haley L Kenyon
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada; Department of Biology, Queen's University, Biosciences Complex, Kingston, Ontario, Canada
- Alison N Porter
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
- Christine Grossen
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada; Department of Evolutionary Biology and Environmental Studies, University of Zürich, Zürich, Switzerland
- Kira E Delmore
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada; Max Planck Institute for Evolutionary Biology, Behavioural Genomics, Plön, Germany
- Miguel Alcaide
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada; Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Jessica H Irwin
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
13
Pouyet F, Aeschbacher S, Thiéry A, Excoffier L. Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences. eLife 2018; 7:e36317. [PMID: 30125248 PMCID: PMC6177262 DOI: 10.7554/elife.36317] [Citation(s) in RCA: 81] [Impact Index Per Article: 13.5] [Received: 03/01/2018] [Accepted: 08/17/2018] [Indexed: 12/15/2022] Open
Abstract
Disentangling the effect on genomic diversity of natural selection from that of demography is notoriously difficult, but necessary to properly reconstruct the history of species. Here, we use high-quality human genomic data to show that purifying selection at linked sites (i.e. background selection, BGS) and GC-biased gene conversion (gBGC) together affect as much as 95% of the variants of our genome. We find that the magnitude and relative importance of BGS and gBGC are largely determined by variation in recombination rate and base composition. Importantly, synonymous sites and non-transcribed regions are also affected, albeit to different degrees. Their use for demographic inference can lead to strong biases. However, by conditioning on genomic regions with recombination rates above 1.5 cM/Mb and mutation types (C↔G, A↔T), we identify a set of SNPs that is mostly unaffected by BGS or gBGC, and that avoids these biases in the reconstruction of human history.
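The conditioning step described at the end of this abstract amounts to a two-part filter on SNPs, sketched below. The threshold of 1.5 cM/Mb and the two mutation types (C↔G, A↔T) come from the abstract. The `Snp` record, its field names, and the example values are hypothetical, and the sketch assumes a local recombination rate has already been assigned to each SNP. It is not the authors' pipeline.

```python
from dataclasses import dataclass

# Mutation types named in the abstract: C<->G and A<->T. Neither changes the
# number of G/C bases, so GC-biased gene conversion should not favor one allele.
UNBIASED_PAIRS = {frozenset("CG"), frozenset("AT")}
MIN_RECOMB_CM_PER_MB = 1.5  # threshold quoted in the abstract

@dataclass
class Snp:
    position: int            # hypothetical coordinate
    ref: str                 # reference allele
    alt: str                 # alternate allele
    recomb_cm_per_mb: float  # local recombination rate, assumed precomputed

def keep_snp(snp):
    """Return True if the SNP passes both the recombination and type filters."""
    in_high_recomb = snp.recomb_cm_per_mb > MIN_RECOMB_CM_PER_MB
    allele_pair = frozenset((snp.ref.upper(), snp.alt.upper()))
    return in_high_recomb and allele_pair in UNBIASED_PAIRS

snps = [
    Snp(100, "C", "G", 2.3),  # passes both filters
    Snp(200, "A", "G", 2.3),  # dropped: A<->G is subject to gBGC
    Snp(300, "A", "T", 0.4),  # dropped: recombination rate too low
    Snp(400, "T", "A", 1.8),  # passes both filters
]
kept = [s.position for s in snps if keep_snp(s)]
print(kept)
```

The recombination threshold targets regions where background selection is expected to be weak, while the allele-pair check removes the mutation types that GC-biased gene conversion can distort.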
Affiliation(s)
- Fanny Pouyet
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Simon Aeschbacher
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Alexandre Thiéry
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Laurent Excoffier
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
14
Comeron JM. Background selection as null hypothesis in population genomics: insights and challenges from Drosophila studies. Philos Trans R Soc Lond B Biol Sci 2018; 372:rstb.2016.0471. [PMID: 29109230 PMCID: PMC5698629 DOI: 10.1098/rstb.2016.0471] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Accepted: 09/04/2017] [Indexed: 12/11/2022] Open
Abstract
The consequences of selection at linked sites are multiple and widespread across the genomes of most species. Here, I first review the main concepts behind models of selection and linkage in recombining genomes, present the difficulty in parametrizing these models simply as a reduction in effective population size (Ne) and discuss the predicted impact of recombination rates on levels of diversity across genomes. Arguments are then put forward in favour of using a model of selection and linkage with neutral and deleterious mutations (i.e. the background selection model, BGS) as a sensible null hypothesis for investigating the presence of other forms of selection, such as balancing or positive. I also describe and compare two studies that have generated high-resolution landscapes of the predicted consequences of selection at linked sites in Drosophila melanogaster. Both studies show that BGS can explain a very large fraction of the observed variation in diversity across the whole genome, thus supporting its use as null model. Finally, I identify and discuss a number of caveats and challenges in studies of genetic hitchhiking that have been often overlooked, with several of them sharing a potential bias towards overestimating the evidence supporting recent selective sweeps to the detriment of a BGS explanation. One potential source of bias is the analysis of non-equilibrium populations: it is precisely because models of selection and linkage predict variation in Ne across chromosomes that demographic dynamics are not expected to be equivalent chromosome- or genome-wide. Other challenges include the use of incomplete genome annotations, the assumption of temporally stable recombination landscapes, the presence of genes under balancing selection and the consequences of ignoring non-crossover (gene conversion) recombination events. This article is part of the themed issue ‘Evolutionary causes and consequences of recombination rate variation in sexual organisms’.
Affiliation(s)
- Josep M Comeron
- Department of Biology, University of Iowa, Iowa City, IA 52242, USA; Interdisciplinary Program in Genetics, University of Iowa, Iowa City, IA 52242, USA
|
15
|
The Effect of Strong Purifying Selection on Genetic Diversity. Genetics 2018; 209:1235-1278. [PMID: 29844134 DOI: 10.1534/genetics.118.301058] [Citation(s) in RCA: 124] [Impact Index Per Article: 20.7] [Received: 04/20/2018] [Accepted: 05/25/2018] [Indexed: 12/15/2022] Open
Abstract
Purifying selection reduces genetic diversity, both at sites under direct selection and at linked neutral sites. This process, known as background selection, is thought to play an important role in shaping genomic diversity in natural populations. Yet despite its importance, the effects of background selection are not fully understood. Previous theoretical analyses of this process have taken a backward-time approach based on the structured coalescent. While they provide some insight, these methods are either limited to very small samples or are computationally prohibitive. Here, we present a new forward-time analysis of the trajectories of both neutral and deleterious mutations at a nonrecombining locus. We find that strong purifying selection leads to remarkably rich dynamics: neutral mutations can exhibit sweep-like behavior, and deleterious mutations can reach substantial frequencies even when they are guaranteed to eventually go extinct. Our analysis of these dynamics allows us to calculate analytical expressions for the full site frequency spectrum. We find that whenever background selection is strong enough to lead to a reduction in genetic diversity, it also results in substantial distortions to the site frequency spectrum, which can mimic the effects of population expansions or positive selection. Because these distortions are most pronounced in the low and high frequency ends of the spectrum, they become particularly important in larger samples, but may have small effects in smaller samples. We also apply our forward-time framework to calculate other quantities, such as the ultimate fates of polymorphisms or the fitnesses of their ancestral backgrounds.
|
16
|
Irwin DE. Sex chromosomes and speciation in birds and other ZW systems. Mol Ecol 2018; 27:3831-3851. [PMID: 29443419 DOI: 10.1111/mec.14537] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 10/01/2017] [Revised: 02/03/2018] [Accepted: 02/06/2018] [Indexed: 01/01/2023]
Abstract
Theory and empirical patterns suggest a disproportionate role for sex chromosomes in evolution and speciation. Focusing on ZW sex determination (females ZW, males ZZ; the system in birds, many snakes, and lepidopterans), I review how evolutionary dynamics are expected to differ between the Z, W and the autosomes, discuss how these differences may lead to a greater role of the sex chromosomes in speciation and use data from birds to compare relative evolutionary rates of sex chromosomes and autosomes. Neutral mutations, partially or completely recessive beneficial mutations, and deleterious mutations under many conditions are expected to accumulate faster on the Z than on autosomes. Sexually antagonistic polymorphisms are expected to arise on the Z, raising the possibility of the spread of preference alleles. The faster accumulation of many types of mutations and the potential for complex evolutionary dynamics of sexually antagonistic traits and preferences contribute to a role for the Z chromosome in speciation. A quantitative comparison among a wide variety of bird species shows that the Z tends to have less within-population diversity and greater between-species differentiation than the autosomes, likely due to both adaptive evolution and a greater rate of fixation of deleterious alleles. The W chromosome also shows strong potential to be involved in speciation, in part because of its co-inheritance with the mitochondrial genome. While theory and empirical evidence suggest a disproportionate role for sex chromosomes in speciation, the importance of sex chromosomes is moderated by their small size compared to the whole genome.
Affiliation(s)
- Darren E Irwin
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
|
17
|
Burri R. Interpreting differentiation landscapes in the light of long-term linked selection. Evol Lett 2017. [DOI: 10.1002/evl3.14] [Citation(s) in RCA: 116] [Impact Index Per Article: 16.6] [Indexed: 12/18/2022] Open
Affiliation(s)
- Reto Burri
- Department of Population Ecology; Friedrich Schiller University Jena; Dornburger Strasse 159 D-07743 Jena Germany
|
18
|
Inference of the Distribution of Selection Coefficients for New Nonsynonymous Mutations Using Large Samples. Genetics 2017; 206:345-361. [PMID: 28249985 PMCID: PMC5419480 DOI: 10.1534/genetics.116.197145] [Citation(s) in RCA: 115] [Impact Index Per Article: 16.4] [Received: 10/23/2016] [Accepted: 02/14/2017] [Indexed: 12/23/2022] Open
Abstract
The distribution of fitness effects (DFE) has considerable importance in population genetics. To date, estimates of the DFE come from studies using a small number of individuals. Thus, estimates of the proportion of moderately to strongly deleterious new mutations may be unreliable because such variants are unlikely to be segregating in the data. Additionally, the true functional form of the DFE is unknown, and estimates of the DFE differ significantly between studies. Here we present a flexible and computationally tractable method, called Fit∂a∂i, to estimate the DFE of new mutations using the site frequency spectrum from a large number of individuals. We apply our approach to the frequency spectrum of 1300 Europeans from the Exome Sequencing Project ESP6400 data set, 1298 Danes from the LuCamp data set, and 432 Europeans from the 1000 Genomes Project to estimate the DFE of deleterious nonsynonymous mutations. We infer significantly fewer (0.38-0.84 fold) strongly deleterious mutations with selection coefficient |s| > 0.01 and more (1.24-1.43 fold) weakly deleterious mutations with selection coefficient |s| < 0.001 compared to previous estimates. Furthermore, a DFE that is a mixture distribution of a point mass at neutrality plus a gamma distribution fits better than a gamma distribution in two of the three data sets. Our results suggest that nearly neutral forces play a larger role in human evolution than previously thought.
|
19
|
Charlesworth et al. on Background Selection and Neutral Diversity. Genetics 2017; 204:829-832. [PMID: 28114095 DOI: 10.1534/genetics.116.196170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
|
20
|
Sugden LA, Ramachandran S. Integrating the signatures of demic expansion and archaic introgression in studies of human population genomics. Curr Opin Genet Dev 2016; 41:140-149. [PMID: 27743539 DOI: 10.1016/j.gde.2016.09.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/28/2016] [Revised: 09/19/2016] [Accepted: 09/23/2016] [Indexed: 12/12/2022]
Abstract
Human population genomic studies have repeatedly observed a decrease in heterozygosity and an increase in linkage disequilibrium with geographic distance from Africa. While multiple demographic models can generate these patterns, many studies invoke the serial founder effect model, in which populations expand from a single origin and each new population's founders represent a subset of genetic variation in the previous population. The model assumes no admixture with archaic hominins; however, recent studies have identified loci in Homo sapiens bearing signatures of archaic introgression. These results appear to contradict the validity of analyses invoking the serial founder effect model, but we show these two perspectives are compatible. We also propose using the serial founder effect model as a null model for determining the signature of archaic admixture in modern human genomes at different geographic and genomic scales.
Affiliation(s)
- Lauren Alpert Sugden
- Center for Computational Molecular Biology, Brown University, Providence, RI, USA; Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA
- Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, RI, USA; Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA
|
21
|
On the importance of skewed offspring distributions and background selection in virus population genetics. Heredity (Edinb) 2016; 117:393-399. [PMID: 27649621 DOI: 10.1038/hdy.2016.58] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 04/12/2016] [Accepted: 06/08/2016] [Indexed: 12/16/2022] Open
Abstract
Many features of virus populations make them excellent candidates for population genetic study, including a very high rate of mutation, high levels of nucleotide diversity, exceptionally large census population sizes, and frequent positive selection. However, these attributes also mean that special care must be taken in population genetic inference. For example, highly skewed offspring distributions, frequent and severe population bottleneck events associated with infection and compartmentalization, and strong purifying selection all affect the distribution of genetic variation but are often not taken into account. Here, we draw particular attention to multiple-merger coalescent events and background selection, discuss potential misinference associated with these processes, and highlight potential avenues for better incorporating them into future population genetic analyses.
|
22
|
Irwin DE, Alcaide M, Delmore KE, Irwin JH, Owens GL. Recurrent selection explains parallel evolution of genomic regions of high relative but low absolute differentiation in a ring species. Mol Ecol 2016; 25:4488-507. [PMID: 27484941 DOI: 10.1111/mec.13792] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 04/20/2016] [Revised: 07/21/2016] [Accepted: 07/25/2016] [Indexed: 12/13/2022]
Abstract
Recent technological developments allow investigation of the repeatability of evolution at the genomic level. Such investigation is particularly powerful when applied to a ring species, in which spatial variation represents changes during the evolution of two species from one. We examined genomic variation among three subspecies of the greenish warbler ring species, using genotypes at 13 013 950 nucleotide sites along a new greenish warbler consensus genome assembly. Genomic regions of low within-group variation are remarkably consistent between the three populations. These regions show high relative differentiation but low absolute differentiation between populations. Comparisons with outgroup species show the locations of these peaks of relative differentiation are not well explained by phylogenetically conserved variation in recombination rates or selection. These patterns are consistent with a model in which selection in an ancestral form has reduced variation at some parts of the genome, and those same regions experience recurrent selection that subsequently reduces variation within each subspecies. The degree of heterogeneity in nucleotide diversity is greater than explained by models of background selection, but is consistent with selective sweeps. Given the evidence that greenish warblers have had both population differentiation for a long period of time and periods of gene flow between those populations, we propose that some genomic regions underwent selective sweeps over a broad geographic area followed by within-population selection-induced reductions in variation. An important implication of this 'sweep-before-differentiation' model is that genomic regions of high relative differentiation may have moved among populations more recently than other genomic regions.
Affiliation(s)
- Darren E Irwin
- Department of Zoology and Biodiversity Research Center, University of British Columbia, 6270 University Blvd., Vancouver, BC, V6T 1Z4, Canada
- Miguel Alcaide
- Department of Zoology and Biodiversity Research Center, University of British Columbia, 6270 University Blvd., Vancouver, BC, V6T 1Z4, Canada
- Kira E Delmore
- Department of Zoology and Biodiversity Research Center, University of British Columbia, 6270 University Blvd., Vancouver, BC, V6T 1Z4, Canada
- Jessica H Irwin
- Department of Zoology and Biodiversity Research Center, University of British Columbia, 6270 University Blvd., Vancouver, BC, V6T 1Z4, Canada
- Gregory L Owens
- Department of Zoology and Biodiversity Research Center, University of British Columbia, 6270 University Blvd., Vancouver, BC, V6T 1Z4, Canada
|
23
|
Agrawal AF, Hartfield M. Coalescence with Background and Balancing Selection in Systems with Bi- and Uniparental Reproduction: Contrasting Partial Asexuality and Selfing. Genetics 2016; 202:313-26. [PMID: 26584901 PMCID: PMC4701095 DOI: 10.1534/genetics.115.181024] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 07/21/2015] [Accepted: 11/13/2015] [Indexed: 11/18/2022] Open
Abstract
Uniparental reproduction in diploids, via asexual reproduction or selfing, reduces the independence with which separate loci are transmitted across generations. This is expected to increase the extent to which a neutral marker is affected by selection elsewhere in the genome. Such effects have previously been quantified in coalescent models involving selfing. Here we examine the effects of background selection and balancing selection in diploids capable of both sexual and asexual reproduction (i.e., partial asexuality). We find that the effect of background selection on reducing coalescent time (and effective population size) can be orders of magnitude greater when rates of sex are low than when sex is common. This is because asexuality enhances the effects of background selection through both a recombination effect and a segregation effect. We show that there are several reasons that the strength of background selection differs between systems with partial asexuality and those with comparable levels of uniparental reproduction via selfing. Expectations for reductions in Ne via background selection have been verified using stochastic simulations. In contrast to background selection, balancing selection increases the coalescence time for a linked neutral site. With partial asexuality, the effect of balancing selection is somewhat dependent upon the mode of selection (e.g., heterozygote advantage vs. negative frequency-dependent selection) in a manner that does not apply to selfing. This is because the frequency of heterozygotes, which are required for recombination onto alternative genetic backgrounds, is more dependent on the pattern of selection with partial asexuality than with selfing.
Affiliation(s)
- Aneil F Agrawal
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3G5, Canada
- Matthew Hartfield
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3G5, Canada; Bioinformatics Research Centre, University of Aarhus, 8000C Aarhus, Denmark
|
24
|
Weakly Deleterious Mutations and Low Rates of Recombination Limit the Impact of Natural Selection on Bacterial Genomes. mBio 2015; 6:e01302-15. [PMID: 26670382 PMCID: PMC4701828 DOI: 10.1128/mbio.01302-15] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 11/20/2022] Open
Abstract
Free-living bacteria are usually thought to have large effective population sizes, and so tiny selective differences can drive their evolution. However, because recombination is infrequent, “background selection” against slightly deleterious alleles should reduce the effective population size (Ne) by orders of magnitude. For example, for a well-mixed population with 10^12 individuals and a typical level of homologous recombination (r/m = 3, i.e., nucleotide changes due to recombination [r] occur at 3 times the mutation rate [m]), we predict that Ne is <10^7. An argument for high Ne values for bacteria has been the high genetic diversity within many bacterial “species,” but this diversity may be due to population structure: diversity across subpopulations can be far higher than diversity within a subpopulation, which makes it difficult to estimate Ne correctly. Given an estimate of Ne, standard population genetics models imply that selection should be sufficient to drive evolution if Ne × s is >1, where s is the selection coefficient. We found that this remains approximately correct if background selection is occurring or when population structure is present. Overall, we predict that even for free-living bacteria with enormous populations, natural selection is only a significant force if s is above 10^−7 or so. Because bacteria form huge populations with trillions of individuals, the simplest theoretical prediction is that the better allele at a site would predominate even if its advantage was just 10^−9 per generation. In other words, virtually every nucleotide would be at the local optimum in most individuals. A more sophisticated theory considers that bacterial genomes have millions of sites each and selection events on these many sites could interfere with each other, so that only larger effects would be important. However, bacteria can exchange genetic material, and in principle, this exchange could eliminate the interference between the evolution of the sites. We used simulations to confirm that during multisite evolution with realistic levels of recombination, only larger effects are important. We propose that advantages of less than 10^−7 are effectively neutral.
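The threshold arithmetic in this abstract can be checked with a few lines of Python. This is only an illustrative sketch, not code from the cited study: the function names are hypothetical, and the only inputs taken from the abstract are the rule that selection is effective when Ne × s exceeds 1, the census size of 10^12, and the predicted upper bound of 10^7 on Ne.

```python
# Illustrative sketch; function names are hypothetical, not from the cited paper.

def selection_is_effective(ne: float, s: float) -> bool:
    """Apply the rule quoted in the abstract: selection dominates drift if Ne * s > 1."""
    return ne * s > 1.0


def min_effective_s(ne: float) -> float:
    """Smallest selection coefficient that satisfies Ne * s > 1 (i.e. s = 1 / Ne)."""
    return 1.0 / ne


CENSUS_SIZE = 1e12     # well-mixed population size used in the abstract's example
NE_UPPER_BOUND = 1e7   # abstract's predicted upper bound on Ne under background selection

if __name__ == "__main__":
    # Naive expectation: treating the census size as Ne, an advantage of 1e-9 is visible.
    print(selection_is_effective(CENSUS_SIZE, 1e-9))
    # With Ne reduced to about 1e7, the same advantage is effectively neutral.
    print(selection_is_effective(NE_UPPER_BOUND, 1e-9))
    # Threshold implied by the reduced Ne.
    print(min_effective_s(NE_UPPER_BOUND))
```

Because the abstract gives Ne only as an upper bound, the threshold computed here is a lower bound on the smallest advantage that selection can act on.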
|
25
|
Ewing GB, Jensen JD. The consequences of not accounting for background selection in demographic inference. Mol Ecol 2015; 25:135-41. [PMID: 26394805 DOI: 10.1111/mec.13390] [Citation(s) in RCA: 109] [Impact Index Per Article: 12.1] [Received: 05/07/2015] [Revised: 08/05/2015] [Accepted: 08/25/2015] [Indexed: 12/11/2022]
Abstract
Recently, there has been increased awareness of the role of background selection (BGS) in both data analysis and modelling advances. However, BGS is still difficult to take into account because of tractability issues with simulations and difficulty with nonequilibrium demographic models. Often, simple rescaling adjustments of effective population size are used. However, there has been neither a proper characterization of how BGS could bias or shift inference when not properly taken into account, nor a thorough analysis of whether rescaling is a sufficient solution. Here, we carry out extensive simulations with BGS to determine biases and behaviour of demographic inference using an approximate Bayesian approach. We find that results can be positively misleading with significant bias, and describe the parameter space in which BGS models replicate observed neutral nonequilibrium expectations.
Affiliation(s)
- Gregory B Ewing
- Ecole Polytechnique Fédérale de Lausanne (EPFL), EPFL SV IBI-SV UPJENSEN, AAB 0 46, Station 15, CH 1015, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), EPFL SV IBI-SV UPJENSEN, AAB 0 46, Station 15, CH 1015, Lausanne, Switzerland
- Jeffrey D Jensen
- Ecole Polytechnique Fédérale de Lausanne (EPFL), EPFL SV IBI-SV UPJENSEN, AAB 0 46, Station 15, CH 1015, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), EPFL SV IBI-SV UPJENSEN, AAB 0 46, Station 15, CH 1015, Lausanne, Switzerland
|
26
|
The Effects of Background and Interference Selection on Patterns of Genetic Variation in Subdivided Populations. Genetics 2015; 201:1539-54. [PMID: 26434720 DOI: 10.1534/genetics.115.178558] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 06/04/2015] [Accepted: 09/24/2015] [Indexed: 11/18/2022] Open
Abstract
It is well known that most new mutations that affect fitness exert deleterious effects and that natural populations are often composed of subpopulations (demes) connected by gene flow. To gain a better understanding of the joint effects of purifying selection and population structure, we focus on a scenario where an ancestral population splits into multiple demes and study neutral diversity patterns in regions linked to selected sites. In the background selection regime of strong selection, we first derive analytic equations for pairwise coalescent times and FST as a function of time after the ancestral population splits into two demes and then construct a flexible coalescent simulator that can generate samples under complex models such as those involving multiple demes or nonconservative migration. We have carried out extensive forward simulations to show that the new methods can accurately predict diversity patterns both in the nonequilibrium phase following the split of the ancestral population and in the equilibrium between mutation, migration, drift, and selection. In the interference selection regime of many tightly linked selected sites, forward simulations provide evidence that neutral diversity patterns obtained from both the nonequilibrium and equilibrium phases may be virtually indistinguishable for models that have identical variance in fitness, but are nonetheless different with respect to the number of selected sites and the strength of purifying selection. This equivalence in neutral diversity patterns suggests that data collected from subdivided populations may have limited power for differentiating among the selective pressures to which closely linked selected sites are subject.
|
27
|
Huber CD, DeGiorgio M, Hellmann I, Nielsen R. Detecting recent selective sweeps while controlling for mutation rate and background selection. Mol Ecol 2015; 25:142-56. [PMID: 26290347 PMCID: PMC5082542 DOI: 10.1111/mec.13351] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Received: 03/31/2015] [Revised: 07/31/2015] [Accepted: 08/17/2015] [Indexed: 12/19/2022]
Abstract
A composite likelihood ratio test implemented in the program sweepfinder is a commonly used method for scanning a genome for recent selective sweeps. sweepfinder uses information on the spatial pattern (along the chromosome) of the site frequency spectrum around the selected locus. To avoid confounding effects of background selection and variation in the mutation process along the genome, the method is typically applied only to sites that are variable within species. However, the power to detect and localize selective sweeps can be greatly improved if invariable sites are also included in the analysis. In the spirit of a Hudson–Kreitman–Aguadé test, we suggest adding fixed differences relative to an out‐group to account for variation in mutation rate, thereby facilitating more robust and powerful analyses. We also develop a method for including background selection, modelled as a local reduction in the effective population size. Using simulations, we show that these advances lead to a gain in power while maintaining robustness to mutation rate variation. Furthermore, the new method also provides more precise localization of the causative mutation than methods using the spatial pattern of segregating sites alone.
Affiliation(s)
- Christian D Huber
- Max F. Perutz Laboratory, University of Vienna, Vienna, Austria; Vienna Graduate School of Population Genetics, University of Veterinary Medicine, Vienna, Austria; Department of Ecology and Evolutionary Biology, University of California, Los Angeles, 621 Charles E. Young Drive South, Los Angeles, CA, 90095-1606, USA
- Michael DeGiorgio
- Departments of Biology and Statistics, Pennsylvania State University, University Park, PA, USA; Institute for CyberScience, Pennsylvania State University, University Park, PA, USA
- Ines Hellmann
- Department Biologie II, Ludwig-Maximilians-Universität München, Großhaderner Str. 2, 82152, Planegg-Martinsried, Germany
- Rasmus Nielsen
- Departments of Integrative Biology and Statistics, University of California, Berkeley, CA, USA
|
28
|
Uricchio LH, Torres R, Witte JS, Hernandez RD. Population genetic simulations of complex phenotypes with implications for rare variant association tests. Genet Epidemiol 2014; 39:35-44. [PMID: 25417809 DOI: 10.1002/gepi.21866] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 07/17/2014] [Revised: 09/09/2014] [Accepted: 09/26/2014] [Indexed: 12/12/2022]
Abstract
Demographic events and natural selection alter patterns of genetic variation within populations and may play a substantial role in shaping the genetic architecture of complex phenotypes and disease. However, the joint impact of these basic evolutionary forces is often ignored in the assessment of statistical tests of association. Here, we provide a simulation-based framework for generating DNA sequences that incorporates selection and demography with flexible models for simulating phenotypic variation (sfs_coder). This tool also allows the user to perform locus-specific simulations by automatically querying annotated genomic functional elements and genetic maps. We demonstrate the effects of evolutionary forces on patterns of genetic variation by simulating recently inferred models of human selection and demography. We use these simulations to show that the demographic model and locus-specific features, such as the proportion of sites under selection, may have practical implications for estimating the statistical power of sequencing-based rare variant association tests. In particular, for some phenotype models, there may be higher power to detect rare variant associations in African populations compared to non-Africans, but power is considerably reduced in regions of the genome with rampant negative selection. Furthermore, we show that existing methods for simulating large samples based on resampling from a small set of observed haplotypes fail to recapitulate the distribution of rare variants in the presence of rapid population growth (as has been observed in several human populations).
Affiliation(s)
- Lawrence H Uricchio
- Graduate Program in Bioinformatics, University of California, San Francisco, California, United States of America
|
29
|
Bank C, Ewing GB, Ferrer-Admettla A, Foll M, Jensen JD. Thinking too positive? Revisiting current methods of population genetic selection inference. Trends Genet 2014; 30:540-6. [PMID: 25438719 DOI: 10.1016/j.tig.2014.09.010] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Received: 08/08/2014] [Revised: 09/19/2014] [Accepted: 09/23/2014] [Indexed: 02/03/2023]
Abstract
In the age of next-generation sequencing, the availability of increasing amounts and improved quality of data at decreasing cost ought to allow for a better understanding of how natural selection is shaping the genome than ever before. However, alternative forces, such as demography and background selection (BGS), obscure the footprints of positive selection that we would like to identify. In this review, we illustrate recent developments in this area, and outline a roadmap for improved selection inference. We argue (i) that the development and obligatory use of advanced simulation tools is necessary for improved identification of selected loci, (ii) that genomic information from multiple time points will enhance the power of inference, and (iii) that results from experimental evolution should be utilized to better inform population genomic studies.
Affiliation(s)
- Claudia Bank
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
- Gregory B Ewing
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
- Anna Ferrer-Admettla
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland; Department of Biology and Biochemistry, University of Fribourg, 1700 Fribourg, Switzerland
- Matthieu Foll
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
- Jeffrey D Jensen
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
|
30
|
Jackson BC, Campos JL, Zeng K. The effects of purifying selection on patterns of genetic differentiation between Drosophila melanogaster populations. Heredity (Edinb) 2014; 114:163-74. [PMID: 25227256 PMCID: PMC4270736 DOI: 10.1038/hdy.2014.80] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 02/21/2014] [Revised: 06/16/2014] [Accepted: 07/22/2014] [Indexed: 01/21/2023] Open
Abstract
Using the data provided by the Drosophila Population Genomics Project, we investigate factors that affect the genetic differentiation between Rwandan and French populations of D. melanogaster. By examining within-population polymorphisms, we show that sites in long introns (especially those >2000 bp) have significantly lower π (nucleotide diversity) and more low-frequency variants (as measured by Tajima's D, minor allele frequencies, and prevalence of variants that are private to one of the two populations) than short introns, suggesting a positive relationship between intron length and selective constraint. A similar analysis of protein-coding polymorphisms shows that 0-fold (degenerate) sites in more conserved genes are under stronger purifying selection than those in less conserved genes. There is limited evidence that selection on codon bias has an effect on differentiation (as measured by FST) at 4-fold (degenerate) sites, and 4-fold sites and sites in 8–30 bp of short introns ⩽65 bp have comparable FST values. Consistent with the expected effect of purifying selection, sites in long introns and 0-fold sites in conserved genes are less differentiated than those in short introns and less conserved genes, respectively. Genes in non-crossover regions (for example, the fourth chromosome) have very high FST values at both 0-fold and 4-fold degenerate sites, which is probably because of the large reduction in within-population diversity caused by tight linkage between many selected sites. Our analyses also reveal subtle statistical properties of FST, which arise when information from multiple single nucleotide polymorphisms is combined and can lead to the masking of important signals of selection.
Affiliation(s)
- B C Jackson
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
- J L Campos
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- K Zeng
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
|
31
|
Evans BJ, Zeng K, Esselstyn JA, Charlesworth B, Melnick DJ. Reduced representation genome sequencing suggests low diversity on the sex chromosomes of tonkean macaque monkeys. Mol Biol Evol 2014; 31:2425-40. [PMID: 24987106 DOI: 10.1093/molbev/msu197] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/15/2022] Open
Abstract
In species with separate sexes, social systems can differ in the relative variances of male versus female reproductive success. Papionin monkeys (macaques, mangabeys, mandrills, drills, baboons, and geladas) exhibit hallmarks of a high variance in male reproductive success, including a female-biased adult sex ratio and prominent sexual dimorphism. To explore the potential genomic consequences of such sex differences, we used a reduced representation genome sequencing approach to quantifying polymorphism at sites on autosomes and sex chromosomes of the tonkean macaque (Macaca tonkeana), a species endemic to the Indonesian island of Sulawesi. The ratio of nucleotide diversity of the X chromosome to that of the autosomes was less than the value (0.75) expected with a 1:1 sex ratio and no sex differences in the variance in reproductive success. However, the significance of this difference was dependent on which outgroup was used to standardize diversity levels. Using a new model that includes the effects of varying population size, sex differences in mutation rate between the autosomes and X chromosome, and GC-biased gene conversion (gBGC) or selection on GC content, we found that the maximum-likelihood estimate of the ratio of effective population size of the X chromosome to that of the autosomes was 0.68, which did not differ significantly from 0.75. We also found evidence for 1) a higher level of purifying selection on genic than nongenic regions, 2) gBGC or natural selection favoring increased GC content, 3) a dynamic demography characterized by population growth and contraction, 4) a higher mutation rate in males than females, and 5) a very low polymorphism level on the Y chromosome. These findings shed light on the population genomic consequences of sex differences in the variance in reproductive success, which appear to be modest in the tonkean macaque; they also suggest the occurrence of hitchhiking on the Y chromosome.
Affiliation(s)
- Ben J Evans
  - Biology Department, McMaster University, Hamilton, ON, Canada
- Kai Zeng
  - Department of Animal and Plant Sciences, Alfred Denny Building, University of Sheffield, Sheffield, United Kingdom
- Jacob A Esselstyn
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Don J Melnick
  - Department of Ecology, Evolution, and Environmental Biology, Columbia University
|
32
|
Weissman DB, Hallatschek O. The rate of adaptation in large sexual populations with linear chromosomes. Genetics 2014; 196:1167-83. [PMID: 24429280 PMCID: PMC3982688 DOI: 10.1534/genetics.113.160705] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 12/16/2013] [Accepted: 01/11/2014] [Indexed: 12/21/2022] Open
Abstract
In large populations, multiple beneficial mutations may be simultaneously spreading. In asexual populations, these mutations must either arise on the same background or compete against each other. In sexual populations, recombination can bring together beneficial alleles from different backgrounds, but tightly linked alleles may still greatly interfere with each other. We show for well-mixed populations that when this interference is strong, the genome can be seen as consisting of many effectively asexual stretches linked together. The rate at which beneficial alleles fix is thus roughly proportional to the rate of recombination and depends only logarithmically on the mutation supply and the strength of selection. Our scaling arguments also allow us to predict, with reasonable accuracy, the fitness distribution of fixed mutations when the mutational effect sizes are broad. We focus on the regime in which crossovers occur more frequently than beneficial mutations, as is likely to be the case for many natural populations.
Affiliation(s)
- Daniel B. Weissman
  - Institute of Science and Technology Austria, A-3400 Klosterneuburg, Austria
  - Simons Institute for the Theory of Computing, University of California, Berkeley, California, 94720
- Oskar Hallatschek
  - Department of Physics, University of California, Berkeley, California, 94720
|
33
|
Good BH, Walczak AM, Neher RA, Desai MM. Genetic diversity in the interference selection limit. PLoS Genet 2014; 10:e1004222. [PMID: 24675740 PMCID: PMC3967937 DOI: 10.1371/journal.pgen.1004222] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Received: 08/22/2013] [Accepted: 01/22/2014] [Indexed: 01/23/2023] Open
Abstract
Pervasive natural selection can strongly influence observed patterns of genetic variation, but these effects remain poorly understood when multiple selected variants segregate in nearby regions of the genome. Classical population genetics fails to account for interference between linked mutations, which grows increasingly severe as the density of selected polymorphisms increases. Here, we describe a simple limit that emerges when interference is common, in which the fitness effects of individual mutations play a relatively minor role. Instead, similar to models of quantitative genetics, molecular evolution is determined by the variance in fitness within the population, defined over an effectively asexual segment of the genome (a "linkage block"). We exploit this insensitivity in a new "coarse-grained" coalescent framework, which approximates the effects of many weakly selected mutations with a smaller number of strongly selected mutations that create the same variance in fitness. This approximation generates accurate and efficient predictions for silent site variability when interference is common. However, these results suggest that there is reduced power to resolve individual selection pressures when interference is sufficiently widespread, since a broad range of parameters possess nearly identical patterns of silent site variability.
Affiliation(s)
- Benjamin H. Good
  - Departments of Organismic and Evolutionary Biology and of Physics, Harvard University, Cambridge, Massachusetts, United States of America
  - FAS Center for Systems Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Richard A. Neher
  - Max Planck Institute for Developmental Biology, Tübingen, Germany
- Michael M. Desai
  - Departments of Organismic and Evolutionary Biology and of Physics, Harvard University, Cambridge, Massachusetts, United States of America
  - FAS Center for Systems Biology, Harvard University, Cambridge, Massachusetts, United States of America
|
34
|
Abstract
Purifying selection at many linked sites alters patterns of molecular evolution, reducing overall diversity and distorting the shapes of genealogies. Recombination attenuates these effects; however, purifying selection can significantly distort genealogies even for substantial recombination rates. Here, we show that when selection and/or recombination are sufficiently strong, the genealogy at any single site can be described by a time-dependent effective population size, Ne(t), which has a simple analytic form. Our results illustrate how recombination reduces distortions in genealogies and allow us to quantitatively describe the shapes of genealogies in the presence of strong purifying selection and recombination. We also analyze the effects of a distribution of selection coefficients across the genome.
|
35
|
Genomic signatures of selection at linked sites: unifying the disparity among species. Nat Rev Genet 2013; 14:262-74. [PMID: 23478346 DOI: 10.1038/nrg3425] [Citation(s) in RCA: 315] [Impact Index Per Article: 28.6] [Indexed: 12/17/2022]
Abstract
Population genetics theory supplies powerful predictions about how natural selection interacts with genetic linkage to sculpt the genomic landscape of nucleotide polymorphism. Both the spread of beneficial mutations and the removal of deleterious mutations act to depress polymorphism levels, especially in low-recombination regions. However, empiricists have documented extreme disparities among species. Here we characterize the dominant features that could drive differences in linked selection among species (including roles for selective sweeps being 'hard' or 'soft') and the concealing effects of demography and confounding genomic variables. We advocate targeted studies of closely related species to unify our understanding of how selection and linkage interact to shape genome evolution.
|
36
|
Good BH, Desai MM. Fluctuations in fitness distributions and the effects of weak linked selection on sequence evolution. Theor Popul Biol 2013; 85:86-102. [PMID: 23337315 DOI: 10.1016/j.tpb.2013.01.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 11/30/2012] [Revised: 01/02/2013] [Accepted: 01/11/2013] [Indexed: 02/02/2023]
Abstract
Evolutionary dynamics and patterns of molecular evolution are strongly influenced by selection on linked regions of the genome, but our quantitative understanding of these effects remains incomplete. Recent work has focused on predicting the distribution of fitness within an evolving population, and this forms the basis for several methods that leverage the fitness distribution to predict the patterns of genetic diversity when selection is strong. However, in weakly selected populations random fluctuations due to genetic drift are more severe, and neither the distribution of fitness nor the sequence diversity within the population are well understood. Here, we briefly review the motivations behind the fitness-distribution picture, and summarize the general approaches that have been used to analyze this distribution in the strong-selection regime. We then extend these approaches to the case of weak selection, by outlining a perturbative treatment of selection at a large number of linked sites. This allows us to quantify the stochastic behavior of the fitness distribution and yields exact analytical predictions for the sequence diversity and substitution rate in the limit that selection is weak.
Affiliation(s)
- Benjamin H Good
  - Department of Organismic and Evolutionary Biology, Department of Physics, and FAS Center for Systems Biology, Harvard University, United States
|
37
|
|
38
|
Pool JE, Corbett-Detig RB, Sugino RP, Stevens KA, Cardeno CM, Crepeau MW, Duchen P, Emerson JJ, Saelao P, Begun DJ, Langley CH. Population genomics of sub-Saharan Drosophila melanogaster: African diversity and non-African admixture. PLoS Genet 2012; 8:e1003080. [PMID: 23284287 PMCID: PMC3527209 DOI: 10.1371/journal.pgen.1003080] [Citation(s) in RCA: 229] [Impact Index Per Article: 19.1] [Received: 05/04/2012] [Accepted: 09/27/2012] [Indexed: 11/25/2022] Open
Abstract
Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. 
Outliers for Europe-Africa FST were found to be enriched in genomic regions of locally elevated cosmopolitan admixture, possibly reflecting a role for some of these loci in driving the introgression of non-African alleles into African populations.
Affiliation(s)
- John E Pool
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, USA.
|
39
|
A coalescent model of background selection with recombination, demography and variation in selection coefficients. Heredity (Edinb) 2012. [PMID: 23188176 DOI: 10.1038/hdy.2012.102] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 11/09/2022] Open
Abstract
There is increasing evidence that background selection, the effects of the elimination of recurring deleterious mutations by natural selection on variability at linked sites, may be a major factor shaping genome-wide patterns of genetic diversity. To accurately quantify the importance of background selection, it is vital to have computationally efficient models that include essential biological features. To this end, a structured coalescent procedure is used to construct a model of background selection that takes into account the effects of recombination, recent changes in population size and variation in selection coefficients against deleterious mutations across sites. Furthermore, this model allows a flexible organization of selected and neutral sites in the region concerned, and has the ability to generate sequence variability at both selected and neutral sites, allowing the correlation between these two types of sites to be studied. The accuracy of the model is verified by checking against the results of forward simulations. These simulations also reveal several patterns of diversity that are in qualitative agreement with observations reported in recent studies of DNA sequence polymorphisms. These results suggest that the model should be useful for data analysis.
|
40
|
Abstract
In the last few years, two paradigms underlying human evolution have crumbled. Modern humans have not totally replaced previous hominins without any admixture, and the expected signatures of adaptations to new environments are surprisingly lacking at the genomic level. Here we review current evidence about archaic admixture and lack of strong selective sweeps in humans. We underline the need to properly model differential admixture in various populations to correctly reconstruct past demography. We also stress the importance of taking into account the spatial dimension of human evolution, which proceeded by a series of range expansions that could have promoted both the introgression of archaic genes and background selection.
Affiliation(s)
- Isabel Alves
  - CMPG, Institute of Ecology and Evolution, Berne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Population and Conservation Genetics Group, Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Anna Šrámková Hanulová
  - CMPG, Institute of Ecology and Evolution, Berne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Matthieu Foll
  - CMPG, Institute of Ecology and Evolution, Berne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Laurent Excoffier
  - CMPG, Institute of Ecology and Evolution, Berne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
41
|
Impact of sampling schemes on demographic inference: an empirical study in two species with different mating systems and demographic histories. G3-GENES GENOMES GENETICS 2012; 2:803-14. [PMID: 22870403 PMCID: PMC3385986 DOI: 10.1534/g3.112.002410] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/29/2012] [Accepted: 05/10/2012] [Indexed: 12/12/2022]
Abstract
Most species have at least some level of genetic structure. Recent simulation studies have shown that it is important to consider population structure when sampling individuals to infer past population history. The relevance of the results of these computer simulations for empirical studies, however, remains unclear. In the present study, we use DNA sequence datasets collected from two closely related species with very different histories, the selfing species Capsella rubella and its outcrossing relative C. grandiflora, to assess the impact of different sampling strategies on summary statistics and the inference of historical demography. Sampling strategy did not strongly influence the mean values of Tajima's D in either species, but it had some impact on the variance. The general conclusions about demographic history were comparable across sampling schemes even when resampled data were analyzed with approximate Bayesian computation (ABC). We used simulations to explore the effects of sampling scheme under different demographic models. We conclude that when sequences from modest numbers of loci (<60) are analyzed, the sampling strategy is generally of limited importance. The same is true under intermediate or high levels of gene flow (4Nm > 2-10) in models in which global expansion is combined with either local expansion or hierarchical population structure. Although we observe a less severe effect of sampling than predicted under some earlier simulation models, our results should not be seen as an encouragement to neglect this issue. In general, a good coverage of the natural range, both within and between populations, will be needed to obtain a reliable reconstruction of a species's demographic history, and in fact, the effect of sampling scheme on polymorphism patterns may itself provide important information about demographic history.
|
42
|
The role of background selection in shaping patterns of molecular evolution and variation: evidence from variability on the Drosophila X chromosome. Genetics 2012; 191:233-46. [PMID: 22377629 DOI: 10.1534/genetics.111.138073] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.8] [Indexed: 11/18/2022] Open
Abstract
In the putatively ancestral population of Drosophila melanogaster, the ratio of silent DNA sequence diversity for X-linked loci to that for autosomal loci is approximately one, instead of the expected "null" value of 3/4. One possible explanation is that background selection (the hitchhiking effect of deleterious mutations) is more effective on the autosomes than on the X chromosome, because of the lack of crossing over in male Drosophila. The expected effects of background selection on neutral variability at sites in the middle of an X chromosome or an autosomal arm were calculated for different models of chromosome organization and methods of approximation, using current estimates of the deleterious mutation rate and distributions of the fitness effects of deleterious mutations. The robustness of the results to different distributions of fitness effects, dominance coefficients, mutation rates, mapping functions, and chromosome size was investigated. The predicted ratio of X-linked to autosomal variability is relatively insensitive to these variables, except for the mutation rate and map length. Provided that the deleterious mutation rate per genome is sufficiently large, it seems likely that background selection can account for the observed X to autosome ratio of variability in the ancestral population of D. melanogaster. The fact that this ratio is much less than one in D. pseudoobscura is also consistent with the model's predictions, since this species has a high rate of crossing over. The results suggest that background selection may play a major role in shaping patterns of molecular evolution and variation.
|
43
|
Charlesworth B. The effects of deleterious mutations on evolution at linked sites. Genetics 2012; 190:5-22. [PMID: 22219506 PMCID: PMC3249359 DOI: 10.1534/genetics.111.134288] [Citation(s) in RCA: 215] [Impact Index Per Article: 17.9] [Received: 08/26/2011] [Accepted: 11/04/2011] [Indexed: 01/14/2023] Open
Abstract
The process of evolution at a given site in the genome can be influenced by the action of selection at other sites, especially when these are closely linked to it. Such selection reduces the effective population size experienced by the site in question (the Hill-Robertson effect), reducing the level of variability and the efficacy of selection. In particular, deleterious variants are continually being produced by mutation and then eliminated by selection at sites throughout the genome. The resulting reduction in variability at linked neutral or nearly neutral sites can be predicted from the theory of background selection, which assumes that deleterious mutations have such large effects that their behavior in the population is effectively deterministic. More weakly selected mutations can accumulate by Muller's ratchet after a shutdown of recombination, as in an evolving Y chromosome. Many functionally significant sites are probably so weakly selected that Hill-Robertson interference undermines the effective strength of selection upon them, when recombination is rare or absent. This leads to large departures from deterministic equilibrium and smaller effects on linked neutral sites than under background selection or Muller's ratchet. Evidence is discussed that is consistent with the action of these processes in shaping genome-wide patterns of variation and evolution.
Affiliation(s)
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
|
44
|
The structure of genealogies in the presence of purifying selection: a fitness-class coalescent. Genetics 2011; 190:753-79. [PMID: 22135349 DOI: 10.1534/genetics.111.134544] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 11/18/2022] Open
Abstract
Compared to a neutral model, purifying selection distorts the structure of genealogies and hence alters the patterns of sampled genetic variation. Although these distortions may be common in nature, our understanding of how we expect purifying selection to affect patterns of molecular variation remains incomplete. Genealogical approaches such as coalescent theory have proven difficult to generalize to situations involving selection at many linked sites, unless selection pressures are extremely strong. Here, we introduce an effective coalescent theory (a "fitness-class coalescent") to describe the structure of genealogies in the presence of purifying selection at many linked sites. We use this effective theory to calculate several simple statistics describing the expected patterns of variation in sequence data, both at the sites under selection and at linked neutral sites. Our analysis combines a description of the allele frequency spectrum in the presence of purifying selection with the structured coalescent approach of Kaplan et al. (1988), to trace the ancestry of individuals through the distribution of fitnesses within the population. We also derive our results using a more direct extension of the structured coalescent approach of Hudson and Kaplan (1994). We find that purifying selection leads to patterns of genetic variation that are related but not identical to a neutrally evolving population in which population size has varied in a specific way in the past.
|