1
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. Evolution 2023; 77:2113-2127. [PMID: 37395482 PMCID: PMC10547124 DOI: 10.1093/evolut/qpad120] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/20/2023] [Revised: 06/15/2023] [Accepted: 06/30/2023] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modeled by a realistic mutation rate and as part of a realistic distribution of fitness effects, as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modeled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false-positive rates are in excess of true-positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
2
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. bioRxiv 2023:2023.06.15.545166. [PMID: 37398347 PMCID: PMC10312679 DOI: 10.1101/2023.06.15.545166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modelled by a realistic mutation rate and as part of a realistic distribution of fitness effects (DFE), as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modelled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false positive rates are in excess of true positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Teaser Text
Outlier-based genomic scans have proven a popular approach for identifying loci that have potentially experienced recent positive selection. However, it has previously been shown that an evolutionarily appropriate baseline model that incorporates non-equilibrium population histories, purifying and background selection, and variation in mutation and recombination rates is necessary to reduce often extreme false positive rates when performing genomic scans. Here we evaluate the power to detect recurrent selective sweeps using common SFS-based and haplotype-based methods under these increasingly realistic models. We find that while these appropriate evolutionary baselines are essential to reduce false positive rates, the power to accurately detect recurrent selective sweeps is generally low across much of the biologically relevant parameter space.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Present address: Department of Biology, Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
3
Jensen JD. Population genetic concerns related to the interpretation of empirical outliers and the neglect of common evolutionary processes. Heredity (Edinb) 2023; 130:109-110. [PMID: 36829044 PMCID: PMC9981695 DOI: 10.1038/s41437-022-00575-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/19/2022] [Revised: 10/27/2022] [Accepted: 10/28/2022] [Indexed: 02/26/2023] Open
Affiliation(s)
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
4
Laval G, Patin E, Boutillier P, Quintana-Murci L. Sporadic occurrence of recent selective sweeps from standing variation in humans as revealed by an approximate Bayesian computation approach. Genetics 2021; 219:6377789. [PMID: 34849862 DOI: 10.1093/genetics/iyab161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/20/2021] [Accepted: 09/01/2021] [Indexed: 12/14/2022] Open
Abstract
During their dispersals over the last 100,000 years, modern humans have been exposed to a large variety of environments, resulting in genetic adaptation. While genome-wide scans for the footprints of positive Darwinian selection have increased knowledge of genes and functions potentially involved in human local adaptation, they have globally produced evidence of a limited contribution of selective sweeps in humans. Conversely, studies based on machine learning algorithms suggest that recent sweeps from standing variation are widespread in humans, an observation that has been recently questioned. Here, we sought to formally quantify the number of recent selective sweeps in humans, by leveraging approximate Bayesian computation and whole-genome sequence data. Our computer simulations revealed suitable ABC estimations, regardless of the frequency of the selected alleles at the onset of selection and the completion of sweeps. Under a model of recent selection from standing variation, we inferred that an average of 68 (from 56 to 79) and 140 (from 94 to 198) sweeps occurred over the last 100,000 years of human history, in African and Eurasian populations, respectively. The former estimation is compatible with human adaptation rates estimated since divergence with chimps, and reveals numbers of sweeps per generation per site in the range of values estimated in Drosophila. Our results confirm the rarity of selective sweeps in humans and show a low contribution of sweeps from standing variation to recent human adaptation.
Affiliation(s)
- Guillaume Laval
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
- Etienne Patin
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
- Pierre Boutillier
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
- Human Genomics and Evolution, Collège de France, 75005 Paris, France
5
Johri P, Charlesworth B, Howell EK, Lynch M, Jensen JD. Revisiting the notion of deleterious sweeps. Genetics 2021; 219:iyab094. [PMID: 34125884 PMCID: PMC9101445 DOI: 10.1093/genetics/iyab094] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/29/2021] [Accepted: 06/08/2021] [Indexed: 11/14/2022] Open
Abstract
It has previously been shown that, conditional on its fixation, the time to fixation of a semi-dominant deleterious autosomal mutation in a randomly mating population is the same as that of an advantageous mutation. This result implies that deleterious mutations could generate selective sweep-like effects. Although their fixation probabilities greatly differ, the much larger input of deleterious relative to beneficial mutations suggests that this phenomenon could be important. We here examine how the fixation of mildly deleterious mutations affects levels and patterns of polymorphism at linked sites, both in the presence and absence of interference amongst deleterious mutations, and how this class of sites may contribute to divergence between populations and species. We find that, while deleterious fixations are unlikely to represent a significant proportion of outliers in polymorphism-based genomic scans within populations, minor shifts in the frequencies of deleterious mutations can influence the proportions of private variants and the value of FST after a recent population split. As sites subject to deleterious mutations are necessarily found in functional genomic regions, interpretations in terms of recurrent positive selection may require reconsideration.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FL, UK
- Emma K Howell
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Michael Lynch
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
6
Gulisija D, Kim Y. Emergence of long-term balanced polymorphism under cyclic selection of spatially variable magnitude. Evolution 2015; 69:979-92. [PMID: 25707330 DOI: 10.1111/evo.12630] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 04/09/2014] [Accepted: 02/15/2015] [Indexed: 01/09/2023]
Abstract
A fundamental question in evolutionary biology is what promotes genetic variation at nonneutral loci, a major precursor to adaptation in changing environments. In particular, balanced polymorphism under realistic evolutionary models of temporally varying environments in finite natural populations remains to be demonstrated. Here, we propose a novel mechanism of balancing selection under temporally varying fitnesses. Using forward-in-time computer simulations and mathematical analysis, we show that cyclic selection that spatially varies in magnitude, such as along an environmental gradient, can lead to elevated levels of nonneutral genetic polymorphism in finite populations. Balanced polymorphism is more likely with an increase in gene flow, magnitude and period of fitness oscillations, and spatial heterogeneity. This polymorphism-promoting effect is robust to small systematic fitness differences between competing alleles or to random environmental perturbation. Furthermore, we demonstrate analytically that protected polymorphism arises as spatially heterogeneous cyclic fitness oscillations generate a type of storage effect that leads to negative frequency dependent selection. Our findings imply that spatially variable cyclic environments can promote elevated levels of nonneutral genetic variation in natural populations.
Affiliation(s)
- Davorka Gulisija
- Department of Zoology, University of Wisconsin, Madison, Wisconsin, 53706; Current Address: Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, 19104
7
Schneider A, Charlesworth B, Eyre-Walker A, Keightley PD. A method for inferring the rate of occurrence and fitness effects of advantageous mutations. Genetics 2011; 189:1427-37. [PMID: 21954160 PMCID: PMC3241409 DOI: 10.1534/genetics.111.131730] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.3] [Received: 06/16/2011] [Accepted: 09/24/2011] [Indexed: 11/18/2022] Open
Abstract
The distribution of fitness effects (DFE) of new mutations is of fundamental importance in evolutionary genetics. Recently, methods have been developed for inferring the DFE that use information from the allele frequency distributions of putatively neutral and selected nucleotide polymorphic variants in a population sample. Here, we extend an existing maximum-likelihood method that estimates the DFE under the assumption that mutational effects are unconditionally deleterious, by including a fraction of positively selected mutations. We allow one or more classes of positive selection coefficients in the model and estimate both the fraction of mutations that are advantageous and the strength of selection acting on them. We show by simulations that the method is capable of recovering the parameters of the DFE under a range of conditions. We apply the method to two data sets on multiple protein-coding genes from African populations of Drosophila melanogaster. We use a probabilistic reconstruction of the ancestral states of the polymorphic sites to distinguish between derived and ancestral states at polymorphic nucleotide sites. In both data sets, we see a significant improvement in the fit when a category of positively selected amino acid mutations is included, but no further improvement if additional categories are added. We estimate that between 1% and 2% of new nonsynonymous mutations in D. melanogaster are positively selected, with a scaled selection coefficient representing the product of the effective population size, Ne, and the strength of selection on heterozygous carriers of ∼2.5.
Affiliation(s)
- Adrian Schneider
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton BN1 9QG, United Kingdom
- Peter D. Keightley
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
8
Jensen JD, Bachtrog D. Characterizing the influence of effective population size on the rate of adaptation: Gillespie's Darwin domain. Genome Biol Evol 2011; 3:687-701. [PMID: 21705473 PMCID: PMC3157839 DOI: 10.1093/gbe/evr063] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Indexed: 02/04/2023] Open
Abstract
Characterizing the role of effective population size in dictating the rate of adaptive evolution remains a major challenge in evolutionary biology. Depending on the underlying distribution of fitness effects of new mutations, populations of different sizes may differ vastly in their rate of adaptation. Here, we collect polymorphism data at over 100 loci for two closely related Drosophila species with different current effective population sizes (Ne), Drosophila miranda and D. pseudoobscura, to evaluate the prevalence of adaptive evolution versus genetic drift in molecular evolution. Utilizing these large and consistently sampled data sets, we obtain greatly improved estimates of the demographic histories of both species. Specifically, although current Ne differs between these species, their ancestral sizes were much more similar. We find that statistical approaches capturing recent adaptive evolution (using patterns of polymorphisms) detect higher rates of adaptive evolution in the larger D. pseudoobscura population. In contrast, methods aimed at detecting selection over longer time periods (i.e., those relying on divergence data) estimate more similar rates of adaptation between the two species. Thus, our results suggest an important role of effective population size in dictating rates of adaptation and highlight how complicated population histories (as is probably the case for most species) can affect rates of adaptation. Additionally, we show how different methodologies to detect positive selection can reveal information about different timescales of adaptive evolution.
Affiliation(s)
- Jeffrey D Jensen
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.
9
Abstract
Perennial plants monitor seasonal changes through changes in environmental conditions such as the quantity and quality of light, and genes in the photoperiodic pathway are known to be involved in controlling these processes. Here, we examine 25 genes from the photoperiod pathway in Populus tremula (Salicaceae) for signatures of adaptive evolution. Overall, levels of synonymous polymorphism in the 25 genes are lower than at control loci selected randomly from the genome. This appears primarily to be caused by lower levels of synonymous polymorphism in genes associated with the circadian clock. Natural selection appears to play an important role in shaping protein evolution at several of the genes in the photoperiod pathway, which is highlighted by the fact that approximately 40% of the genes from the photoperiod pathway have estimates of selection on nonsynonymous polymorphisms that are significantly different from zero. A surprising observation we make is that circadian clock-associated genes appear to be over-represented among the genes showing elevated rates of protein evolution; seven genes are evolving under positive selection and all but one of these genes are involved in the circadian clock of Populus.
Affiliation(s)
- David Hall
- Umeå Plant Science Centre, Department of Ecology and Environmental Science, Umeå University, Umeå, Sweden