1
Wang Y, Allen SL, Reddiex AJ, Chenoweth SF. The impacts of positive selection on genomic variation in Drosophila serrata: Insights from a deep learning approach. Mol Ecol 2024; 33:e17499. [PMID: 39188068 DOI: 10.1111/mec.17499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 07/22/2024] [Accepted: 08/07/2024] [Indexed: 08/28/2024]
Abstract
This study explores the impact of positive selection on the genetic composition of a Drosophila serrata population in eastern Australia through a comprehensive analysis of 110 whole genome sequences. Utilizing an advanced deep learning algorithm (partialS/HIC) and a range of inferred demographic histories, we identified that approximately 14% of the genome is directly affected by sweeps, with soft sweeps being more prevalent (10.6%) than hard sweeps (2.1%), and partial sweeps being uncommon (1.3%). The algorithm demonstrated robustness to demographic assumptions in classifying complete sweeps but faced challenges in distinguishing neutral regions from partial sweeps and linked regions under demographic misspecification. The findings reveal the indirect influence of sweeps on nearly two-thirds of the genome through linkage, with an over-representation of putatively deleterious variants suggesting that positive selection drags deleterious variants to higher frequency due to hitchhiking with beneficial loci. Gene ontology enrichment analysis further supported our confidence in the accuracy of sweep detection as several traits expected to be under positive selection due to evolutionary arms races (e.g. immunity) were detected in hard sweeps. This study provides valuable insights into the direct and indirect contributions of positive selection in shaping genomic variation in natural populations.
Affiliation(s)
- Yiguan Wang
  - School of Biological Sciences, The University of Queensland, St Lucia, Queensland, Australia
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Scott L Allen
  - School of Biological Sciences, The University of Queensland, St Lucia, Queensland, Australia
- Adam J Reddiex
  - School of Biological Sciences, The University of Queensland, St Lucia, Queensland, Australia
  - Biological Data Science Institute, The Australian National University, Canberra, Australian Capital Territory, Australia
- Stephen F Chenoweth
  - School of Biological Sciences, The University of Queensland, St Lucia, Queensland, Australia
2
Schrider DR. Allelic gene conversion softens selective sweeps. bioRxiv 2023:2023.12.05.570141. [PMID: 38106127 PMCID: PMC10723294 DOI: 10.1101/2023.12.05.570141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
The prominence of positive selection, in which beneficial mutations are favored by natural selection and rapidly increase in frequency, is a subject of intense debate. Positive selection can result in selective sweeps, in which the haplotype(s) bearing the adaptive allele "sweep" through the population, thereby removing much of the genetic diversity from the region surrounding the target of selection. Two models of selective sweeps have been proposed: classical sweeps, or "hard sweeps", in which a single copy of the adaptive allele sweeps to fixation, and "soft sweeps", in which multiple distinct copies of the adaptive allele leave descendants after the sweep. Soft sweeps can be the outcome of recurrent mutation to the adaptive allele, or the presence of standing genetic variation consisting of multiple copies of the adaptive allele prior to the onset of selection. Importantly, soft sweeps will be common when populations can rapidly adapt to novel selective pressures, either because of a high mutation rate or because adaptive alleles are already present. The prevalence of soft sweeps is especially controversial, and it has been noted that selection on standing variation or recurrent mutations may not always produce soft sweeps. Here, we show that the inverse is true: selection on single-origin de novo mutations may often result in an outcome that is indistinguishable from a soft sweep. This is made possible by allelic gene conversion, which "softens" hard sweeps by copying the adaptive allele onto multiple genetic backgrounds, a process we refer to as a "pseudo-soft" sweep. We carried out a simulation study examining the impact of gene conversion on sweeps from a single de novo variant in models of human, Drosophila, and Arabidopsis populations. The fraction of simulations in which gene conversion had produced multiple haplotypes with the adaptive allele upon fixation was appreciable. 
Indeed, under realistic demographic histories and gene conversion rates, even if selection always acts on a single-origin mutation, sweeps involving multiple haplotypes are more likely than hard sweeps in large populations, especially when selection is not extremely strong. Thus, even when the mutation rate is low or there is no standing variation, hard sweeps are expected to be the exception rather than the rule in large populations. These results also imply that the presence of signatures of soft sweeps does not necessarily mean that adaptation has been especially rapid or is not mutation limited.
Affiliation(s)
- Daniel R Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
3
Iwasaki RL, Satta Y. Spatial and temporal diversity of positive selection on shared haplotypes at the PSCA locus among worldwide human populations. Heredity (Edinb) 2023; 131:156-169. [PMID: 37353592 PMCID: PMC10382566 DOI: 10.1038/s41437-023-00631-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 05/16/2023] [Accepted: 05/17/2023] [Indexed: 06/25/2023]
Abstract
Selection on standing genetic variation is important for rapid local genetic adaptation when the environment changes. We report that, for the prostate stem cell antigen (PSCA) gene, different populations have different target haplotypes, even though haplotypes are shared among populations. The C-C-A haplotype, whereby the first C is located at rs2294008 of PSCA and is a low-risk allele for gastric cancer, has become a target of positive selection in Asia. Conversely, the C-A-G haplotype carrying the same C allele has become a selection target mainly in Africa. However, Asian and African populations share both haplotypes, consistent with the haplotype divergence time (170 kya) prior to the out-of-Africa dispersal. The frequency of C-C-A/C-A-G is 0.344/0.278 in Asia and 0.209/0.416 in Africa. Two-dimensional site frequency spectrum analysis revealed that the extent of intra-allelic variability of the target haplotype is extremely small in each local population, suggesting that C-C-A or C-A-G is under ongoing hard sweeps in local populations. From the time to the most recent common ancestor (TMRCA) of selected haplotypes, the onset times of positive selection were recent (3-55 kya), concurrent with population subdivision from a common ancestor. Additionally, estimated selection coefficients from ABC analysis were up to ~3%, similar to those at other loci under recent positive selection. Phylogeny of local populations and TMRCA of selected haplotypes revealed that spatial and temporal switching of positive selection targets is a unique and novel feature of ongoing selection at PSCA. This switching may reflect the potential of rapid adaptability to distinct environments.
Affiliation(s)
- Risa L Iwasaki
  - Department of Evolutionary Studies of Biosystems, School of Advanced Science, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Kanagawa, 240-0193, Japan
  - Research Center for Integrative Evolutionary Science, SOKENDAI, Hayama, Kanagawa, 240-0193, Japan
- Yoko Satta
  - Department of Evolutionary Studies of Biosystems, School of Advanced Science, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Kanagawa, 240-0193, Japan
  - Research Center for Integrative Evolutionary Science, SOKENDAI, Hayama, Kanagawa, 240-0193, Japan
4
The nonlinear structure of linkage disequilibrium. Theor Popul Biol 2020; 134:160-170. [PMID: 32222435 DOI: 10.1016/j.tpb.2020.02.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/27/2018] [Revised: 02/15/2020] [Accepted: 02/27/2020] [Indexed: 11/23/2022]
Abstract
The allele frequency dependence of the ranges of all measures of linkage disequilibrium is well-known. The maximum values of commonly used parameters such as r² and D vary depending on the allele frequencies at each locus. However, though this phenomenon is recognized and accounted for in many studies, the comprehensive mathematical framework underlying the limits of linkage disequilibrium measures at various frequency combinations is often heuristic or empirical. Here, it is demonstrated that underlying this behavior is the fundamental shift between linear and nonlinear dependence in the linkage disequilibrium structure between loci. The proportion of linear and nonlinear dependence can be estimated and it demonstrates how even the same values of r² can have different implications for the nature of the overall dependence. One result of this is that the value of D', when defined as only a positive number, has a minimum value of |r|. Understanding this dependence is crucial to making correct inferences about the relationships between two loci in linkage disequilibrium.
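The bound mentioned in this abstract, that a non-negative D' is never smaller than |r|, can be checked numerically. The following is a minimal Python sketch and is not taken from the paper: the function name `ld_measures` and its arguments are illustrative, and it assumes two biallelic, polymorphic loci summarized by the allele frequencies p_A and p_B and the haplotype frequency p_AB, with the textbook definitions D = p_AB - p_A p_B, D' = |D| / D_max and r = D / sqrt(p_A (1 - p_A) p_B (1 - p_B)).

```python
import math


def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus  (assumed 0 < p_a < 1)
    p_b  : frequency of allele B at the second locus (assumed 0 < p_b < 1)
    The inputs are assumed to be mutually consistent frequencies.
    """
    q_a, q_b = 1.0 - p_a, 1.0 - p_b
    d = p_ab - p_a * p_b

    # The largest attainable |D| depends on the sign of D
    # and on the allele frequencies at both loci.
    if d >= 0:
        d_max = min(p_a * q_b, q_a * p_b)
    else:
        d_max = min(p_a * p_b, q_a * q_b)

    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r = d / math.sqrt(p_a * q_a * p_b * q_b)
    return d, d_prime, r


if __name__ == "__main__":
    # Unequal allele frequencies: D' reaches 1 while r stays small.
    d, d_prime, r = ld_measures(p_ab=0.1, p_a=0.1, p_b=0.5)
    print(f"D={d:.3f}  D'={d_prime:.3f}  r={r:.3f}  r^2={r * r:.3f}")
```

With p_A = 0.1, p_B = 0.5 and p_AB = 0.1, the sketch gives D' = 1 while r² is only about 0.11, which illustrates how the attainable range of r² is constrained by allele frequencies. The inequality itself follows because the denominator of r is the geometric mean of two products whose minimum is D_max.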
5
Abstract
The degree to which adaptation in recent human evolution shapes genetic variation remains controversial. This is in part due to the limited evidence in humans for classic "hard selective sweeps", wherein a novel beneficial mutation rapidly sweeps through a population to fixation. However, positive selection may often proceed via "soft sweeps" acting on mutations already present within a population. Here, we examine recent positive selection across six human populations using a powerful machine learning approach that is sensitive to both hard and soft sweeps. We found evidence that soft sweeps are widespread and account for the vast majority of recent human adaptation. Surprisingly, our results also suggest that linked positive selection affects patterns of variation across much of the genome, and may increase the frequencies of deleterious mutations. Our results also reveal insights into the role of sexual selection, cancer risk, and central nervous system development in recent human evolution.
Affiliation(s)
- Daniel R. Schrider
  - Department of Genetics, Rutgers University, Piscataway, NJ
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ
- Andrew D. Kern
  - Department of Genetics, Rutgers University, Piscataway, NJ
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ
6
Hermisson J, Pennings PS. Soft sweeps and beyond: understanding the patterns and probabilities of selection footprints under rapid adaptation. Methods Ecol Evol 2017. [DOI: 10.1111/2041-210x.12808] [Citation(s) in RCA: 186] [Impact Index Per Article: 26.6] [Indexed: 12/18/2022]
Affiliation(s)
- Joachim Hermisson
  - Department of Mathematics and Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
- Pleuni S. Pennings
  - Department of Biology, San Francisco State University, San Francisco, CA, USA
7
Korunes KL, Noor MAF. Gene conversion and linkage: effects on genome evolution and speciation. Mol Ecol 2016; 26:351-364. [DOI: 10.1111/mec.13736] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 01/15/2016] [Revised: 06/07/2016] [Accepted: 06/22/2016] [Indexed: 12/12/2022]
8
Soft shoulders ahead: spurious signatures of soft and partial selective sweeps result from linked hard sweeps. Genetics 2015; 200:267-84. [PMID: 25716978 DOI: 10.1534/genetics.115.174912] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 01/26/2015] [Accepted: 02/20/2015] [Indexed: 11/18/2022]
Abstract
Characterizing the nature of the adaptive process at the genetic level is a central goal for population genetics. In particular, we know little about the sources of adaptive substitution or about the number of adaptive variants currently segregating in nature. Historically, population geneticists have focused attention on the hard-sweep model of adaptation in which a de novo beneficial mutation arises and rapidly fixes in a population. Recently more attention has been given to soft-sweep models, in which alleles that were previously neutral, or nearly so, drift until such a time as the environment shifts and their selection coefficient changes to become beneficial. It remains an active and difficult problem, however, to tease apart the telltale signatures of hard vs. soft sweeps in genomic polymorphism data. Through extensive simulations of hard- and soft-sweep models, here we show that indeed the two might not be separable through the use of simple summary statistics. In particular, it seems that recombination in regions linked to, but distant from, sites of hard sweeps can create patterns of polymorphism that closely mirror what is expected to be found near soft sweeps. We find that a very similar situation arises when using haplotype-based statistics that are aimed at detecting partial or ongoing selective sweeps, such that it is difficult to distinguish the shoulder of a hard sweep from the center of a partial sweep. While knowing the location of the selected site mitigates this problem slightly, we show that stochasticity in signatures of natural selection will frequently cause the signal to reach its zenith far from this site and that this effect is more severe for soft sweeps; thus inferences of the target as well as the mode of positive selection may be inaccurate. In addition, both the time since a sweep ends and biologically realistic levels of allelic gene conversion lead to errors in the classification and identification of selective sweeps. 
This general problem of "soft shoulders" underscores the difficulty in differentiating soft and partial sweeps from hard-sweep scenarios in molecular population genomics data. The soft-shoulder effect also implies that the more common hard sweeps have been in recent evolutionary history, the more prevalent spurious signatures of soft or partial sweeps may appear in some genome-wide scans.
9
Joost S, Vuilleumier S, Jensen JD, Schoville S, Leempoel K, Stucki S, Widmer I, Melodelima C, Rolland J, Manel S. Uncovering the genetic basis of adaptive change: on the intersection of landscape genomics and theoretical population genetics. Mol Ecol 2013; 22:3659-65. [DOI: 10.1111/mec.12352] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Indexed: 12/20/2022]
Affiliation(s)
- Stéphane Joost
  - Laboratory of Geographic Information Systems (LASIG); School of Civil and Environmental Engineering (ENAC); École Polytechnique Fédérale de Lausanne (EPFL); Bâtiment GC Station 18 1015 Lausanne Switzerland
- Séverine Vuilleumier
  - Department of Ecology and Evolution; University of Lausanne; Biophore Building 1015 Lausanne Switzerland
- Jeffrey D. Jensen
  - Institute of Bioengineering, School of Life Sciences; École Polytechnique Fédérale de Lausanne (EPFL); Lausanne Switzerland
  - Swiss Institute of Bioinformatics; 1015 Lausanne Switzerland
- Sean Schoville
  - CNRS, TIMC-IMAG UMR 5525; Université Joseph Fourier; 38041 Grenoble France
- Kevin Leempoel
  - Laboratory of Geographic Information Systems (LASIG); School of Civil and Environmental Engineering (ENAC); École Polytechnique Fédérale de Lausanne (EPFL); Bâtiment GC Station 18 1015 Lausanne Switzerland
- Sylvie Stucki
  - Laboratory of Geographic Information Systems (LASIG); School of Civil and Environmental Engineering (ENAC); École Polytechnique Fédérale de Lausanne (EPFL); Bâtiment GC Station 18 1015 Lausanne Switzerland
- Ivo Widmer
  - Laboratory of Geographic Information Systems (LASIG); School of Civil and Environmental Engineering (ENAC); École Polytechnique Fédérale de Lausanne (EPFL); Bâtiment GC Station 18 1015 Lausanne Switzerland
- Christelle Melodelima
  - Laboratoire d'Ecologie Alpine; UMR-CNRS 5553; Université Joseph Fourier; 38041 Grenoble France
- Jonathan Rolland
  - Centre de mathématiques appliquées; Ecole Polytechnique; 91128 Palaiseau Cedex France
- Stéphanie Manel
  - Laboratoire Population Environnement Développement; UMR 151 UP/IRD; Université Aix Marseille; 3 place Victor Hugo 13331 Marseille Cedex 03 France
  - UMR BotAnique et BioinforMatique de l'Architecture des Plantes (AMAP); TA A51/PS2 34398 Montpellier Cedex 5 France
10
Crisci JL, Poh YP, Bean A, Simkin A, Jensen JD. Recent progress in polymorphism-based population genetic inference. J Hered 2012; 103:287-96. [PMID: 22246406 DOI: 10.1093/jhered/esr128] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 12/12/2022]
Abstract
The recent availability of whole-genome sequencing data affords tremendous power for statistical inference. With this, there has been great interest in the development of polymorphism-based approaches for the estimation of population genetic parameters. These approaches seek to estimate, for example, recently fixed or sweeping beneficial mutations, the rate of recurrent positive selection, the distribution of selection coefficients, and the demographic history of the population. Yet despite estimating similar parameters using similar data sets, results between methodologies are far from consistent. We here summarize the current state of the field, compare existing approaches, and attempt to reconcile emerging discrepancies. We also discuss the biases in selection estimators introduced by ignoring the demographic history of the population, discuss the biases in demographic estimators introduced by assuming neutrality, and highlight the important challenge to the field of achieving a true joint estimation procedure to circumvent these confounding effects.
Affiliation(s)
- Jessica L Crisci
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA
11
Ptak SE, Enard W, Wiebe V, Hellmann I, Krause J, Lachmann M, Pääbo S. Linkage disequilibrium extends across putative selected sites in FOXP2. Mol Biol Evol 2009; 26:2181-4. [PMID: 19608635 DOI: 10.1093/molbev/msp143] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
Polymorphism data in humans suggest that the gene encoding the transcription factor FOXP2, which influences speech and language development, has been subject to a selective sweep within the last 260,000 years. It has been proposed that one or both of two substitutions that occurred on the human evolutionary lineage and changed amino acids were the targets for selection. In apparent contradiction to this is the observation that these substitutions are present in Neandertals, who diverged from humans maybe 300,000-400,000 years ago. We have collected polymorphism data upstream and downstream of the substitutions. Contrary to what is expected following a selective sweep, we find that the haplotypes extend across the two sites. We discuss possible explanations for these observations. One of them is that the selective sweep reflected in FOXP2 polymorphism data was not associated with the two amino acid substitutions.