1
Soni V, Jensen JD. Temporal challenges in detecting balancing selection from population genomic data. G3 (Bethesda) 2024; 14:jkae069. [PMID: 38551137 DOI: 10.1093/g3journal/jkae069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 12/21/2023] [Accepted: 03/19/2024] [Indexed: 04/28/2024]
Abstract
The role of balancing selection in maintaining genetic variation remains an open question in population genetics. Recent years have seen numerous studies identifying candidate loci potentially experiencing balancing selection, most predominantly in human populations. There are however numerous alternative evolutionary processes that may leave similar patterns of variation, thereby potentially confounding inference, and the expected signatures of balancing selection additionally change in a temporal fashion. Here we use forward-in-time simulations to quantify expected statistical power to detect balancing selection using both site frequency spectrum- and linkage disequilibrium-based methods under a variety of evolutionarily realistic null models. We find that whilst site frequency spectrum-based methods have little power immediately after a balanced mutation begins segregating, power increases with time since the introduction of the balanced allele. Conversely, linkage disequilibrium-based methods have considerable power whilst the allele is young, and power dissipates rapidly as the time since introduction increases. Taken together, this suggests that site frequency spectrum-based methods are most effective at detecting long-term balancing selection (>25N generations since the introduction of the balanced allele) whilst linkage disequilibrium-based methods are effective over much shorter timescales (<1N generations), thereby leaving a large time frame over which current methods have little power to detect the action of balancing selection. Finally, we investigate the extent to which alternative evolutionary processes may mimic these patterns, and demonstrate the need for caution in attempting to distinguish the signatures of balancing selection from those of both neutral processes (e.g. population structure and admixture) as well as of alternative selective processes (e.g. partial selective sweeps).
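As a rough illustration of the kind of quantity that linkage disequilibrium-based methods build on, the sketch below computes the pairwise r² statistic between two biallelic sites from phased 0/1 haplotypes. It is a generic textbook statistic and not one of the specific methods evaluated in the study above; the function name `ld_r2` and the toy inputs are assumptions made for this example.

```python
def ld_r2(site_a, site_b):
    """Pairwise LD (r^2) between two biallelic sites.

    site_a, site_b: equal-length sequences of 0/1 alleles, one entry per
    phased haplotype (hypothetical input layout chosen for this sketch).
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n                                  # frequency of allele 1 at site A
    p_b = sum(site_b) / n                                  # frequency of allele 1 at site B
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n  # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b                                   # coefficient of disequilibrium D
    return d * d / denom


# Toy usage: perfectly associated sites versus independent sites.
perfect = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])
independent = ld_r2([0, 1, 0, 1], [0, 0, 1, 1])
```

Values near 1 indicate strongly associated sites and values near 0 indicate independence; scans for selection typically summarise such pairwise values over windows rather than inspecting single pairs.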
Affiliation(s)
- Vivak Soni
  - School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ 85281, USA
- Jeffrey D Jensen
  - School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ 85281, USA
2
Schlichta F, Moinet A, Peischl S, Excoffier L. The Impact of Genetic Surfing on Neutral Genomic Diversity. Mol Biol Evol 2022; 39:msac249. [PMID: 36403964 PMCID: PMC9703594 DOI: 10.1093/molbev/msac249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Range expansions have been common in the history of most species. Serial founder effects and subsequent population growth at expansion fronts typically lead to a loss of genomic diversity along the expansion axis. A frequent consequence is the phenomenon of "gene surfing," where variants located near the expanding front can reach high frequencies or even fix in newly colonized territories. Although gene surfing events have been characterized thoroughly for a specific locus, their effects on linked genomic regions and the overall patterns of genomic diversity have been little investigated. In this study, we simulated the evolution of whole genomes during several types of 1D and 2D range expansions differing by the extent of migration, founder events, and recombination rates. We focused on the characterization of local dips of diversity, or "troughs," taken as a proxy for surfing events. We find that, for a given recombination rate, once we consider the amount of diversity lost since the beginning of the expansion, it is possible to predict the initial evolution of trough density and their average width irrespective of the expansion condition. Furthermore, when recombination rates vary across the genome, we find that troughs are over-represented in regions of low recombination. Therefore, range expansions can leave local and global genomic signatures often interpreted as evidence of past selective events. Given the generality of our results, they could be used as a null model for species having gone through recent expansions, and thus be helpful to correctly interpret many evolutionary biology studies.
Affiliation(s)
- Flávia Schlichta
  - Computational and Molecular Population Genetics lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Antoine Moinet
  - Computational and Molecular Population Genetics lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Interfaculty Bioinformatics Unit, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
- Stephan Peischl
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Interfaculty Bioinformatics Unit, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
- Laurent Excoffier
  - Computational and Molecular Population Genetics lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
3
De-la-Cruz IM, Batsleer F, Bonte D, Diller C, Hytönen T, Muola A, Osorio S, Posé D, Vandegehuchte ML, Stenberg JA. Evolutionary Ecology of Plant-Arthropod Interactions in Light of the "Omics" Sciences: A Broad Guide. Front Plant Sci 2022; 13:808427. [PMID: 35548276 PMCID: PMC9084618 DOI: 10.3389/fpls.2022.808427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Accepted: 04/01/2022] [Indexed: 06/15/2023]
Abstract
Aboveground plant-arthropod interactions are typically complex, involving herbivores, predators, pollinators, and various other guilds that can strongly affect plant fitness, directly or indirectly, and individually, synergistically, or antagonistically. However, little is known about how ongoing natural selection by these interacting guilds shapes the evolution of plants, i.e., how they affect the differential survival and reproduction of genotypes due to differences in phenotypes in an environment. Recent technological advances, including next-generation sequencing, metabolomics, and gene-editing technologies along with traditional experimental approaches (e.g., quantitative genetics experiments), have enabled far more comprehensive exploration of the genes and traits involved in complex ecological interactions. Connecting different levels of biological organization (genes to communities) will enhance the understanding of evolutionary interactions in complex communities, but this requires a multidisciplinary approach. Here, we review traditional and modern methods and concepts, then highlight future avenues for studying the evolution of plant-arthropod interactions (e.g., plant-herbivore-pollinator interactions). Besides promoting a fundamental understanding of plant-associated arthropod communities' genetic background and evolution, such knowledge can also help address many current global environmental challenges.
Affiliation(s)
- Ivan M. De-la-Cruz
  - Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Femke Batsleer
  - Terrestrial Ecology Unit, Department of Biology, Ghent University, Ghent, Belgium
- Dries Bonte
  - Terrestrial Ecology Unit, Department of Biology, Ghent University, Ghent, Belgium
- Carolina Diller
  - Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Timo Hytönen
  - Department of Agricultural Sciences, Viikki Plant Science Centre, University of Helsinki, Helsinki, Finland
  - NIAB EMR, West Malling, United Kingdom
- Anne Muola
  - Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden
  - Biodiversity Unit, University of Turku, Finland
- Sonia Osorio
  - Departamento de Biología Molecular y Bioquímica, Instituto de Hortofruticultura Subtropical y Mediterránea “La Mayora”, Universidad de Málaga-Consejo Superior de Investigaciones Científicas, Campus de Teatinos, Málaga, Spain
- David Posé
  - Departamento de Biología Molecular y Bioquímica, Instituto de Hortofruticultura Subtropical y Mediterránea “La Mayora”, Universidad de Málaga-Consejo Superior de Investigaciones Científicas, Campus de Teatinos, Málaga, Spain
- Martijn L. Vandegehuchte
  - Terrestrial Ecology Unit, Department of Biology, Ghent University, Ghent, Belgium
  - Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
- Johan A. Stenberg
  - Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden
4
DeGiorgio M, Szpiech ZA. A spatially aware likelihood test to detect sweeps from haplotype distributions. PLoS Genet 2022; 18:e1010134. [PMID: 35404934 PMCID: PMC9022890 DOI: 10.1371/journal.pgen.1010134] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/18/2021] [Revised: 04/21/2022] [Accepted: 03/04/2022] [Indexed: 01/13/2023]
Abstract
The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the “width” of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software.
Identifying regions of the genome that contain adaptive variation is of fundamental interest in evolutionary biology, providing insight into an organism’s history and biology. When positive selection is recent or ongoing, we expect to find genomic patterns such as high frequency haplotypes and low genetic diversity in the vicinity of the adaptive locus. Here we develop a statistic to identify these regions based on distortions of the haplotype frequency spectrum from a background distribution. We evaluate the performance of this statistic under numerous realistic settings of interest to empiricists and demonstrate its superior performance relative to other haplotype-based selection statistics. We also apply this statistic to real population-genetic data. As a positive control, we explore two well-studied loci, LCT and MHC, in a European and an African human population that show strong evidence for selection. We also apply this statistic to the genomes of an urban brown rat population, where we uncover evidence for adaptation in olfactory perception genes. We release user-friendly software implementing this statistic.
Affiliation(s)
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America
- Zachary A. Szpiech
  - Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Institute for Computational and Data Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
5
Angst P, Ebert D, Fields PD. Demographic history shapes genomic variation in an intracellular parasite with a wide geographic distribution. Mol Ecol 2022; 31:2528-2544. [DOI: 10.1111/mec.16419] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/04/2021] [Revised: 02/14/2022] [Accepted: 02/28/2022] [Indexed: 11/27/2022]
Affiliation(s)
- Pascal Angst
  - Department of Environmental Sciences, Zoology, University of Basel, Vesalgasse 1, 4051 Basel, Switzerland
- Dieter Ebert
  - Department of Environmental Sciences, Zoology, University of Basel, Vesalgasse 1, 4051 Basel, Switzerland
- Peter D. Fields
  - Department of Environmental Sciences, Zoology, University of Basel, Vesalgasse 1, 4051 Basel, Switzerland
6
Vasilarou M, Alachiotis N, Garefalaki J, Beloukas A, Pavlidis P. Population Genomics Insights into the First Wave of COVID-19. Life (Basel) 2021; 11:129. [PMID: 33562321 PMCID: PMC7914631 DOI: 10.3390/life11020129] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/23/2020] [Revised: 01/30/2021] [Accepted: 02/04/2021] [Indexed: 01/09/2023]
Abstract
Full-genome-sequence computational analyses of the SARS-coronavirus (CoV)-2 genomes allow us to understand the evolutionary events and adaptability mechanisms. We used population genetics analyses on human SARS-CoV-2 genomes available on 2 April 2020 to infer the mutation rate and plausible recombination events between the Betacoronavirus genomes in nonhuman hosts that may have contributed to the evolution of SARS-CoV-2. Furthermore, we localized the targets of recent and strong, positive selection during the first pandemic wave. The genomic regions that appear to be under positive selection are largely co-localized with regions in which recombination from nonhuman hosts took place. Our results suggest that the pangolin coronavirus genome may have contributed to the SARS-CoV-2 genome by recombination with the bat coronavirus genome. However, we find evidence for additional recombination events that involve coronavirus genomes from other hosts, i.e., hedgehogs and sparrows. We further infer that recombination may have recently occurred within human hosts. Finally, we estimate the parameters of a demographic scenario involving an exponential growth of the size of the SARS-CoV-2 populations that have infected European, Asian, and Northern American cohorts, and we demonstrate that a rapid exponential growth in population size from the first wave can support the observed polymorphism patterns in SARS-CoV-2 genomes.
Affiliation(s)
- Maria Vasilarou
  - Foundation for Research and Technology Hellas (FORTH) and Department of Biology, Institute of Molecular Biology and Biotechnology (IMBB), University of Crete, 70013 Crete, Greece
- Joanna Garefalaki
  - Institute of Computer Science (ICS), Foundation for Research and Technology Hellas (FORTH), 70013 Heraklion, Greece
- Apostolos Beloukas
  - Department of Biomedical Sciences, University of West Attica, 12243 Athens, Greece
  - Institute of Infection and Global Health, University of Liverpool, Liverpool L69 7BE, UK
- Pavlos Pavlidis
  - Institute of Computer Science (ICS), Foundation for Research and Technology Hellas (FORTH), 70013 Heraklion, Greece
7
Nakagome S, Hudson RR, Di Rienzo A. Inferring the model and onset of natural selection under varying population size from the site frequency spectrum and haplotype structure. Proc Biol Sci 2020; 286:20182541. [PMID: 30963935 DOI: 10.1098/rspb.2018.2541] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/27/2023]
Abstract
A fundamental question about adaptation in a population is the time of onset of the selective pressure acting on beneficial alleles. Inferring this time, in turn, depends on the selection model. We develop a framework of approximate Bayesian computation (ABC) that enables the use of the full site frequency spectrum and haplotype structure to test the goodness-of-fit of selection models and estimate the timing of selection under varying population size scenarios. We show that our method has sufficient power to distinguish natural selection from neutrality even if relatively old selection increased the frequency of a pre-existing allele from 20% to 50% or from 40% to 80%. Our ABC can accurately estimate the time of onset of selection on a new mutation. However, estimates are prone to bias under the standing variation model, possibly due to the uncertainty in the allele frequency at the onset of selection. We further extend our approach to take advantage of ancient DNA data that provides information on the allele frequency path of the beneficial allele. Applying our ABC, including both modern and ancient human DNA data, to four pigmentation alleles in Europeans, we detected selection on standing variants that occurred after the dispersal from Africa even though models of selection on a new mutation were initially supported for two of these alleles without the ancient data.
Affiliation(s)
- Shigeki Nakagome
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - School of Medicine, Faculty of Health Sciences, Trinity College Dublin, the University of Dublin, Dublin, Ireland
- Richard R Hudson
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Department of Ecology & Evolution, University of Chicago, Chicago, IL, USA
- Anna Di Rienzo
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
8
Chevin LM. Selective Sweep at a QTL in a Randomly Fluctuating Environment. Genetics 2019; 213:987-1005. [PMID: 31527049 PMCID: PMC6827380 DOI: 10.1534/genetics.119.302680] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/22/2019] [Accepted: 09/16/2019] [Indexed: 01/01/2023]
Abstract
Adaptation is mediated by phenotypic traits that are often near continuous, and undergo selective pressures that may change with the environment. The dynamics of allelic frequencies at underlying quantitative trait loci (QTL) depend on their own phenotypic effects, but also possibly on other polymorphic loci affecting the same trait, and on environmental change driving phenotypic selection. Most environments include a substantial component of random noise, characterized both by its magnitude and its temporal autocorrelation, which sets the timescale of environmental predictability. I investigate the dynamics of a mutation affecting a quantitative trait in an autocorrelated stochastic environment that causes random fluctuations of an optimum phenotype. The trait under selection may also exhibit background polygenic variance caused by many polymorphic loci of small effects elsewhere in the genome. In addition, the mutation at the QTL may affect phenotypic plasticity, the phenotypic response of a given genotype to its environment of development or expression. Stochastic environmental fluctuations increase the variance of the evolutionary process, with consequences for the probability of a complete sweep at the QTL. Background polygenic variation critically alters this process, by setting an upper limit to stochastic variance of population genetics at the QTL. For a plasticity QTL, stochastic fluctuations also influence the expected selection coefficient, and alleles with the same expected trajectory can have very different stochastic variances. Finally, a mutation may be favored through its effect on plasticity despite causing a systematic mismatch with the optimum, which is compensated by evolution of the mean background phenotype.
Affiliation(s)
- Luis-Miguel Chevin
  - Centre d'Ecologie Fonctionnelle et Evolutive (CEFE), CNRS, University of Montpellier, University of Paul Valéry Montpellier 3, EPHE, IRD, France
9
Abstract
For almost 20 years, many inference methods have been developed to detect selective sweeps and localize the targets of directional selection in the genome. These methods are based on population genetic models that describe the effect of a beneficial allele (e.g., a new mutation) on linked neutral variation (driven by directional selection from a single copy to fixation). Here, I discuss these models, ranging from selective sweeps in a panmictic population of constant size to evolutionary traffic when simultaneous sweeps at multiple loci interfere, and emphasize the important role of demography and population structure in data analysis. In the past 10 years, soft sweeps that may arise after an environmental change from directional selection on standing variation have become a focus of population genetic research. In contrast to selective sweeps, they are caused by beneficial alleles that were neutrally segregating in a population before the environmental change or were present at a mutation-selection balance in appreciable frequency.
10
Detecting Recent Positive Selection with a Single Locus Test Bipartitioning the Coalescent Tree. Genetics 2017; 208:791-805. [PMID: 29217523 DOI: 10.1534/genetics.117.300401] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/14/2017] [Accepted: 12/01/2017] [Indexed: 01/09/2023]
Abstract
Many population genomic studies have been conducted in the past to search for traces of recent events of positive selection. These traces, however, can be obscured by temporal variation of population size or other demographic factors. To reduce the confounding impact of demography, the coalescent tree topology has been used as an additional source of information for detecting recent positive selection in a population or a species. Based on the branching pattern at the root, we partition the hypothetical coalescent tree, inferred from a sequence sample, into two subtrees. The reasoning is that positive selection could impose a strong impact on branch length in one of the two subtrees while demography has the same effect on average on both subtrees. Thus, positive selection should be detectable by comparing statistics calculated for the two subtrees. Simulations demonstrate that the proposed test based on these principles has high power to detect recent positive selection even when DNA polymorphism data from only one locus is available, and that it is robust to the confounding effect of demography. One feature is that all components in the summary statistics ([Formula: see text]) can be computed analytically. Moreover, misinference of derived and ancestral alleles is seen to have only a limited effect on the test, and it therefore avoids a notorious problem when searching for traces of recent positive selection.
11
Pavlidis P, Alachiotis N. A survey of methods and tools to detect recent and strong positive selection. J Biol Res (Thessalon) 2017; 24:7. [PMID: 28405579 PMCID: PMC5385031 DOI: 10.1186/s40709-017-0064-0] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 08/11/2016] [Accepted: 03/29/2017] [Indexed: 01/25/2023]
Abstract
Positive selection occurs when an allele is favored by natural selection. The frequency of the favored allele increases in the population and due to genetic hitchhiking the neighboring linked variation diminishes, creating so-called selective sweeps. Detecting traces of positive selection in genomes is achieved by searching for signatures introduced by selective sweeps, such as regions of reduced variation, a specific shift of the site frequency spectrum, and particular LD patterns in the region. A variety of methods and tools can be used for detecting sweeps, ranging from simple implementations that compute summary statistics such as Tajima's D, to more advanced statistical approaches that use combinations of statistics, maximum likelihood, machine learning etc. In this survey, we present and discuss summary statistics and software tools, and classify them based on the selective sweep signature they detect, i.e., SFS-based vs. LD-based, as well as their capacity to analyze whole genomes or just subgenomic regions. Additionally, we summarize the results of comparisons among four open-source software releases (SweeD, SweepFinder, SweepFinder2, and OmegaPlus) regarding sensitivity, specificity, and execution times. In equilibrium neutral models or mild bottlenecks, both SFS- and LD-based methods are able to detect selective sweeps accurately. Methods and tools that rely on LD exhibit higher true positive rates than SFS-based ones under the model of a single sweep or recurrent hitchhiking. However, their false positive rate is elevated when a misspecified demographic model is used to represent the null hypothesis. When the correct (or similar to the correct) demographic model is used instead, the false positive rates are considerably reduced. The accuracy of detecting the true target of selection is decreased in bottleneck scenarios. In terms of execution time, LD-based methods are typically faster than SFS-based methods, due to the nature of required arithmetic.
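The survey above names Tajima's D as a simple SFS-based summary statistic. A minimal sketch of the standard textbook formula follows, assuming an unfolded site frequency spectrum as input. The function name `tajimas_d` and the input layout are illustrative assumptions and are not taken from SweeD, SweepFinder, SweepFinder2, or OmegaPlus.

```python
import math


def tajimas_d(sfs):
    """Tajima's D from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of segregating sites at which the derived
    allele is carried by i of the n sampled sequences (i = 1 .. n - 1).
    This input layout is an assumption made for the sketch.
    """
    n = len(sfs) + 1
    s = sum(sfs)                      # number of segregating sites
    if n < 4 or s == 0:
        return float("nan")           # D is undefined or degenerate here

    # Mean number of pairwise differences implied by the spectrum.
    pi = sum(count * 2 * i * (n - i) for i, count in enumerate(sfs, start=1))
    pi /= n * (n - 1)

    # Standard constants of the variance approximation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # Difference between the two estimators of theta, scaled by its
    # approximate standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy usage with n = 5 sequences (four frequency classes).
neutral_like = tajimas_d([12, 6, 4, 3])      # counts proportional to 1/i
singleton_excess = tajimas_d([10, 0, 0, 0])  # only rare variants
```

A spectrum proportional to 1/i gives a value close to zero, whereas an excess of rare variants, as expected around a recent sweep or after population growth, gives a negative value. That ambiguity is one reason the survey stresses the choice of demographic null model.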
Affiliation(s)
- Pavlos Pavlidis
  - Institute of Computer Science, Foundation for Research and Technology-Hellas, 70013 Crete, Greece
- Nikolaos Alachiotis
  - Institute of Computer Science, Foundation for Research and Technology-Hellas, 70013 Crete, Greece
12
Huber CD, DeGiorgio M, Hellmann I, Nielsen R. Detecting recent selective sweeps while controlling for mutation rate and background selection. Mol Ecol 2015; 25:142-56. [PMID: 26290347 PMCID: PMC5082542 DOI: 10.1111/mec.13351] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Received: 03/31/2015] [Revised: 07/31/2015] [Accepted: 08/17/2015] [Indexed: 12/19/2022]
Abstract
A composite likelihood ratio test implemented in the program sweepfinder is a commonly used method for scanning a genome for recent selective sweeps. sweepfinder uses information on the spatial pattern (along the chromosome) of the site frequency spectrum around the selected locus. To avoid confounding effects of background selection and variation in the mutation process along the genome, the method is typically applied only to sites that are variable within species. However, the power to detect and localize selective sweeps can be greatly improved if invariable sites are also included in the analysis. In the spirit of a Hudson–Kreitman–Aguadé test, we suggest adding fixed differences relative to an out‐group to account for variation in mutation rate, thereby facilitating more robust and powerful analyses. We also develop a method for including background selection, modelled as a local reduction in the effective population size. Using simulations, we show that these advances lead to a gain in power while maintaining robustness to mutation rate variation. Furthermore, the new method also provides more precise localization of the causative mutation than methods using the spatial pattern of segregating sites alone.
Affiliation(s)
- Christian D Huber
  - Max F. Perutz Laboratory, University of Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, University of Veterinary Medicine, Vienna, Austria
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, 621 Charles E. Young Drive South, Los Angeles, CA, 90095-1606, USA
- Michael DeGiorgio
  - Departments of Biology and Statistics, Pennsylvania State University, University Park, PA, USA
  - Institute for CyberScience, Pennsylvania State University, University Park, PA, USA
- Ines Hellmann
  - Department Biologie II, Ludwig-Maximilians-Universität München, Großhaderner Str. 2, 82152, Planegg-Martinsried, Germany
- Rasmus Nielsen
  - Departments of Integrative Biology and Statistics, University of California, Berkeley, CA, USA
13
Wollstein A, Stephan W. Inferring positive selection in humans from genomic data. Investig Genet 2015; 6:5. [PMID: 25834723 PMCID: PMC4381672 DOI: 10.1186/s13323-015-0023-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 11/21/2014] [Accepted: 02/23/2015] [Indexed: 01/06/2023]
Abstract
Adaptation can be described as an evolutionary process that leads to an adjustment of the phenotypes of a population to their environment. In the classical view, new mutations can introduce novel phenotypic features into a population that leave footprints in the genome after fixation, such as selective sweeps. Alternatively, existing genetic variants may become beneficial after an environmental change and increase in frequency. Although they may not reach fixation, they may cause a shift of the optimum of a phenotypic trait controlled by multiple loci. With the availability of polymorphism data from various organisms, including humans and chimpanzees, it has become possible to detect molecular evidence of adaptation and to estimate the strength and target of positive selection. In this review, we discuss the two competing models of adaptation and suitable approaches for detecting the footprints of positive selection on the molecular level.
Affiliation(s)
- Andreas Wollstein
  - Section of Evolutionary Biology, Department of Biology II, University of Munich, Großhaderner Str. 2, 82152 Planegg-Martinsried, Germany
- Wolfgang Stephan
  - Section of Evolutionary Biology, Department of Biology II, University of Munich, Großhaderner Str. 2, 82152 Planegg-Martinsried, Germany
14
Huber CD, Nordborg M, Hermisson J, Hellmann I. Keeping it local: evidence for positive selection in Swedish Arabidopsis thaliana. Mol Biol Evol 2014; 31:3026-39. [PMID: 25158800 PMCID: PMC4209139 DOI: 10.1093/molbev/msu247] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Indexed: 02/04/2023]
Abstract
Detecting positive selection in species with heterogeneous habitats and complex demography is notoriously difficult and prone to statistical biases. The model plant Arabidopsis thaliana exemplifies this problem: In spite of the large amounts of data, little evidence for classic selective sweeps has been found. Moreover, many aspects of the demography are unclear, which makes it hard to judge whether the few signals are indeed signs of selection, or false positives caused by demographic events. Here, we focus on Swedish A. thaliana and we find that the demography can be approximated as a two-population model. Careful analysis of the data shows that such a two-island model is characterized by a very old split time that significantly predates the last glacial maximum followed by secondary contact with strong migration. We evaluate selection based on this demography and find that this secondary contact model strongly affects the power to detect sweeps. Moreover, it affects the power differently for northern Sweden (more false positives) as compared with southern Sweden (more false negatives). However, even when the demographic history is accounted for, sweep signals in northern Sweden are stronger than in southern Sweden, with little or no positional overlap. Further simulations including the complex demography and selection confirm that this is not compatible with global selection acting on both populations, and thus can be taken as evidence for local selection within subpopulations of Swedish A. thaliana. This study demonstrates the necessity of combining demographic analyses and sweep scans for the detection of selection, particularly when selection acts predominantly locally.
Affiliation(s)
- Christian D Huber
- Mathematics and BioSciences Group, Max F. Perutz Laboratories, University of Vienna, Vienna, Austria; Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
- Magnus Nordborg
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna Biocenter, Vienna, Austria
- Joachim Hermisson
- Mathematics and BioSciences Group, Max F. Perutz Laboratories, University of Vienna, Vienna, Austria; Department of Mathematics, University of Vienna, Vienna, Austria
- Ines Hellmann
- Department of Human Genetics & Anthropology, LMU, Munich, Germany
15
Fine-mapping and selective sweep analysis of QTL for cold tolerance in Drosophila melanogaster. G3-GENES GENOMES GENETICS 2014; 4:1635-45. [PMID: 24970882 PMCID: PMC4169155 DOI: 10.1534/g3.114.012757]
Abstract
There is a growing interest in investigating the relationship between genes with signatures of natural selection and genes identified in QTL mapping studies using combined population and quantitative genetics approaches. We dissected an X-linked interval of 6.2 Mb, which contains two QTL underlying variation in chill coma recovery time (CCRT) in Drosophila melanogaster from temperate (European) and tropical (African) regions. This resulted in two relatively small regions of 131 kb and 124 kb. The latter one co-localizes with a very strong selective sweep in the European population. We examined the genes within and near the sweep region individually using gene expression analysis and P-element insertion lines. Of the genes overlapping with the sweep, none appears to be related to CCRT. However, we have identified a new candidate gene of CCRT, brinker, which is located just outside the sweep region and is inducible by cold stress. We discuss these results in light of recent population genetics theories on quantitative traits.
16
Abstract
The molecular signature of selection depends strongly on whether new mutations are immediately favorable and sweep to fixation (hard sweeps) as opposed to when selection acts on segregating variation (soft sweeps). The prediction of reduced sequence variation around selected polymorphisms is much stronger for hard than soft sweeps, particularly when considering quantitative traits where sweeps are likely to be incomplete. Here, we directly investigate the genomic signal of soft sweeps within an artificial selection experiment on Mimulus guttatus. We first develop a statistical method based on Fisher’s angular transformation of allele frequencies to identify selected loci. Application of this method identifies about 400 significant windows, but no fixed differences between phenotypically divergent populations. With two notable exceptions, we find a modest average effect of partial sweeps on the amount of molecular variation. The first exception is a polymorphic inversion on chromosome 6. The increase of the derived haplotype has a broad genomic effect due to recombination suppression coupled with substantial initial haplotype structure within the population. Second, we found significant increases in nucleotide variation around selected loci in the population evolving larger flowers. This suggests that “high” alleles for flower size were initially less frequent than “low” alleles. This result is consistent with prior studies of M. guttatus and illustrates how molecular evolution can depend on the allele frequency spectrum at quantitative trait loci.
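The angular transformation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook form z = 2·arcsin(√p), whose sampling variance is roughly 1/n for n sampled gene copies, and it accounts for binomial sampling only. The function names and the example numbers are hypothetical; the statistic actually used in the study may differ, for instance by also modelling drift or sequencing depth.

```python
import math


def angular(p: float) -> float:
    """Fisher's angular transformation of an allele frequency p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p))


def divergence_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """Standardized difference between two transformed allele frequencies.

    n1 and n2 are the numbers of sampled gene copies. Assumption: the only
    source of variance is binomial sampling, so Var(z1 - z2) ~ 1/n1 + 1/n2.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    return (angular(p1) - angular(p2)) / math.sqrt(1.0 / n1 + 1.0 / n2)


if __name__ == "__main__":
    # Hypothetical window: allele at 70% in one population, 30% in the other,
    # with 40 gene copies sampled from each.
    print(divergence_z(0.70, 40, 0.30, 40))
```

The appeal of the transformation is that the variance of the transformed frequency is approximately independent of p, so a single threshold can be applied across loci with different starting frequencies.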
Affiliation(s)
- John K Kelly
- Department of Ecology and Evolutionary Biology, University of Kansas.
17
|
Ezawa K, Landan G, Graur D. Detecting negative selection on recurrent mutations using gene genealogy. BMC Genet 2013; 14:37. [PMID: 23651527 PMCID: PMC3661350 DOI: 10.1186/1471-2156-14-37] [Received: 09/20/2012] [Accepted: 04/13/2013]
Abstract
BACKGROUND: Whether or not a mutant allele in a population is under selection is an important issue in population genetics, and various neutrality tests have been invented so far to detect selection. However, detection of negative selection has been notoriously difficult, partly because negatively selected alleles are usually rare in the population and have little impact on either population dynamics or the shape of the gene genealogy. Recently, through studies of genetic disorders and genome-wide analyses, many structural variations were shown to occur recurrently in the population. Such "recurrent mutations" might be revealed as deleterious by exploiting the signal of negative selection in the gene genealogy enhanced by their recurrence.
RESULTS: Motivated by the above idea, we devised two new test statistics. One is the total number of mutants at a recurrently mutating locus among sampled sequences, which is tested conditionally on the number of forward mutations mapped on the sequence genealogy. The other is the size of the most common class of identical-by-descent mutants in the sample, again tested conditionally on the number of forward mutations mapped on the sequence genealogy. To examine the performance of these two tests, we simulated recurrently mutated loci each flanked by sites with neutral single nucleotide polymorphisms (SNPs), with no recombination. Using neutral recurrent mutations as null models, we attempted to detect deleterious recurrent mutations. Our analyses demonstrated the high power of our new tests under constant population size, as well as their moderate power to detect selection in expanding populations. We also devised a new maximum parsimony algorithm that, given the states of the sampled sequences at a recurrently mutating locus and an incompletely resolved genealogy, enumerates mutation histories with a minimum number of mutations while partially resolving genealogical relationships when necessary.
CONCLUSIONS: With their considerably high power to detect negative selection, our new neutrality tests may open new avenues for dealing with the population genetics of recurrent mutations as well as help identify some types of genetic disorders that may have escaped identification by currently existing methods.
Affiliation(s)
- Kiyoshi Ezawa
- Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001, USA
- Present address: Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
- Giddy Landan
- Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001, USA
- Present address: Institute of Genomic Microbiology, Heinrich-Heine University Düsseldorf, Universitätsstr. 1, Düsseldorf 40225, Germany
- Dan Graur
- Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001, USA
18
Cutter AD, Jovelin R, Dey A. Molecular hyperdiversity and evolution in very large populations. Mol Ecol 2013; 22:2074-95. [PMID: 23506466 PMCID: PMC4065115 DOI: 10.1111/mec.12281] [Received: 10/26/2012] [Revised: 01/24/2013] [Accepted: 01/29/2013]
Abstract
The genomic density of sequence polymorphisms critically affects the sensitivity of inferences about ongoing sequence evolution, function and demographic history. Most animal and plant genomes have relatively low densities of polymorphisms, but some species are hyperdiverse with neutral nucleotide heterozygosity exceeding 5%. Eukaryotes with extremely large populations, mimicking bacterial and viral populations, present novel opportunities for studying molecular evolution in sexually reproducing taxa with complex development. In particular, hyperdiverse species can help answer controversial questions about the evolution of genome complexity, the limits of natural selection, modes of adaptation and subtleties of the mutation process. However, such systems have some inherent complications and here we identify topics in need of theoretical developments. Close relatives of the model organisms Caenorhabditis elegans and Drosophila melanogaster provide known examples of hyperdiverse eukaryotes, encouraging functional dissection of resulting molecular evolutionary patterns. We recommend how best to exploit hyperdiverse populations for analysis, for example, in quantifying the impact of noncrossover recombination in genomes and for determining the identity and micro-evolutionary selective pressures on noncoding regulatory elements.
Affiliation(s)
- Asher D Cutter
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada.
19
Hereward JP, Walter GH, Debarro PJ, Lowe AJ, Riginos C. Gene flow in the green mirid, Creontiades dilutus (Hemiptera: Miridae), across arid and agricultural environments with different host plant species. Ecol Evol 2013; 3:807-21. [PMID: 23610626 PMCID: PMC3631396 DOI: 10.1002/ece3.510] [Received: 10/16/2012] [Revised: 01/17/2013] [Accepted: 01/21/2013]
Abstract
Creontiades dilutus (Stål), the green mirid, is a polyphagous herbivorous insect endemic to Australia. Although common in the arid interior of Australia and found on several native host plants that are spatially and temporally ephemeral, green mirids also reach pest levels on several crops in eastern Australia. These host-associated dynamics, distributed across a large geographic area, raise questions as to whether (1) seasonal fluctuations in population size result in genetic bottlenecks and drift, (2) arid and agricultural populations are genetically isolated, and (3) the use of different host plants results in genetic differentiation. We sequenced a mitochondrial COI fragment from individuals collected over 24 years and screened microsatellite variation from 32 populations across two seasons. The predominance of a single COI haplotype and a negative Tajima's D in samples from 2006/2007 fit with a population expansion model. In the older collections (1983 and 1993), a different haplotype is most prevalent, consistent with successive population contractions and expansions. Microsatellite data indicate recent migration between inland sites and coastal crops and admixture in several populations. Altogether, the data suggest that long-distance dispersal occurs between arid and agricultural regions, and this, together with fluctuations in population size, leads to temporally dynamic patterns of genetic differentiation. Host-associated differentiation is evident between mirids sampled from plants in the genus Cullen (Fabaceae), the primary host, and alternative host plant species growing nearby in arid regions. Our results highlight the importance of jointly assessing natural and agricultural environments in understanding the ecology of pest insects.
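To make concrete why an excess of rare variants (as expected after a population expansion) yields a negative Tajima's D, here is a minimal sketch of the textbook statistic. It assumes biallelic sites coded as '0'/'1' strings; the function name and the toy haplotypes are hypothetical and are not taken from the study, and this is not a reconstruction of the authors' analysis pipeline.

```python
import math
from typing import Optional, Sequence


def tajimas_d(haplotypes: Sequence[str]) -> Optional[float]:
    """Tajima's D for aligned biallelic haplotypes coded as '0'/'1' strings.

    Returns None when D is undefined: fewer than 4 sequences (the variance
    term vanishes for smaller samples) or no segregating sites.
    """
    n = len(haplotypes)
    if n < 4:
        return None
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")

    seg_sites = 0          # S: number of segregating sites
    pi = 0.0               # mean number of pairwise differences
    pairs = n * (n - 1) / 2
    for site in range(length):
        k = sum(h[site] == "1" for h in haplotypes)
        if 0 < k < n:
            seg_sites += 1
            pi += k * (n - k) / pairs
    if seg_sites == 0:
        return None

    # Constants of the standard variance approximation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return None
    return (pi - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Toy data: every variant is a singleton, so D comes out negative.
    singletons = ["100000", "010000", "001000", "000100", "000010", "000001"]
    print(tajimas_d(singletons))
```

In the toy input every variant is carried by a single sequence, so pairwise diversity falls below the value expected from the number of segregating sites and D is negative; variants at intermediate frequency push D in the positive direction.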
Affiliation(s)
- J P Hereward
- School of Biological Sciences, The University of Queensland, Brisbane, Qld, 4072, Australia; Cotton Catchment Communities CRC, Australian Cotton Research Institute, Locked Mail Bag 1001, Narrabri, NSW, 2390, Australia
20
Chevin LM, Gallet R, Gomulkiewicz R, Holt RD, Fellous S. Phenotypic plasticity in evolutionary rescue experiments. Philos Trans R Soc Lond B Biol Sci 2013; 368:20120089. [PMID: 23209170 PMCID: PMC3538455 DOI: 10.1098/rstb.2012.0089]
Abstract
Population persistence in a new and stressful environment can be influenced by the plastic phenotypic responses of individuals to this environment, and by the genetic evolution of plasticity itself. This process has recently been investigated theoretically, but testing the quantitative predictions in the wild is challenging because (i) there are usually not enough population replicates to deal with the stochasticity of the evolutionary process, (ii) environmental conditions are not controlled, and (iii) measuring selection and the inheritance of traits affecting fitness is difficult in natural populations. As an alternative, predictions from theory can be tested in the laboratory with controlled experiments. To illustrate the feasibility of this approach, we briefly review the literature on the experimental evolution of plasticity, and on evolutionary rescue in the laboratory, paying particular attention to differences and similarities between microbes and multicellular eukaryotes. We then highlight a set of questions that could be addressed using this framework, which would enable testing the robustness of theoretical predictions, and provide new insights into areas that have received little theoretical attention to date.
Affiliation(s)
- Luis-Miguel Chevin
- Centre d'Ecologie Fonctionnelle et Evolutive (UMR 5175), 1919 route de Mende, 34293 Montpellier Cedex 5, France.
21
Duchen P, Zivkovic D, Hutter S, Stephan W, Laurent S. Demographic inference reveals African and European admixture in the North American Drosophila melanogaster population. Genetics 2013; 193:291-301. [PMID: 23150605 PMCID: PMC3527251 DOI: 10.1534/genetics.112.145912] [Received: 09/14/2012] [Accepted: 10/30/2012]
Abstract
Drosophila melanogaster spread from sub-Saharan Africa to the rest of the world, colonizing new environments. Here, we modeled the joint demography of African (Zimbabwe), European (The Netherlands), and North American (North Carolina) populations using an approximate Bayesian computation (ABC) approach. By testing different models (including scenarios with continuous migration), we found that admixture between Africa and Europe most likely generated the North American population, with an estimated proportion of African ancestry of 15%. We also revisited the demography of the ancestral population (Africa) and found, in contrast to previous work, that a bottleneck fits the history of the population of Zimbabwe better than expansion. Finally, we compared the site-frequency spectrum of the ancestral population to analytical predictions under the estimated bottleneck model.
Affiliation(s)
- Pablo Duchen
- Evolutionary Biology, University of Munich, 82152 Planegg-Martinsried, Germany.
22
Inference of population structure of Leishmania donovani strains isolated from different Ethiopian visceral leishmaniasis endemic areas. PLoS Negl Trop Dis 2010; 4:e889. [PMID: 21103373 PMCID: PMC2982834 DOI: 10.1371/journal.pntd.0000889] [Received: 05/26/2010] [Accepted: 10/21/2010]
Abstract
Background: Parasites' evolution in response to parasite-targeted control strategies, such as vaccines and drugs, is known to be influenced by their population genetic structure. The aim of this study was to describe the population structure of Ethiopian strains of Leishmania donovani derived from different areas endemic for visceral leishmaniasis (VL) as a prerequisite for the design of effective control strategies against the disease.
Methodology/Principal Findings: Sixty-three strains of L. donovani newly isolated from VL cases in the two main Ethiopian foci, in the north (NE) and south (SE) of the country, were investigated using 14 highly polymorphic microsatellite markers. The microsatellite profiles of 60 previously analysed L. donovani strains from Sudan, Kenya and India were included for comparison. Multilocus microsatellite typing placed strains from SE and Kenya (n = 30) in one population and strains from NE and Sudan (n = 65) in another. These two East African populations corresponded to the areas of distribution of two different sand fly vectors. In NE and Sudan, Phlebotomus orientalis has been implicated in transmitting the parasites, and in SE and Kenya, P. martini. The genetic differences between parasites from NE and SE are also congruent with some phenotypic differences. Each of these populations was further divided into two subpopulations. Interestingly, in one of the subpopulations of the NE population we observed a predominance of strains isolated from HIV-VL co-infected patients and of strains with putative hybrid genotypes. Furthermore, high inbreeding irreconcilable with strict clonal reproduction was found for strains from SE and Kenya, indicating a mixed-mating system.
Conclusions/Significance: This study identified a hierarchical population structure of L. donovani in East Africa. The existence of two main, genetically and geographically separated, populations could reflect different parasite-vector associations, different ecologies and varying host backgrounds and should be further investigated.
In the Horn of Africa, visceral leishmaniasis, caused by protozoan parasites of the Leishmania donovani complex, continues to pose a major health problem affecting the poorest of the poor. Population genetic studies are crucial for the development of drugs and vaccines against microorganisms. However, our knowledge about the population structure of L. donovani parasites in this region is still very limited. Using a highly discriminatory multilocus microsatellite typing approach, we found a remarkably high genetic diversity among the East African strains of L. donovani studied, which grouped into two genetically and geographically distinct populations comprising parasites from SE and Kenya, and those from NE and Sudan. Despite Leishmania being widely regarded as a clonal organism, our results suggest a possible co-existence of clonal and sexually reproducing strains of L. donovani from SE. The information obtained by the present study is helpful for future design of parasite-targeted control measures in East Africa.
23
Stephan W. Genetic hitchhiking versus background selection: the controversy and its implications. Philos Trans R Soc Lond B Biol Sci 2010; 365:1245-53. [PMID: 20308100 DOI: 10.1098/rstb.2009.0278]
Abstract
The controversy on the relative importance of background selection (BGS; against deleterious mutations) and genetic hitchhiking (associated with positive directional selection) in explaining patterns of nucleotide variation in natural populations stimulated research activities for almost a decade. Despite efforts from many theorists and empiricists, fundamental questions are still open, in particular, for the population genetics of regions of reduced recombination. On the other hand, the development of the BGS and hitchhiking models and the long struggle to distinguish them, all of which seem to be a purely academic exercise, led to quite practical advances that are useful for the identification of genes involved in adaptation and domestication.
Affiliation(s)
- Wolfgang Stephan
- Section of Evolutionary Biology, Department of Biology II, Ludwig-Maximilians University Munich, Grosshaderner Strasse 2, 82152 Planegg, Germany.
24
Linnen CR, Hoekstra HE. Measuring natural selection on genotypes and phenotypes in the wild. COLD SPRING HARBOR SYMPOSIA ON QUANTITATIVE BIOLOGY 2010; 74:155-68. [PMID: 20413707 DOI: 10.1101/sqb.2009.74.045]
Abstract
A complete understanding of the role of natural selection in driving evolutionary change requires accurate estimates of the strength of selection acting in the wild. Accordingly, several approaches using a variety of data (including patterns of DNA variability, spatial and temporal changes in allele frequencies, and fitness estimates) have been developed to identify and quantify selection on both genotypes and phenotypes. Here, we review these approaches, drawing on both recent and classic examples to illustrate their utility and limitations. We then argue that by combining estimates of selection at multiple levels (from individual mutations to phenotypes) and at multiple timescales (from ecological to evolutionary) with experiments that demonstrate why traits are under selection, we can gain a much more complete picture of the adaptive process.
Affiliation(s)
- C R Linnen
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
25
Searching for footprints of positive selection in whole-genome SNP data from nonequilibrium populations. Genetics 2010; 185:907-22. [PMID: 20407129 DOI: 10.1534/genetics.110.116459]
Abstract
A major goal of population genomics is to reconstruct the history of natural populations and to infer the neutral and selective scenarios that can explain the present-day polymorphism patterns. However, the separation between neutral and selective hypotheses has proven hard, mainly because both may predict similar patterns in the genome. This study focuses on the development of methods that can be used to distinguish neutral from selective hypotheses in equilibrium and nonequilibrium populations. These methods utilize a combination of statistics on the basis of the site frequency spectrum (SFS) and linkage disequilibrium (LD). We investigate the patterns of genetic variation along recombining chromosomes using a multitude of comparisons between neutral and selective hypotheses, such as selection or neutrality in equilibrium and nonequilibrium populations and recurrent selection models. We perform hypothesis testing using the classical P-value approach, but we also introduce methods from the machine-learning field. We demonstrate that the combination of SFS- and LD-based statistics increases the power to detect recent positive selection in populations that have experienced past demographic changes.
26
Hohenlohe PA, Bassham S, Etter PD, Stiffler N, Johnson EA, Cresko WA. Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags. PLoS Genet 2010; 6:e1000862. [PMID: 20195501 PMCID: PMC2829049 DOI: 10.1371/journal.pgen.1000862] [Received: 10/20/2009] [Accepted: 01/28/2010]
Abstract
Next-generation sequencing technology provides novel opportunities for gathering genome-scale sequence data in natural populations, laying the empirical foundation for the evolving field of population genomics. Here we conducted a genome scan of nucleotide diversity and differentiation in natural populations of threespine stickleback (Gasterosteus aculeatus). We used Illumina-sequenced RAD tags to identify and type over 45,000 single nucleotide polymorphisms (SNPs) in each of 100 individuals from two oceanic and three freshwater populations. Overall estimates of genetic diversity and differentiation among populations confirm the biogeographic hypothesis that large panmictic oceanic populations have repeatedly given rise to phenotypically divergent freshwater populations. Genomic regions exhibiting signatures of both balancing and divergent selection were remarkably consistent across multiple, independently derived populations, indicating that replicate parallel phenotypic evolution in stickleback may be occurring through extensive, parallel genetic evolution at a genome-wide scale. Some of these genomic regions co-localize with previously identified QTL for stickleback phenotypic variation identified using laboratory mapping crosses. In addition, we have identified several novel regions showing parallel differentiation across independent populations. Annotation of these regions revealed numerous genes that are candidates for stickleback phenotypic evolution and will form the basis of future genetic analyses in this and other organisms. This study represents the first high-density SNP-based genome scan of genetic diversity and differentiation for populations of threespine stickleback in the wild. 
These data illustrate the complementary nature of laboratory crosses and population genomic scans by confirming the adaptive significance of previously identified genomic regions, elucidating the particular evolutionary and demographic history of such regions in natural populations, and identifying new genomic regions and candidate genes of evolutionary significance.
Affiliation(s)
- Paul A. Hohenlohe
- Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, Oregon, United States of America
- Susan Bassham
- Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, Oregon, United States of America
- Paul D. Etter
- Institute of Molecular Biology, University of Oregon, Eugene, Oregon, United States of America
- Nicholas Stiffler
- Genomics Core Facility, University of Oregon, Eugene, Oregon, United States of America
- Eric A. Johnson
- Institute of Molecular Biology, University of Oregon, Eugene, Oregon, United States of America
- William A. Cresko
- Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, Oregon, United States of America
28
González J, Macpherson JM, Petrov DA. A recent adaptive transposable element insertion near highly conserved developmental loci in Drosophila melanogaster. Mol Biol Evol 2009; 26:1949-61. [PMID: 19458110 DOI: 10.1093/molbev/msp107]
Abstract
A recent genomewide screen identified 13 transposable elements that are likely to have been adaptive during or after the spread of Drosophila melanogaster out of Africa. One of these insertions, Bari-Juvenile hormone epoxy hydrolase (Bari-Jheh), was associated with the selective sweep of its flanking neutral variation and with reduction of expression of one of its neighboring genes: Jheh3. Here, we provide further evidence that Bari-Jheh insertion is adaptive. We delimit the extent of the selective sweep and show that Bari-Jheh is the only mutation linked to the sweep. Bari-Jheh also lowers the expression of its other flanking gene, Jheh2. Subtle consequences of Bari-Jheh insertion on life-history traits are consistent with the effects of reduced expression of the Jheh genes. Finally, we analyze molecular evolution of Jheh genes in both the long- and the short-term and conclude that Bari-Jheh appears to be a very rare adaptive event in the history of these genes. We discuss the implications of these findings for the detection and understanding of adaptation.