1
Sudbrack V, Mullon C. Fixation times of de novo and standing beneficial variants in subdivided populations. Genetics 2024; 227:iyae043. [PMID: 38527860 DOI: 10.1093/genetics/iyae043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 01/17/2024] [Accepted: 03/11/2024] [Indexed: 03/27/2024]
Abstract
The rate at which beneficial alleles fix in a population depends on the probability of and time to fixation of such alleles. Both of these quantities can be significantly impacted by population subdivision and limited gene flow. Here, we investigate how limited dispersal influences the rate of fixation of beneficial de novo mutations, as well as fixation time from standing genetic variation. We investigate this for a population structured according to the island model of dispersal allowing us to use the diffusion approximation, which we complement with simulations. We find that fixation may take on average fewer generations under limited dispersal than under panmixia when selection is moderate. This is especially the case if adaptation occurs from de novo recessive mutations, and dispersal is not too limited (such that approximately FST<0.2). The reason is that mildly limited dispersal leads to only a moderate increase in effective population size (which slows down fixation), but is sufficient to cause a relative excess of homozygosity due to inbreeding, thereby exposing rare recessive alleles to selection (which accelerates fixation). We also explore the effect of metapopulation dynamics through local extinction followed by recolonization, finding that such dynamics always accelerate fixation from standing genetic variation, while de novo mutations show faster fixation interspersed with longer waiting times. Finally, we discuss the implications of our results for the detection of sweeps, suggesting that limited dispersal mitigates the expected differences between the genetic signatures of sweeps involving recessive and dominant alleles.
Affiliation(s)
- Vitor Sudbrack
  - Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Vaud, Switzerland
- Charles Mullon
  - Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Vaud, Switzerland

2
van der Valk T, Jensen A, Caillaud D, Guschanski K. Comparative genomic analyses provide new insights into evolutionary history and conservation genomics of gorillas. BMC Ecol Evol 2024; 24:14. [PMID: 38273244 PMCID: PMC10811819 DOI: 10.1186/s12862-023-02195-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 12/22/2023] [Indexed: 01/27/2024]
Abstract
Genome sequencing is a powerful tool to understand species evolutionary history, uncover genes under selection, which could be informative of local adaptation, and infer measures of genetic diversity, inbreeding and mutational load that could be used to inform conservation efforts. Gorillas, critically endangered primates, have received considerable attention and with the recently sequenced Bwindi mountain gorilla population, genomic data is now available from all gorilla subspecies and both mountain gorilla populations. Here, we reanalysed this rich dataset with a focus on evolutionary history, local adaptation and genomic parameters relevant for conservation. We estimate a recent split between western and eastern gorillas of 150,000-180,000 years ago, with gene flow around 20,000 years ago, primarily between the Cross River and Grauer's gorilla subspecies. This gene flow event likely obscures evolutionary relationships within eastern gorillas: after excluding putatively introgressed genomic regions, we uncover a sister relationship between Virunga mountain gorillas and Grauer's gorillas to the exclusion of Bwindi mountain gorillas. This makes mountain gorillas paraphyletic. Eastern gorillas are less genetically diverse and more inbred than western gorillas, yet we detected lower genetic load in the eastern species. Analyses of indels fit remarkably well with differences in genetic diversity across gorilla taxa as recovered with nucleotide diversity measures. We also identified genes under selection and unique gene variants specific for each gorilla subspecies, encoding, among others, traits involved in immunity, diet, muscular development, hair morphology and behavior. The presence of this functional variation suggests that the subspecies may be locally adapted. 
In conclusion, using extensive genomic resources we provide a comprehensive overview of gorilla genomic diversity, including a so-far understudied Bwindi mountain gorilla population, identify putative genes involved in local adaptation, and detect population-specific gene flow across gorilla species.
Affiliation(s)
- Tom van der Valk
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
  - SciLifeLab, Stockholm, Sweden
  - Department of Zoology, Stockholm University, Stockholm, Sweden
- Axel Jensen
  - Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
- Damien Caillaud
  - Department of Anthropology, University of California, Davis, Davis, California, USA
- Katerina Guschanski
  - SciLifeLab, Stockholm, Sweden
  - Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
  - Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK

3
Galić V, Anđelković V, Kravić N, Grčić N, Ledenčan T, Jambrović A, Zdunić Z, Nicolas S, Charcosset A, Šatović Z, Šimić D. Genetic diversity and selection signatures in a gene bank panel of maize inbred lines from Southeast Europe compared with two West European panels. BMC Plant Biol 2023; 23:315. [PMID: 37316827 PMCID: PMC10265872 DOI: 10.1186/s12870-023-04336-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 06/07/2023] [Indexed: 06/16/2023]
Abstract
Southeast Europe (SEE) is a very important maize-growing region, comparable to the Corn belt region of the United States, with similar dent germplasm (dent by dent hybrids). Historically, this region has undergone several genetic material swaps, following the trends in the US, with one of the most significant swaps related to US aid programs after WWII. The imported accessions used to make double-cross hybrids were also mixed with previously adapted germplasm originating from several more distant OPVs, supporting the transition to single cross-breeding. Many of these materials were deposited at the Maize Gene Bank of the Maize Research Institute Zemun Polje (MRIZP) between the 1960s and 1980s. A part of this Gene Bank (572 inbreds) was genotyped with Affymetrix Axiom Maize Genotyping Array with 616,201 polymorphic variants. Data were merged with two other genotyping datasets with mostly European flint (TUM dataset) and dent (DROPS dataset) germplasm. The final pan-European dataset consisted of 974 inbreds and 460,243 markers. Admixture analysis showed seven ancestral populations representing European flint, B73/B14, Lancaster, B37, Wf9/Oh07, A374, and Iodent pools. Subpanel of inbreds with SEE origin showed a lack of Iodent germplasm, marking its historical context. Several signatures of selection were identified at chromosomes 1, 3, 6, 7, 8, 9, and 10. The regions under selection were mined for protein-coding genes and were used for gene ontology (GO) analysis, showing a highly significant overrepresentation of genes involved in response to stress. Our results suggest the accumulation of favorable allelic diversity, especially in the context of changing climate in the genetic resources of SEE.
Affiliation(s)
- Vlatko Galić
  - Agricultural Institute Osijek, Južno predgrađe 17, Osijek, HR31000, Croatia
  - Centre of Excellence for Biodiversity and Molecular Plant Breeding (CroP-BioDiv), Svetošimunska cesta 25, Zagreb, HR10000, Croatia
- Violeta Anđelković
  - Maize Research Institute Zemun Polje, Slobodana Bajića 1, Belgrade, 11185, Serbia
- Natalija Kravić
  - Maize Research Institute Zemun Polje, Slobodana Bajića 1, Belgrade, 11185, Serbia
- Nikola Grčić
  - Maize Research Institute Zemun Polje, Slobodana Bajića 1, Belgrade, 11185, Serbia
- Tatjana Ledenčan
  - Agricultural Institute Osijek, Južno predgrađe 17, Osijek, HR31000, Croatia
- Antun Jambrović
  - Agricultural Institute Osijek, Južno predgrađe 17, Osijek, HR31000, Croatia
  - Centre of Excellence for Biodiversity and Molecular Plant Breeding (CroP-BioDiv), Svetošimunska cesta 25, Zagreb, HR10000, Croatia
- Zvonimir Zdunić
  - Agricultural Institute Osijek, Južno predgrađe 17, Osijek, HR31000, Croatia
  - Centre of Excellence for Biodiversity and Molecular Plant Breeding (CroP-BioDiv), Svetošimunska cesta 25, Zagreb, HR10000, Croatia
- Stéphane Nicolas
  - GQE ‑ Le Moulon, INRAE, Univ. Paris‑Sud, CNRS, AgroParisTech, Université Paris-Saclay, Gif‑sur‑Yvette, 91190, France
- Alain Charcosset
  - GQE ‑ Le Moulon, INRAE, Univ. Paris‑Sud, CNRS, AgroParisTech, Université Paris-Saclay, Gif‑sur‑Yvette, 91190, France
- Zlatko Šatović
  - Centre of Excellence for Biodiversity and Molecular Plant Breeding (CroP-BioDiv), Svetošimunska cesta 25, Zagreb, HR10000, Croatia
  - Faculty of Agriculture, University of Zagreb, Svetošimunska cesta 25, Zagreb, HR10000, Croatia
- Domagoj Šimić
  - Agricultural Institute Osijek, Južno predgrađe 17, Osijek, HR31000, Croatia
  - Centre of Excellence for Biodiversity and Molecular Plant Breeding (CroP-BioDiv), Svetošimunska cesta 25, Zagreb, HR10000, Croatia

4
Higgins J, Santos B, Khanh TD, Trung KH, Duong TD, Doai NTP, Hall A, Dyer S, Ham LH, Caccamo M, De Vega J. Genomic regions and candidate genes selected during the breeding of rice in Vietnam. Evol Appl 2022; 15:1141-1161. [PMID: 35899250 PMCID: PMC9309459 DOI: 10.1111/eva.13433] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/16/2021] [Revised: 04/28/2022] [Accepted: 05/25/2022] [Indexed: 11/29/2022]
Abstract
Vietnam harnesses a rich diversity of rice landraces adapted to a range of conditions, which constitute a largely untapped source of diversity for the continuous improvement of cultivars. We previously identified a strong population structure in Vietnamese rice, which is captured in five Indica and four Japonica subpopulations, including an outlying Indica‐5 group. Here, we leveraged that strong differentiation and 672 native rice genomes to identify genomic regions and genes putatively selected during the breeding of rice in Vietnam. We identified significant distorted patterns in allele frequency (XP‐CLR) and population differentiation scores (FST) resulting from differential selective pressures between native subpopulations, and later annotated them with QTLs previously identified by GWAS in the same panel. We particularly focussed on the outlying Indica‐5 subpopulation because of its likely novelty and differential evolution, where we annotated 52 selected regions, which represented 8.1% of the rice genome. We annotated the 4576 genes in these regions and selected 65 candidate genes as promising breeding targets, several of which harboured alleles with nonsynonymous substitutions. Our results highlight genomic differences between traditional Vietnamese landraces, which are likely the product of adaption to multiple environmental conditions and regional culinary preferences in a very diverse country. We also verified the applicability of this genome scanning approach to identify potential regions harbouring novel loci and alleles to breed a new generation of sustainable and resilient rice.
Affiliation(s)
- Tran Dang Khanh
  - Agriculture Genetics Institute (AGI), Hanoi, Vietnam
  - Vietnam National University of Agriculture, Hanoi, Vietnam
- Anthony Hall
  - Earlham Institute, Norwich Research Park, Norwich, UK
- Le Huy Ham
  - Agriculture Genetics Institute (AGI), Hanoi, Vietnam
- Jose De Vega
  - Earlham Institute, Norwich Research Park, Norwich, UK

5
Klassmann A, Gautier M. Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data. PLoS One 2022; 17:e0262024. [PMID: 35041674 PMCID: PMC8765611 DOI: 10.1371/journal.pone.0262024] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/28/2021] [Accepted: 12/15/2021] [Indexed: 12/19/2022]
Abstract
Analysis of population genetic data often includes a search for genomic regions with signs of recent positive selection. One of such approaches involves the concept of extended haplotype homozygosity (EHH) and its associated statistics. These statistics typically require phased haplotypes, and some of them necessitate polarized variants. Here, we unify and extend previously proposed modifications to loosen these requirements. We compare the modified versions with the original ones by measuring the false discovery rate in simulated whole-genome scans and by quantifying the overlap of inferred candidate regions in empirical data. We find that phasing information is indispensable for accurate estimation of within-population statistics (for all but very large samples) and of cross-population statistics for small samples. Ancestry information, in contrast, is of lesser importance for both types of statistic. Our publicly available R package rehh incorporates the modified statistics presented here.
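For readers unfamiliar with the statistic named in this abstract, the following is a minimal sketch of the classical EHH definition for phased haplotypes: the probability that two chromosomes drawn at random from those carrying the same core allele are identical over the whole interval between the focal marker and a second marker. The function name `ehh` and its arguments are hypothetical and chosen for illustration; this is not the `rehh` API, and it does not implement the unphased or unpolarized modifications the paper describes.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, focal, upto):
    """Classical EHH between marker index `focal` and marker index `upto`.

    `haplotypes` is a list of equal-length sequences (e.g. strings of '0'/'1'),
    assumed to all carry the same core allele at `focal`.
    Returns the fraction of chromosome pairs that are identical at every
    marker in the closed interval spanned by `focal` and `upto`.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("EHH needs at least two chromosomes")
    lo, hi = sorted((focal, upto))
    # Group chromosomes by their allele string over the interval.
    groups = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    # Pairs within the same group are homozygous over the interval.
    identical_pairs = sum(comb(k, 2) for k in groups.values())
    return identical_pairs / comb(n, 2)


# Four chromosomes that all carry allele '1' at marker index 1.
sample = ["0101", "0101", "0111", "0100"]
decay = [ehh(sample, 1, x) for x in (1, 2, 3)]
```

In this toy sample EHH starts at 1 at the focal marker and decays as the interval is extended, which is the pattern that EHH-based scans summarize.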
Affiliation(s)
- Mathieu Gautier
  - CBGP, Univ Montpellier, CIRAD, INRAE, IRD, Institut Agro, Montpellier, France

6
Kaushik S, Jain K. Time to fixation in changing environments. Genetics 2021; 219:6369518. [PMID: 34740251 DOI: 10.1093/genetics/iyab148] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/22/2021] [Accepted: 09/02/2021] [Indexed: 01/18/2023]
Abstract
Although many experimental and theoretical studies on natural selection have been carried out in a constant environment, as natural environments typically vary in time, it is important to ask if and how the results of these investigations are affected by a changing environment. Here, we study the properties of the conditional fixation time defined as the time to fixation of a new mutant that is destined to fix in a finite, randomly mating diploid population with intermediate dominance that is evolving in a periodically changing environment. It is known that in a static environment, the conditional mean fixation time of a co-dominant beneficial mutant is equal to that of a deleterious mutant with the same magnitude of selection coefficient. We find that this symmetry is not preserved, even when the environment is changing slowly. More generally, we find that the conditional mean fixation time of an initially beneficial mutant in a slowly changing environment depends weakly on the dominance coefficient and remains close to the corresponding result in the static environment. However, for an initially deleterious mutant under moderate and slowly varying selection, the fixation time differs substantially from that in a constant environment when the mutant is recessive. As fixation times are intimately related to the levels and patterns of genetic diversity, our results suggest that for beneficial sweeps, these quantities are only mildly affected by temporal variation in environment. In contrast, environmental change is likely to impact the patterns due to recessive deleterious sweeps strongly.
Affiliation(s)
- Sachin Kaushik
  - Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore 560064, India
- Kavita Jain
  - Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore 560064, India

7
Bisschop G, Lohse K, Setter D. Sweeps in time: leveraging the joint distribution of branch lengths. Genetics 2021; 219:iyab119. [PMID: 34849880 PMCID: PMC8633083 DOI: 10.1093/genetics/iyab119] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/11/2021] [Accepted: 07/10/2021] [Indexed: 11/14/2022]
Abstract
Current methods of identifying positively selected regions in the genome are limited in two key ways: the underlying models cannot account for the timing of adaptive events and the comparison between models of selective sweeps and sequence data is generally made via simple summaries of genetic diversity. Here, we develop a tractable method of describing the effect of positive selection on the genealogical histories in the surrounding genome, explicitly modeling both the timing and context of an adaptive event. In addition, our framework allows us to go beyond analyzing polymorphism data via the site frequency spectrum or summaries thereof and instead leverage information contained in patterns of linked variants. Tests on both simulations and a human data example, as well as a comparison to SweepFinder2, show that even with very small sample sizes, our analytic framework has higher power to identify old selective sweeps and to correctly infer both the time and strength of selection. Finally, we derived the marginal distribution of genealogical branch lengths at a locus affected by selection acting at a linked site. This provides a much-needed link between our analytic understanding of the effects of sweeps on sequence variation and recent advances in simulation and heuristic inference procedures that allow researchers to examine the sequence of genealogical histories along the genome.
Affiliation(s)
- Gertjan Bisschop
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK
- Konrad Lohse
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK
- Derek Setter
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK

8
Johri P, Charlesworth B, Howell EK, Lynch M, Jensen JD. Revisiting the Notion of Deleterious Sweeps. Genetics 2021; 219:6298596. [PMID: 34125884 PMCID: PMC9101445 DOI: 10.1093/genetics/iyab094] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/29/2021] [Accepted: 06/08/2021] [Indexed: 11/14/2022]
Abstract
It has previously been shown that, conditional on its fixation, the time to fixation of a semi-dominant deleterious autosomal mutation in a randomly mating population is the same as that of an advantageous mutation. This result implies that deleterious mutations could generate selective sweep-like effects. Although their fixation probabilities greatly differ, the much larger input of deleterious relative to beneficial mutations suggests that this phenomenon could be important. We here examine how the fixation of mildly deleterious mutations affects levels and patterns of polymorphism at linked sites - both in the presence and absence of interference amongst deleterious mutations - and how this class of sites may contribute to divergence between-populations and species. We find that, while deleterious fixations are unlikely to represent a significant proportion of outliers in polymorphism-based genomic scans within populations, minor shifts in the frequencies of deleterious mutations can influence the proportions of private variants and the value of FST after a recent population split. As sites subject to deleterious mutations are necessarily found in functional genomic regions, interpretations in terms of recurrent positive selection may require reconsideration.
Affiliation(s)
- Parul Johri
  - School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
- Emma K Howell
  - School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
- Michael Lynch
  - School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
  - Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, United States
- Jeffrey D Jensen
  - School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States

9
Zeng K, Charlesworth B, Hobolth A. Studying models of balancing selection using phase-type theory. Genetics 2021; 218:6237896. [PMID: 33871627 DOI: 10.1093/genetics/iyab055] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/01/2021] [Accepted: 03/25/2021] [Indexed: 11/15/2022]
Abstract
Balancing selection (BLS) is the evolutionary force that maintains high levels of genetic variability in many important genes. To further our understanding of its evolutionary significance, we analyze models with BLS acting on a biallelic locus: an equilibrium model with long-term BLS, a model with long-term BLS and recent changes in population size, and a model of recent BLS. Using phase-type theory, a mathematical tool for analyzing continuous time Markov chains with an absorbing state, we examine how BLS affects polymorphism patterns in linked neutral regions, as summarized by nucleotide diversity, the expected number of segregating sites, the site frequency spectrum, and the level of linkage disequilibrium (LD). Long-term BLS affects polymorphism patterns in a relatively small genomic neighborhood, and such selection targets are easier to detect when the equilibrium frequencies of the selected variants are close to 50%, or when there has been a population size reduction. For a new mutation subject to BLS, its initial increase in frequency in the population causes linked neutral regions to have reduced diversity, an excess of both high and low frequency derived variants, and elevated LD with the selected locus. These patterns are similar to those produced by selective sweeps, but the effects of recent BLS are weaker. Nonetheless, compared to selective sweeps, nonequilibrium polymorphism and LD patterns persist for a much longer period under recent BLS, which may increase the chance of detecting such selection targets. An R package for analyzing these models, among others (e.g., isolation with migration), is available.
Affiliation(s)
- Kai Zeng
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FL, UK
- Asger Hobolth
  - Department of Mathematics, Aarhus University, Aarhus DK-8000, Denmark

10
Blischak PD, Barker MS, Gutenkunst RN. Inferring the Demographic History of Inbred Species from Genome-Wide SNP Frequency Data. Mol Biol Evol 2020; 37:2124-2136. [PMID: 32068861 DOI: 10.1093/molbev/msaa042] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/20/2019] [Revised: 02/04/2020] [Accepted: 02/13/2020] [Indexed: 01/04/2023]
Abstract
Demographic inference using the site frequency spectrum (SFS) is a common way to understand historical events affecting genetic variation. However, most methods for estimating demography from the SFS assume random mating within populations, precluding these types of analyses in inbred populations. To address this issue, we developed a model for the expected SFS that includes inbreeding by parameterizing individual genotypes using beta-binomial distributions. We then take the convolution of these genotype probabilities to calculate the expected frequency of biallelic variants in the population. Using simulations, we evaluated the model's ability to coestimate demography and inbreeding using one- and two-population models across a range of inbreeding levels. We also applied our method to two empirical examples, American pumas (Puma concolor) and domesticated cabbage (Brassica oleracea var. capitata), inferring models both with and without inbreeding to compare parameter estimates and model fit. Our simulations showed that we are able to accurately coestimate demographic parameters and inbreeding even for highly inbred populations (F = 0.9). In contrast, failing to include inbreeding generally resulted in inaccurate parameter estimates in simulated data and led to poor model fit in our empirical analyses. These results show that inbreeding can have a strong effect on demographic inference, a pattern that was especially noticeable for parameters involving changes in population size. Given the importance of these estimates for informing practices in conservation, agriculture, and elsewhere, our method provides an important advancement for accurately estimating the demographic histories of these species.
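The genotype-then-convolution step described in this abstract can be illustrated with a small sketch. It assumes the common parameterization in which the number of derived copies in one diploid follows a beta-binomial with n = 2, alpha = p(1-F)/F and beta = (1-p)(1-F)/F; for 0 < F < 1 this reduces to the familiar inbreeding genotype frequencies used below. The function names are hypothetical, and the sketch only gives the distribution of the sample allele count conditional on a population frequency p; it is not the authors' implementation and it omits the integration over p under a demographic model that an expected SFS requires.

```python
def genotype_probs(p, F):
    """P(0), P(1), P(2) derived copies in one diploid.

    p is the derived allele frequency and F the inbreeding coefficient.
    These are the closed-form values of the assumed beta-binomial (n = 2).
    """
    q = 1.0 - p
    return [q * q + F * p * q, 2.0 * p * q * (1.0 - F), p * p + F * p * q]


def convolve(a, b):
    """Discrete convolution of two probability vectors."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def allele_count_dist(p, F, n_individuals):
    """Distribution of the derived allele count among n diploid individuals,
    obtained by convolving the per-individual genotype probabilities."""
    dist = [1.0]
    g = genotype_probs(p, F)
    for _ in range(n_individuals):
        dist = convolve(dist, g)
    return dist


outbred = allele_count_dist(0.5, 0.0, 2)  # reduces to Binomial(4, 0.5)
inbred = allele_count_dist(0.5, 0.9, 2)   # heterozygous configurations are rarer
```

With F = 0 the result is the ordinary binomial sampling distribution, while larger F shifts probability toward even allele counts because individuals are more often homozygous.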
Affiliation(s)
- Paul D Blischak
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ
- Michael S Barker
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
- Ryan N Gutenkunst
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ

11
The population genomics of adaptive loss of function. Heredity (Edinb) 2021; 126:383-395. [PMID: 33574599 PMCID: PMC7878030 DOI: 10.1038/s41437-021-00403-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 09/24/2020] [Revised: 12/28/2020] [Accepted: 01/01/2021] [Indexed: 12/23/2022]
Abstract
Discoveries of adaptive gene knockouts and widespread losses of complete genes have in recent years led to a major rethink of the early view that loss-of-function alleles are almost always deleterious. Today, surveys of population genomic diversity are revealing extensive loss-of-function and gene content variation, yet the adaptive significance of much of this variation remains unknown. Here we examine the evolutionary dynamics of adaptive loss of function through the lens of population genomics and consider the challenges and opportunities of studying adaptive loss-of-function alleles using population genetics models. We discuss how the theoretically expected existence of allelic heterogeneity, defined as multiple functionally analogous mutations at the same locus, has proven consistent with empirical evidence and why this impedes both the detection of selection and causal relationships with phenotypes. We then review technical progress towards new functionally explicit population genomic tools and genotype-phenotype methods to overcome these limitations. More broadly, we discuss how the challenges of studying adaptive loss of function highlight the value of classifying genomic variation in a way consistent with the functional concept of an allele from classical population genetics.

12
Charlesworth B. How Good Are Predictions of the Effects of Selective Sweeps on Levels of Neutral Diversity? Genetics 2020; 216:1217-1238. [PMID: 33106248 PMCID: PMC7768247 DOI: 10.1534/genetics.120.303734] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/28/2020] [Accepted: 10/22/2020] [Indexed: 11/18/2022]
Abstract
Selective sweeps are thought to play a significant role in shaping patterns of variability across genomes; accurate predictions of their effects are, therefore, important for understanding these patterns. A commonly used model of selective sweeps assumes that alleles sampled at the end of a sweep, and that fail to recombine with wild-type haplotypes during the sweep, coalesce instantaneously, leading to a simple expression for sweep effects on diversity. It is shown here that there can be a significant probability that a pair of alleles sampled at the end of a sweep coalesce during the sweep before a recombination event can occur, reducing their expected coalescent time below that given by the simple approximation. Expressions are derived for the expected reductions in pairwise neutral diversities caused by both single and recurrent sweeps in the presence of such within-sweep coalescence, although the effects of multiple recombination events during a sweep are only treated heuristically. The accuracies of the resulting expressions were checked against the results of simulations. For even moderate ratios of the recombination rate to the selection coefficient, the simple approximation can be substantially inaccurate. The selection model used here can be applied to favorable mutations with arbitrary dominance coefficients, to sex-linked loci with sex-specific selection coefficients, and to inbreeding populations. Using the results from this model, the expected differences between the levels of variability on X chromosomes and autosomes with selection at linked sites are discussed, and compared with data on a population of Drosophila melanogaster.
Affiliation(s)
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom

13
Boyrie L, Moreau C, Frugier F, Jacquet C, Bonhomme M. A linkage disequilibrium-based statistical test for Genome-Wide Epistatic Selection Scans in structured populations. Heredity (Edinb) 2020; 126:77-91. [PMID: 32728044 DOI: 10.1038/s41437-020-0349-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/20/2020] [Revised: 07/21/2020] [Accepted: 07/21/2020] [Indexed: 01/16/2023]
Abstract
The quest for signatures of selection using single nucleotide polymorphism (SNP) data has proven efficient to uncover genes involved in conserved and/or adaptive molecular functions, but none of the statistical methods were designed to identify interacting alleles as targets of selective processes. Here, we propose a statistical test aimed at detecting epistatic selection, based on a linkage disequilibrium (LD) measure accounting for population structure and heterogeneous relatedness between individuals. SNP-based ([Formula: see text]) and window-based ([Formula: see text]) statistics fit a Student distribution, allowing to test the significance of correlation coefficients. As a proof of concept, we use SNP data from the Medicago truncatula symbiotic legume plant and uncover a previously unknown gene coadaptation between the MtSUNN (Super Numeric Nodule) receptor and the MtCLE02 (CLAVATA3-Like) signaling peptide. We also provide experimental evidence supporting a MtSUNN-dependent negative role of MtCLE02 in symbiotic root nodulation. Using human HGDP-CEPH SNP data, our new statistical test uncovers strong LD between SLC24A5 (skin pigmentation) and EDAR (hairs, teeth, sweat glands development) world-wide, which persists after correction for population structure and relatedness in Central South Asian populations. This result suggests that epistatic selection or coselection could have contributed to the phenotypic make-up in some human populations. Applying this approach to genome-wide SNP data will facilitate the identification of coadapted gene networks in model or non-model organisms.
Affiliation(s)
- Léa Boyrie
  - Laboratoire de Recherche en Sciences Végétales (LRSV), Université de Toulouse, Centre National de la Recherche Scientifique (CNRS), Université Paul Sabatier (UPS), Castanet-Tolosan, France
- Corentin Moreau
  - Institute of Plant Sciences-Paris Saclay (IPS2), Centre National de la Recherche Scientifique, Univ Paris-Sud, Univ Paris-Diderot, Univ d'Evry, Institut National de la Recherche Agronomique, Université Paris-Saclay, 91192, Gif-sur-Yvette, France
- Florian Frugier
  - Institute of Plant Sciences-Paris Saclay (IPS2), Centre National de la Recherche Scientifique, Univ Paris-Sud, Univ Paris-Diderot, Univ d'Evry, Institut National de la Recherche Agronomique, Université Paris-Saclay, 91192, Gif-sur-Yvette, France
- Christophe Jacquet
  - Laboratoire de Recherche en Sciences Végétales (LRSV), Université de Toulouse, Centre National de la Recherche Scientifique (CNRS), Université Paul Sabatier (UPS), Castanet-Tolosan, France
- Maxime Bonhomme
  - Laboratoire de Recherche en Sciences Végétales (LRSV), Université de Toulouse, Centre National de la Recherche Scientifique (CNRS), Université Paul Sabatier (UPS), Castanet-Tolosan, France