1
Sundar Panja A. The systematic codon usage bias has an important effect on genetic adaption in native species. Gene 2024; 926:148627. [PMID: 38823656 DOI: 10.1016/j.gene.2024.148627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 05/06/2024] [Accepted: 05/29/2024] [Indexed: 06/03/2024]
Abstract
Random mutations increase genetic variety and natural selection enhances adaption over generations. Codon usage biases (CUB) provide clues about the genome adaptation mechanisms of native species and extremophile species. Significant numbers of gene (CDS) of nine classes of endangered, native species, including extremophiles and mesophiles were utilised to compute CUB. Codon usage patterns differ among the lineages of endangered and extremophiles with native species. Polymorphic usage of nucleotides with codon burial suggests parallelism of native species within relatively confined taxonomic groups. Utilizing the deviation pattern of CUB of endangered and native species, I present a calculation parameter to estimate the extinction risk of endangered species. Species diversity and extinction risk are both positively associated with the propensity of random mutation in CDS (Coding DNA sequence). Codon bias tenet profoundly selected and it governs to adaptive evolution of native species.
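The abstract above does not say which codon usage bias measure was computed. As a minimal sketch, assuming the widely used relative synonymous codon usage (RSCU) and the standard genetic code, a per-CDS calculation could look like the following; the function name `rscu` is hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Return RSCU per codon for one coding sequence (illustrative helper).

    RSCU = observed codon count / mean count of the codons that are
    synonymous with it. Stop codons and unused amino acids are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        expected = total / len(family)  # count under uniform synonymous usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values
```

Values above 1 mark codons used more often than expected under equal synonymous usage; values below 1 mark under-used codons.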
Affiliation(s)
- Anindya Sundar Panja
- Department of Biotechnology, Molecular Informatics Laboratory, Oriental Institute of Science and Technology, Vidyasagar University, Midnapore, West Bengal 721102, India.
2
Giannì M, Antinucci M, Bertoncini S, Taglioli L, Giuliani C, Luiselli D, Risso D, Marini E, Morini G, Tofanelli S. Association between Variants of the TRPV1 Gene and Body Composition in Sub-Saharan Africans. Genes (Basel) 2024; 15:752. [PMID: 38927688 PMCID: PMC11202968 DOI: 10.3390/genes15060752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 05/28/2024] [Accepted: 06/05/2024] [Indexed: 06/28/2024] Open
Abstract
In humans, the transient receptor potential vanilloid 1 (TRPV1) gene is activated by exogenous (e.g., high temperatures, irritating compounds such as capsaicin) and endogenous (e.g., endocannabinoids, inflammatory factors, fatty acid metabolites, low pH) stimuli. It has been shown to be involved in several processes including nociception, thermosensation, and energy homeostasis. In this study, we investigated the association between TRPV1 gene variants, sensory perception (to capsaicin and PROP), and body composition (BMI and bioimpedance variables) in human populations. By comparing sequences deposited in worldwide databases, we identified two haplotype blocks (herein referred to as H1 and H2) that show strong stabilizing selection signals (MAF approaching 0.50, Tajima's D > +4.5) only in individuals with sub-Saharan African ancestry. We therefore studied the genetic variants of these two regions in 46 volunteers of sub-Saharan descent and 45 Italian volunteers (both sexes). Linear regression analyses showed significant associations between TRPV1 diplotypes and body composition, but not with capsaicin perception. Specifically, in African women carrying the H1-b and H2-b haplotypes, a higher percentage of fat mass and lower extracellular fluid retention was observed, whereas no significant association was found in men. Our results suggest the possible action of sex-driven balancing selection at the non-coding sequences of the TRPV1 gene, with adaptive effects on water balance and lipid deposition.
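The abstract above cites Tajima's D as a selection signal without showing how the statistic is obtained. The sketch below follows Tajima's (1989) standard formula and assumes aligned, equal-length haplotype strings with no missing data; the function name `tajimas_d` is illustrative and this is not the authors' pipeline.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of aligned haplotype strings (illustrative helper)."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")

    # S: number of segregating (polymorphic) sites.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        return math.nan  # no variation, D is undefined

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

An excess of intermediate-frequency variants gives a positive D, while an excess of rare variants gives a negative D.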
Affiliation(s)
- Maddalena Giannì
- Dipartimento di Biologia, Università di Pisa, Via Ghini 13, 56126 Pisa, Italy
- Department of Evolutionary Anthropology, University of Vienna, 1030 Vienna, Austria
- Marco Antinucci
- Dipartimento di Biologia, Università di Pisa, Via Ghini 13, 56126 Pisa, Italy
- Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), 16163 Genova, Italy
- Stefania Bertoncini
- Dipartimento di Biologia, Università di Pisa, Via Ghini 13, 56126 Pisa, Italy
- Luca Taglioli
- Dipartimento di Biologia, Università di Pisa, Via Ghini 13, 56126 Pisa, Italy
- Cristina Giuliani
- Dipartimento di Scienze Biologiche, Geologiche e Ambientali (BiGeA), Università di Bologna, 40126 Bologna, Italy
- Donata Luiselli
- Dipartimento di Beni Culturali (DBC), Università di Bologna, 48121 Ravenna, Italy
- Davide Risso
- Dipartimento di Biologia, Università di Pisa, Via Ghini 13, 56126 Pisa, Italy
- Elisabetta Marini
- Dipartimento di Scienze della Vita e dell’Ambiente, Università di Cagliari, 09042 Cagliari, Italy
- Sergio Tofanelli
- Dipartimento di Biologia, Università di Pisa, Via Ghini 13, 56126 Pisa, Italy
3
Rodrigues MF, Kern AD, Ralph PL. Shared evolutionary processes shape landscapes of genomic variation in the great apes. Genetics 2024; 226:iyae006. [PMID: 38242701 PMCID: PMC10990428 DOI: 10.1093/genetics/iyae006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 10/26/2023] [Accepted: 01/03/2024] [Indexed: 01/21/2024] Open
Abstract
For at least the past 5 decades, population genetics, as a field, has worked to describe the precise balance of forces that shape patterns of variation in genomes. The problem is challenging because modeling the interactions between evolutionary processes is difficult, and different processes can impact genetic variation in similar ways. In this paper, we describe how diversity and divergence between closely related species change with time, using correlations between landscapes of genetic variation as a tool to understand the interplay between evolutionary processes. We find strong correlations between landscapes of diversity and divergence in a well-sampled set of great ape genomes, and explore how various processes such as incomplete lineage sorting, mutation rate variation, GC-biased gene conversion and selection contribute to these correlations. Through highly realistic, chromosome-scale, forward-in-time simulations, we show that the landscapes of diversity and divergence in the great apes are too well correlated to be explained via strictly neutral processes alone. Our best fitting simulation includes both deleterious and beneficial mutations in functional portions of the genome, in which 9% of fixations within those regions is driven by positive selection. This study provides a framework for modeling genetic variation in closely related species, an approach which can shed light on the complex balance of forces that have shaped genetic variation.
Affiliation(s)
- Murillo F Rodrigues
- Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Department of Biology, University of Oregon, Eugene, OR 97403, USA
- Andrew D Kern
- Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Department of Biology, University of Oregon, Eugene, OR 97403, USA
- Peter L Ralph
- Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Department of Biology, University of Oregon, Eugene, OR 97403, USA
- Department of Mathematics, University of Oregon, Eugene, OR 97403, USA
4
Peyrégne S, Slon V, Kelso J. More than a decade of genetic research on the Denisovans. Nat Rev Genet 2024; 25:83-103. [PMID: 37723347 DOI: 10.1038/s41576-023-00643-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/19/2023] [Indexed: 09/20/2023]
Abstract
Denisovans, a group of now extinct humans who lived in Eastern Eurasia in the Middle and Late Pleistocene, were first identified from DNA sequences just over a decade ago. Only ten fragmentary remains from two sites have been attributed to Denisovans based entirely on molecular information. Nevertheless, there has been great interest in using genetic data to understand Denisovans and their place in human history. From the reconstruction of a single high-quality genome, it has been possible to infer their population history, including events of admixture with other human groups. Additionally, the identification of Denisovan DNA in the genomes of present-day individuals has provided insights into the timing and routes of dispersal of ancient modern humans into Asia and Oceania, as well as the contributions of archaic DNA to the physiology of present-day people. In this Review, we synthesize more than a decade of research on Denisovans, reconcile controversies and summarize insights into their population history and phenotype. We also highlight how our growing knowledge about Denisovans has provided insights into our own evolutionary history.
Affiliation(s)
- Stéphane Peyrégne
- Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Anthropology, Leipzig, Germany.
- Viviane Slon
- Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Anthropology, Leipzig, Germany
- Department of Anatomy and Anthropology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- The Dan David Center for Human Evolution and Biohistory Research, Tel Aviv University, Tel Aviv, Israel
- Janet Kelso
- Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Anthropology, Leipzig, Germany.
5
Yu Z, Li Y, Zhao S, Liu F, Zhao H, Chen ZJ. Evidence of positive selection of genetic variants associated with PCOS. Hum Reprod 2023; 38:ii57-ii68. [PMID: 37982420 DOI: 10.1093/humrep/dead106] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/27/2022] [Revised: 03/30/2023] [Indexed: 11/21/2023] Open
Abstract
STUDY QUESTION: Was polycystic ovary syndrome (PCOS), which impairs fertility and adheres to the evolutionary paradox, subject to evolutionary selection during ancestral times and did rapidly diminish in prevalence?
SUMMARY ANSWER: This study strengthened the hypothesis that positive selection of genetic variants occurred and may account for the high prevalence of PCOS observed today.
WHAT IS KNOWN ALREADY: PCOS is a complex endocrine disorder characterized by both reproductive and metabolic disturbances. As a heritable disease that impairs fertility, PCOS should diminish rapidly in prevalence; however, it is the most common cause of female subfertility globally. Few scientific genetic studies have attempted to provide evidence for the positive selection of gene variants underlying PCOS.
STUDY DESIGN, SIZE, DURATION: We performed an evolutionary analysis of 2,504 individuals from 14 populations of the 1000 Genomes Project.
PARTICIPANTS/MATERIALS, SETTING, METHODS: We tested the signature of positive selection for 37 single-nucleotide polymorphisms (SNPs) associated with PCOS in previous genome-wide association studies using six parameters of positive selection.
MAIN RESULTS AND THE ROLE OF CHANCE: Analyzing the evolutionary indices together, there was obvious positive selection at the PCOS-related SNPs loci, especially within the original evolution window of humans, demonstrated by significant Tajima's D values. Compared to the genome background, six of the 37 SNPs in or close to five genes (DENN domain-containing protein 1A: DENND1A, chromosome 9 open reading frame 3: AOPEP, aminopeptidase O: THADA, diacylglycerol kinase iota: DGKI, and netrin receptor UNC5C: UNC5C) showed significant evidence of positive selection, among which DENND1A, AOPEP, and THADA represent the set of most established susceptibility genes for PCOS.
LIMITATIONS, REASONS FOR CAUTION: First, only well-documented SNPs were selected from well-designed experiments. Second, it is difficult to determine which hypothesis of PCOS evolution is at play. After considering the most significant functions of these genes, we found that they had a wide variety of functions with no obvious association between them.
WIDER IMPLICATIONS OF THE FINDINGS: Our findings provide additional evidence for the positive evolution of PCOS. Our analyses require confirmation in a larger study with more evolutionary indicators and larger data range. Further research to identify the roles of the DENND1A, AOPEP, THADA, DGKI, and UNC5C genes is also necessary.
STUDY FUNDING/COMPETING INTEREST(S): This study was supported by the National Key Research and Development Program of China (2021YFC2700400 and 2021YFC2700701), Basic Science Center Program of NSFC (31988101), CAMS Innovation Fund for Medical Sciences (2021-I2M-5-001), National Natural Science Foundation of China (82192874, 31871509, and 82071606), Shandong Provincial Key Research and Development Program (2020ZLYS02), Taishan Scholars Program of Shandong Province (ts20190988), and Fundamental Research Funds of Shandong University. The authors have no conflicts of interest to disclose.
TRIAL REGISTRATION NUMBER: N/A.
Affiliation(s)
- Zhiheng Yu
- Hospital for Reproductive Medicine, Shandong University, Jinan, China
- Key Laboratory of Reproductive Endocrinology of Ministry of Education, Shandong University, Jinan, China
- Shandong Key Laboratory of Reproductive Medicine, Jinan, China
- National Research Center for Assisted Reproductive Technology and Reproductive Genetics, Shandong University, Jinan, China
- State Key Laboratory of Reproductive Medicine and Offspring Health, Jinan, China
- Yi Li
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Shigang Zhao
- Hospital for Reproductive Medicine, Shandong University, Jinan, China
- Key Laboratory of Reproductive Endocrinology of Ministry of Education, Shandong University, Jinan, China
- Shandong Key Laboratory of Reproductive Medicine, Jinan, China
- National Research Center for Assisted Reproductive Technology and Reproductive Genetics, Shandong University, Jinan, China
- State Key Laboratory of Reproductive Medicine and Offspring Health, Jinan, China
- Fan Liu
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Department of Forensic Science, College of Justice, Naif Arab University for Security Sciences, Riyadh, Saudi Arabia
- Han Zhao
- Hospital for Reproductive Medicine, Shandong University, Jinan, China
- Key Laboratory of Reproductive Endocrinology of Ministry of Education, Shandong University, Jinan, China
- Shandong Key Laboratory of Reproductive Medicine, Jinan, China
- National Research Center for Assisted Reproductive Technology and Reproductive Genetics, Shandong University, Jinan, China
- State Key Laboratory of Reproductive Medicine and Offspring Health, Jinan, China
- Zi-Jiang Chen
- Hospital for Reproductive Medicine, Shandong University, Jinan, China
- Key Laboratory of Reproductive Endocrinology of Ministry of Education, Shandong University, Jinan, China
- Shandong Key Laboratory of Reproductive Medicine, Jinan, China
- National Research Center for Assisted Reproductive Technology and Reproductive Genetics, Shandong University, Jinan, China
- State Key Laboratory of Reproductive Medicine and Offspring Health, Jinan, China
- Shanghai Key Laboratory for Assisted Reproduction and Reproductive Genetics, Shanghai, China
- Center for Reproductive Medicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
6
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data. Mol Biol Evol 2023; 40:msad216. [PMID: 37772983 PMCID: PMC10581699 DOI: 10.1093/molbev/msad216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/10/2023] [Accepted: 09/14/2023] [Indexed: 09/30/2023] Open
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data although preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
7
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. Evolution 2023; 77:2113-2127. [PMID: 37395482 PMCID: PMC10547124 DOI: 10.1093/evolut/qpad120] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/20/2023] [Revised: 06/15/2023] [Accepted: 06/30/2023] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modeled by a realistic mutation rate and as part of a realistic distribution of fitness effects, as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modeled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false-positive rates are in excess of true-positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
8
Rimbault M, Legeai F, Peccoud J, Mieuzet L, Call E, Nouhaud P, Defendini H, Mahéo F, Marande W, Théron N, Tagu D, Le Trionnaire G, Simon JC, Jaquiéry J. Contrasting Evolutionary Patterns Between Sexual and Asexual Lineages in a Genomic Region Linked to Reproductive Mode Variation in the pea aphid. Genome Biol Evol 2023; 15:evad168. [PMID: 37717171 PMCID: PMC10538257 DOI: 10.1093/gbe/evad168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Revised: 09/01/2023] [Accepted: 09/12/2023] [Indexed: 09/18/2023] Open
Abstract
Although asexual lineages evolved from sexual lineages in many different taxa, the genetics of sex loss remains poorly understood. We addressed this issue in the pea aphid Acyrthosiphon pisum, whose natural populations encompass lineages performing cyclical parthenogenesis (CP) and producing one sexual generation per year, as well as obligate parthenogenetic (OP) lineages that can no longer produce sexual females but can still produce males. An SNP-based, whole-genome scan of CP and OP populations sequenced in pools (103 individuals from 6 populations) revealed that an X-linked region is associated with the variation in reproductive mode. This 840-kb region is highly divergent between CP and OP populations (FST = 34.9%), with >2,000 SNPs or short Indels showing a high degree of association with the phenotypic trait. In OP populations specifically, this region also shows reduced diversity and Tajima's D, consistent with the OP phenotype being a derived trait in aphids. Interestingly, the low genetic differentiation between CP and OP populations at the rest of the genome (FST = 2.5%) suggests gene flow between them. Males from OP lineages thus likely transmit their op allele to new genomic backgrounds. These genetic exchanges, combined with the selection of the OP and CP reproductive modes under different climates, probably contribute to the long-term persistence of the cp and op alleles.
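The FST values quoted above measure allele-frequency differentiation between the CP and OP populations. As a rough illustration only, the sketch below computes a single-locus, two-population FST from heterozygosities, (H_T - H_S) / H_T, assuming equal population weights and known allele frequencies. The study used pooled sequencing, whose estimators correct for sampling and pooling and are not reproduced here; the function name `fst_biallelic` is hypothetical.

```python
def fst_biallelic(p1, p2):
    """Single-locus FST for two populations from allele frequencies p1 and p2.

    Uses FST = (H_T - H_S) / H_T with equal weights for both populations.
    Illustrative only: no correction for sample size or pooling.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")

    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # locus is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_total - h_within) / h_total
```

Identical frequencies give 0, and alternately fixed alleles give 1; genome scans typically average such values over many sites or windows.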
Affiliation(s)
- Maud Rimbault
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- Fabrice Legeai
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- University of Rennes, Inria, CNRS, IRISA, Rennes, France
- Jean Peccoud
- Laboratoire Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, Unité Mixte de Recherche 7267 Centre National de la Recherche Scientifique, Université de Poitiers, Poitiers CEDEX 9, France
- Lucie Mieuzet
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- Elsa Call
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- Pierre Nouhaud
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- CBGP, INRAE, CIRAD, IRD, Montpellier SupAgro, Univ Montpellier, Montpellier, France
- Hélène Defendini
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- Frédérique Mahéo
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- William Marande
- French Plant Genomic Resource Center, INRAE-CNRGV, Castanet Tolosan, France
- Nicolas Théron
- French Plant Genomic Resource Center, INRAE-CNRGV, Castanet Tolosan, France
- Denis Tagu
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- Gaël Le Trionnaire
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- Jean-Christophe Simon
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
- Julie Jaquiéry
- INRAE, UMR 1349, Institute of Genetics, Environment and Plant Protection, Le Rheu, France
9
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.15.545166. [PMID: 37398347 PMCID: PMC10312679 DOI: 10.1101/2023.06.15.545166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modelled by a realistic mutation rate and as part of a realistic distribution of fitness effects (DFE), as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modelled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false positive rates are in excess of true positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Teaser Text: Outlier-based genomic scans have proven a popular approach for identifying loci that have potentially experienced recent positive selection. However, it has previously been shown that an evolutionarily appropriate baseline model that incorporates non-equilibrium population histories, purifying and background selection, and variation in mutation and recombination rates is necessary to reduce often extreme false positive rates when performing genomic scans. Here we evaluate the power to detect recurrent selective sweeps using common SFS-based and haplotype-based methods under these increasingly realistic models. We find that while these appropriate evolutionary baselines are essential to reduce false positive rates, the power to accurately detect recurrent selective sweeps is generally low across much of the biologically relevant parameter space.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Present address: Department of Biology, Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
10
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor decomposition based feature extraction and classification to detect natural selection from genomic data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.27.527731. [PMID: 37034767 PMCID: PMC10081272 DOI: 10.1101/2023.03.27.527731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under non-convex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx , which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
11
Árnason E, Koskela J, Halldórsdóttir K, Eldon B. Sweepstakes reproductive success via pervasive and recurrent selective sweeps. eLife 2023; 12:80781. [PMID: 36806325 PMCID: PMC9940914 DOI: 10.7554/elife.80781] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/03/2022] [Accepted: 12/28/2022] [Indexed: 02/22/2023] Open
Abstract
Highly fecund natural populations characterized by high early mortality abound, yet our knowledge about their recruitment dynamics is somewhat rudimentary. This knowledge gap has implications for our understanding of genetic variation, population connectivity, local adaptation, and the resilience of highly fecund populations. The concept of sweepstakes reproductive success, which posits a considerable variance and skew in individual reproductive output, is key to understanding the distribution of individual reproductive success. However, it still needs to be determined whether highly fecund organisms reproduce through sweepstakes and, if they do, the relative roles of neutral and selective sweepstakes. Here, we use coalescent-based statistical analysis of population genomic data to show that selective sweepstakes likely explain recruitment dynamics in the highly fecund Atlantic cod. We show that the Kingman coalescent (modelling no sweepstakes) and the Xi-Beta coalescent (modelling random sweepstakes), including complex demography and background selection, do not provide an adequate fit for the data. The Durrett-Schweinsberg coalescent, in which selective sweepstakes result from recurrent and pervasive selective sweeps of new mutations, offers greater explanatory power. Our results show that models of sweepstakes reproduction and multiple-merger coalescents are relevant and necessary for understanding genetic diversity in highly fecund natural populations. These findings have fundamental implications for understanding the recruitment variation of fish stocks and general evolutionary genomics of high-fecundity organisms.
Affiliation(s)
- Einar Árnason
- Institute of Life- and environmental Sciences, University of Iceland, Reykjavik, Iceland; Department of Organismal and Evolutionary Biology, Harvard University, Cambridge, United States
- Jere Koskela
- Department of Statistics, University of Warwick, Coventry, United Kingdom
- Katrín Halldórsdóttir
- Institute of Life- and environmental Sciences, University of Iceland, Reykjavik, Iceland
- Bjarki Eldon
- Leibniz Institute for Evolution and Biodiversity Science, Museum für Naturkunde, Berlin, Germany

12
Kreiner JM, Latorre SM, Burbano HA, Stinchcombe JR, Otto SP, Weigel D, Wright SI. Rapid weed adaptation and range expansion in response to agriculture over the past two centuries. Science 2022; 378:1079-1085. [PMID: 36480621 DOI: 10.1126/science.abo7293] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 12/13/2022]
Abstract
North America has experienced a massive increase in cropland use since 1800, accompanied more recently by the intensification of agricultural practices. Through genome analysis of present-day and historical samples spanning environments over the past two centuries, we studied the effect of these changes in farming on the extent and tempo of evolution across the native range of the common waterhemp (Amaranthus tuberculatus), a now pervasive agricultural weed. Modern agriculture has imposed strengths of selection rarely observed in the wild, with notable shifts in allele frequency trajectories since agricultural intensification in the 1960s. An evolutionary response to this extreme selection was facilitated by a concurrent human-mediated range shift. By reshaping genome-wide diversity across the landscape, agriculture has driven the success of this weed in the 21st century.
Affiliation(s)
- Julia M Kreiner
- Department of Botany, University of British Columbia, Vancouver, BC, Canada; Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Sergio M Latorre
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK; Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Hernán A Burbano
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK; Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- John R Stinchcombe
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Sarah P Otto
- Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada; Department of Zoology, University of British Columbia, Vancouver, BC, Canada
- Detlef Weigel
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Stephen I Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada

13
Nikolakis ZL, Adams RH, Wade KJ, Lund AJ, Carlton EJ, Castoe TA, Pollock DD. Prospects for genomic surveillance for selection in schistosome parasites. Front Epidemiol 2022; 2:932021. [PMID: 38455290 PMCID: PMC10910990 DOI: 10.3389/fepid.2022.932021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 09/12/2022] [Indexed: 03/09/2024]
Abstract
Schistosomiasis is a neglected tropical disease caused by multiple parasitic Schistosoma species, and which impacts over 200 million people globally, mainly in low- and middle-income countries. Genomic surveillance to detect evidence for natural selection in schistosome populations represents an emerging and promising approach to identify and interpret schistosome responses to ongoing control efforts or other environmental factors. Here we review how genomic variation is used to detect selection, how these approaches have been applied to schistosomes, and how future studies to detect selection may be improved. We discuss the theory of genomic analyses to detect selection, identify experimental designs for such analyses, and review studies that have applied these approaches to schistosomes. We then consider the biological characteristics of schistosomes that are expected to respond to selection, particularly those that may be impacted by control programs. Examples include drug resistance, host specificity, and life history traits, and we review our current understanding of specific genes that underlie them in schistosomes. We also discuss how inherent features of schistosome reproduction and demography pose substantial challenges for effective identification of these traits and their genomic bases. We conclude by discussing how genomic surveillance for selection should be designed to improve understanding of schistosome biology, and how the parasite changes in response to selection.
Affiliation(s)
- Zachary L. Nikolakis
- Department of Biology, University of Texas at Arlington, Arlington, TX, United States
- Richard H. Adams
- Department of Biological and Environmental Sciences, Georgia College and State University, Milledgeville, GA, United States
- Kristen J. Wade
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, United States
- Andrea J. Lund
- Department of Environmental and Occupational Health, Colorado School of Public Health, University of Colorado, Anschutz, Aurora, CO, United States
- Elizabeth J. Carlton
- Department of Environmental and Occupational Health, Colorado School of Public Health, University of Colorado, Anschutz, Aurora, CO, United States
- Todd A. Castoe
- Department of Biology, University of Texas at Arlington, Arlington, TX, United States
- David D. Pollock
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, United States

14
Johri P, Aquadro CF, Beaumont M, Charlesworth B, Excoffier L, Eyre-Walker A, Keightley PD, Lynch M, McVean G, Payseur BA, Pfeifer SP, Stephan W, Jensen JD. Recommendations for improving statistical inference in population genomics. PLoS Biol 2022; 20:e3001669. [PMID: 35639797 PMCID: PMC9154105 DOI: 10.1371/journal.pbio.3001669] [Citation(s) in RCA: 43] [Impact Index Per Article: 21.5] [Indexed: 01/29/2023] Open
Abstract
The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous nonadaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our consensus views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Charles F. Aquadro
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Mark Beaumont
- School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Laurent Excoffier
- Institute of Ecology and Evolution, University of Berne, Berne, Switzerland
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Peter D. Keightley
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Michael Lynch
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Gil McVean
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Bret A. Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Susanne P. Pfeifer
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Jeffrey D. Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America

15
Kuijpers Y, Domínguez-Andrés J, Bakker OB, Gupta MK, Grasshoff M, Xu CJ, Joosten LAB, Bertranpetit J, Netea MG, Li Y. Evolutionary Trajectories of Complex Traits in European Populations of Modern Humans. Front Genet 2022; 13:833190. [PMID: 35419030 PMCID: PMC8995853 DOI: 10.3389/fgene.2022.833190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Accepted: 03/11/2022] [Indexed: 11/14/2022] Open
Abstract
Humans have a great diversity in phenotypes, influenced by genetic, environmental, nutritional, cultural, and social factors. Understanding the historical trends of physiological traits can shed light on human physiology, as well as elucidate the factors that influence human diseases. Here we built genome-wide polygenic scores for heritable traits, including height, body mass index, lipoprotein concentrations, cardiovascular disease, and intelligence, using summary statistics of genome-wide association studies in Europeans. Subsequently, we applied these scores to the genomes of ancient European populations. Our results revealed that after the Neolithic, European populations experienced an increase in height and intelligence scores, decreased their skin pigmentation, while the risk for coronary artery disease increased through a genetic trajectory favoring low HDL concentrations. These results are a reflection of the continuous evolutionary processes in humans and highlight the impact that the Neolithic revolution had on our lifestyle and health.
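The polygenic scores mentioned in this abstract are, in their simplest additive form, a weighted sum of allele dosages. The sketch below shows that arithmetic only. The variant identifiers, effect sizes, and dosages are hypothetical, and the authors' actual pipeline (variant selection, handling of missing ancient genotypes, and so on) is not reproduced.

```python
def polygenic_score(effect_sizes, dosages):
    """Additive polygenic score: sum of (GWAS effect size * allele dosage).

    effect_sizes: dict mapping variant ID -> per-allele effect (beta).
    dosages: dict mapping variant ID -> count of the effect allele (0, 1 or 2).
    Variants missing from the genome are skipped, which is one simple
    (assumed) way of handling incomplete ancient genotypes.
    """
    score = 0.0
    for variant, beta in effect_sizes.items():
        if variant in dosages:
            score += beta * dosages[variant]
    return score


# Hypothetical summary statistics and one hypothetical genome.
effect_sizes = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.1}
dosages = {"rs1": 2, "rs2": 1}  # rs3 not genotyped in this individual
score = polygenic_score(effect_sizes, dosages)
```

Comparing such scores across genomes from different periods is what allows a trait's genetic trajectory to be tracked through time.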
Affiliation(s)
- Yunus Kuijpers
- Centre for Individualised Infection Medicine, CiiM, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany; TWINCORE, Centre for Experimental and Clinical Infection Research, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany
- Jorge Domínguez-Andrés
- Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Nijmegen Medical Centre, Nijmegen, Netherlands; Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, Netherlands
- Olivier B Bakker
- Department of Genetics, University Medical Centre Groningen, Nijmegen, Netherlands
- Manoj Kumar Gupta
- Centre for Individualised Infection Medicine, CiiM, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany; TWINCORE, Centre for Experimental and Clinical Infection Research, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany
- Martin Grasshoff
- Centre for Individualised Infection Medicine, CiiM, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany; TWINCORE, Centre for Experimental and Clinical Infection Research, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany
- Cheng-Jian Xu
- Centre for Individualised Infection Medicine, CiiM, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany; TWINCORE, Centre for Experimental and Clinical Infection Research, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany; Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Nijmegen Medical Centre, Nijmegen, Netherlands
- Leo A B Joosten
- Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Nijmegen Medical Centre, Nijmegen, Netherlands; Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, Netherlands
- Jaume Bertranpetit
- Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Barcelona, Spain
- Mihai G Netea
- Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Nijmegen Medical Centre, Nijmegen, Netherlands; Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, Netherlands; Department for Genomics and Immunoregulation, Life and Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- Yang Li
- Centre for Individualised Infection Medicine, CiiM, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany; TWINCORE, Centre for Experimental and Clinical Infection Research, A Joint Venture Between the Hannover Medical School and the Helmholtz Centre for Infection Research, Hannover, Germany; Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Nijmegen Medical Centre, Nijmegen, Netherlands; Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, Netherlands

16
Cuadros-Espinoza S, Laval G, Quintana-Murci L, Patin E. The genomic signatures of natural selection in admixed human populations. Am J Hum Genet 2022; 109:710-726. [PMID: 35259336 DOI: 10.1016/j.ajhg.2022.02.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/15/2021] [Accepted: 02/14/2022] [Indexed: 12/15/2022] Open
Abstract
Admixture has been a pervasive phenomenon in human history, extensively shaping the patterns of population genetic diversity. There is increasing evidence to suggest that admixture can also facilitate genetic adaptation to local environments, i.e., admixed populations acquire beneficial mutations from source populations, a process that we refer to as "adaptive admixture." However, the role of adaptive admixture in human evolution and the power to detect it remain poorly characterized. Here, we use extensive computer simulations to evaluate the power of several neutrality statistics to detect natural selection in the admixed population, assuming multiple admixture scenarios. We show that statistics based on admixture proportions, Fadm and LAD, show high power to detect mutations that are beneficial in the admixed population, whereas other statistics, including iHS and FST, falsely detect neutral mutations that have been selected in the source populations only. By combining Fadm and LAD into a single, powerful statistic, we scanned the genomes of 15 worldwide, admixed populations for signatures of adaptive admixture. We confirm that lactase persistence and resistance to malaria have been under adaptive admixture in West Africans and in Malagasy, North Africans, and South Asians, respectively. Our approach also uncovers other cases of adaptive admixture, including APOL1 in Fulani nomads and PKN2 in East Indonesians, involved in resistance to infection and metabolism, respectively. Collectively, our study provides evidence that adaptive admixture has occurred in human populations whose genetic history is characterized by periods of isolation and spatial expansions resulting in increased gene flow.

17
DeGiorgio M, Szpiech ZA. A spatially aware likelihood test to detect sweeps from haplotype distributions. PLoS Genet 2022; 18:e1010134. [PMID: 35404934 PMCID: PMC9022890 DOI: 10.1371/journal.pgen.1010134] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/18/2021] [Revised: 04/21/2022] [Accepted: 03/04/2022] [Indexed: 01/13/2023] Open
Abstract
The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the “width” of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software.
Identifying regions of the genome that contain adaptive variation is of fundamental interest in evolutionary biology, providing insight into an organism’s history and biology. When positive selection is recent or ongoing, we expect to find genomic patterns such as high frequency haplotypes and low genetic diversity in the vicinity of the adaptive locus. Here we develop a statistic to identify these regions based on distortions of the haplotype frequency spectrum from a background distribution. We evaluate the performance of this statistic under numerous realistic settings of interest to empiricists and demonstrate its superior performance relative to other haplotype-based selection statistics. We also apply this statistic to real population-genetic data. As a positive control, we explore two well-studied loci, LCT and MHC, in a European and an African human population that show strong evidence for selection. We also apply this statistic to the genomes of an urban brown rat population, where we uncover evidence for adaptation in olfactory perception genes. We release user-friendly software implementing this statistic.
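To make the central quantity concrete, the sketch below computes a haplotype frequency spectrum for a single genomic window: the frequencies of the distinct haplotypes, sorted from most to least common. This is only the summary on which the abstract's method is built, not the composite likelihood test itself; the function name and the toy haplotype strings are assumptions for illustration.

```python
from collections import Counter


def haplotype_frequency_spectrum(haplotypes):
    """Return haplotype frequencies in one window, most common first.

    haplotypes: list of strings, one per sampled chromosome, each encoding
    the alleles carried across the SNPs in the window (e.g. "0101").
    """
    counts = Counter(haplotypes)
    total = len(haplotypes)
    return sorted((c / total for c in counts.values()), reverse=True)


# Hypothetical window in which one haplotype has risen to high frequency,
# the kind of distortion expected near a recent or ongoing sweep.
sweep_like = ["0101"] * 6 + ["0011"] * 3 + ["1100"]
# Hypothetical window with many haplotypes at similar frequencies.
neutral_like = ["0101", "0011", "1100", "1010", "0110",
                "0101", "0011", "1100", "1010", "0110"]

sweep_spectrum = haplotype_frequency_spectrum(sweep_like)
neutral_spectrum = haplotype_frequency_spectrum(neutral_like)
```

A spectrum dominated by one or a few haplotypes, compared against the genome-wide background, is the pattern the likelihood method in the abstract searches for along the genome.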
Affiliation(s)
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America
- Zachary A. Szpiech
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America

18
Johri P, Stephan W, Jensen JD. Soft selective sweeps: Addressing new definitions, evaluating competing models, and interpreting empirical outliers. PLoS Genet 2022; 18:e1010022. [PMID: 35202407 PMCID: PMC8870509 DOI: 10.1371/journal.pgen.1010022] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 02/07/2023] Open
Abstract
The ability to accurately identify and quantify genetic signatures associated with soft selective sweeps based on patterns of nucleotide variation has remained controversial. We here provide counter viewpoints to recent publications in PLOS Genetics that have argued not only for the statistical identifiability of soft selective sweeps, but also for their pervasive evolutionary role in both Drosophila and HIV populations. We present evidence that these claims owe to a lack of consideration of competing evolutionary models, unjustified interpretations of empirical outliers, as well as to new definitions of the processes themselves. Our results highlight the dangers of fitting evolutionary models based on hypothesized and episodic processes without properly first considering common processes and, more generally, of the tendency in certain research areas to view pervasive positive selection as a foregone conclusion.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America

19
Novo I, Santiago E, Caballero A. The estimates of effective population size based on linkage disequilibrium are virtually unaffected by natural selection. PLoS Genet 2022; 18:e1009764. [PMID: 35077457 PMCID: PMC8815936 DOI: 10.1371/journal.pgen.1009764] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/05/2021] [Revised: 02/04/2022] [Accepted: 12/21/2021] [Indexed: 11/19/2022] Open
Abstract
The effective population size (Ne) is a key parameter to quantify the magnitude of genetic drift and inbreeding, with important implications in human evolution. The increasing availability of high-density genetic markers allows the estimation of historical changes in Ne across time using measures of genome diversity or linkage disequilibrium between markers. Directional selection is expected to reduce diversity and Ne, and this reduction is modulated by the heterogeneity of the genome in terms of recombination rate. Here we investigate by computer simulations the consequences of selection (both positive and negative) and recombination rate heterogeneity in the estimation of historical Ne. We also investigate the relationship between diversity parameters and Ne across the different regions of the genome using human marker data. We show that the estimates of historical Ne obtained from linkage disequilibrium between markers (NeLD) are virtually unaffected by selection. In contrast, those estimates obtained by coalescence mutation-recombination-based methods can be strongly affected by it, which could have important consequences for the estimation of human demography. The simulation results are supported by the analysis of human data. The estimates of NeLD obtained for particular genomic regions do not correlate, or they do it very weakly, with recombination rate, nucleotide diversity, proportion of polymorphic sites, background selection statistic, minor allele frequency of SNPs, loss of function and missense variants and gene density. This suggests that NeLD measures mainly reflect demographic changes in population size across generations.
The inference of the demographic history of populations is of great relevance in evolutionary biology. This inference can be made from genomic data using coalescence methods or linkage disequilibrium methods. However, the assessment of these methods is usually made assuming neutrality (absence of selection). Here we show by computer simulations and analyses of human data that the estimates of historical effective population size obtained from linkage disequilibrium between markers are virtually unaffected by natural selection, either positive or negative. In contrast, estimates obtained by coalescence mutation-recombination-based methods can be strongly affected by it, which could have important consequences for recent estimations of human demography.
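The link between linkage disequilibrium and Ne can be illustrated with the classic approximation usually attributed to Sved, E[r²] ≈ 1 / (1 + 4·Ne·c), where c is the recombination rate between two markers. The sketch below simply evaluates and inverts that formula. It is a textbook simplification shown under stated assumptions (no sample-size correction, constant population size), not the estimator used in the paper, and the function names and numbers are illustrative.

```python
def expected_r2(ne, c):
    """Expected squared LD correlation between two markers.

    Classic approximation E[r^2] ~ 1 / (1 + 4 * Ne * c), with c the
    recombination rate between the markers (in Morgans).
    """
    return 1.0 / (1.0 + 4.0 * ne * c)


def ne_from_r2(r2, c):
    """Invert the approximation to get a point estimate of Ne.

    Ignores the sampling contribution to r^2, so it is only a sketch.
    """
    return (1.0 / r2 - 1.0) / (4.0 * c)


# Hypothetical values: Ne = 1000 and markers 1 cM apart (c = 0.01).
r2 = expected_r2(1000, 0.01)
ne_estimate = ne_from_r2(r2, 0.01)
```

Because tightly linked marker pairs reflect older population sizes and loosely linked pairs more recent ones, repeating this inversion across classes of c is the basic idea behind LD-based reconstructions of historical Ne.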
Affiliation(s)
- Irene Novo
- Centro de Investigación Mariña, Universidade de Vigo, Facultade de Bioloxía, Vigo, Spain
- Enrique Santiago
- Departamento de Biología Funcional, Facultad de Biología, Universidad de Oviedo, Oviedo, Spain
- Armando Caballero
- Centro de Investigación Mariña, Universidade de Vigo, Facultade de Bioloxía, Vigo, Spain

20
Laval G, Patin E, Boutillier P, Quintana-Murci L. Sporadic occurrence of recent selective sweeps from standing variation in humans as revealed by an approximate Bayesian computation approach. Genetics 2021; 219:6377789. [PMID: 34849862 DOI: 10.1093/genetics/iyab161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/20/2021] [Accepted: 09/01/2021] [Indexed: 12/14/2022] Open
Abstract
During their dispersals over the last 100,000 years, modern humans have been exposed to a large variety of environments, resulting in genetic adaptation. While genome-wide scans for the footprints of positive Darwinian selection have increased knowledge of genes and functions potentially involved in human local adaptation, they have globally produced evidence of a limited contribution of selective sweeps in humans. Conversely, studies based on machine learning algorithms suggest that recent sweeps from standing variation are widespread in humans, an observation that has been recently questioned. Here, we sought to formally quantify the number of recent selective sweeps in humans, by leveraging approximate Bayesian computation and whole-genome sequence data. Our computer simulations revealed suitable ABC estimations, regardless of the frequency of the selected alleles at the onset of selection and the completion of sweeps. Under a model of recent selection from standing variation, we inferred that an average of 68 (from 56 to 79) and 140 (from 94 to 198) sweeps occurred over the last 100,000 years of human history, in African and Eurasian populations, respectively. The former estimation is compatible with human adaptation rates estimated since divergence with chimps, and reveals numbers of sweeps per generation per site in the range of values estimated in Drosophila. Our results confirm the rarity of selective sweeps in humans and show a low contribution of sweeps from standing variation to recent human adaptation.
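Approximate Bayesian computation, as used in this abstract, replaces an intractable likelihood with simulation: draw a parameter from the prior, simulate data, and keep the draw if a summary statistic lands close to the observed one. The sketch below shows the basic rejection variant on a toy problem. The prior, the simulator (a count of outlier windows), the tolerance, and all names are assumptions made for illustration and bear no relation to the authors' actual model or summary statistics.

```python
import random


def abc_rejection(observed, simulate, sample_prior, tolerance, n_draws, rng):
    """Basic ABC rejection sampler.

    Keeps every prior draw whose simulated summary statistic lies within
    `tolerance` of the observed one; the kept draws approximate the posterior.
    """
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        simulated = simulate(theta, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted


def sample_prior(rng):
    """Toy prior: proportion of windows under selection, uniform on [0, 1)."""
    return rng.random()


def simulate_outlier_count(theta, rng, n_windows=100):
    """Toy simulator: number of outlier windows out of n_windows."""
    return sum(1 for _ in range(n_windows) if rng.random() < theta)


# Hypothetical observation: 30 outlier windows out of 100.
rng = random.Random(42)
posterior = abc_rejection(30, simulate_outlier_count, sample_prior, 2, 5000, rng)
posterior_mean = sum(posterior) / len(posterior)
```

Reporting the mean and a central interval of the accepted draws yields estimates of the same form as those in the abstract (a point estimate with a range), although real analyses typically use many summary statistics and regression adjustment rather than a single count.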
Affiliation(s)
- Guillaume Laval
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
- Etienne Patin
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
- Pierre Boutillier
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France; Human Genomics and Evolution, Collège de France, 75005 Paris, France

21
Griffiths JS, Johnson KM, Kelly MW. Evolutionary Change in the Eastern Oyster, Crassostrea virginica, Following Low Salinity Exposure. Integr Comp Biol 2021; 61:1730-1740. [PMID: 34448845 DOI: 10.1093/icb/icab185] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
The presence of standing genetic variation will play a role in determining a population's capacity to adapt to environmentally relevant stressors. In the Gulf of Mexico, extreme climatic events and anthropogenic changes to local hydrology will expose productive oyster breeding grounds to stressful low salinity conditions. We identified genetic variation for performance under low salinity (due to the combined effects of low salinity and genetic load) using a single-generation selection experiment on larvae from two populations of the eastern oyster, Crassostrea virginica. We used pool-sequencing to test for allele frequency differences at 152 salinity-associated genes for larval families pre- and post-low salinity exposure. Our results have implications for how evolutionary change occurs during early life history stages at environmentally relevant salinities. Consistent with observations of high genetic load observed in oysters, we demonstrate evidence for purging of deleterious alleles at the larval stage in C. virginica. In addition, we observe increases in allele frequencies at multiple loci, suggesting that natural selection for low salinity performance at the larval stage can act as a filter for genotypes found in adult populations.
Affiliation(s)
- Joanna S Griffiths
- Department of Environmental Toxicology and Department of Wildlife, Fish, and Conservation Biology, University of California, Davis, CA 95616, USA
- Kevin M Johnson
- Department of Biological Sciences, California Polytechnic State University, San Luis Obispo, CA 93407, USA; California Sea Grant, University of California San Diego, La Jolla, CA 92093, USA
- Morgan W Kelly
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA

22
Charlesworth B, Jensen JD. Effects of Selection at Linked Sites on Patterns of Genetic Variability. Annu Rev Ecol Evol Syst 2021; 52:177-197. [PMID: 37089401 PMCID: PMC10120885 DOI: 10.1146/annurev-ecolsys-010621-044528] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Indexed: 01/25/2023]
Abstract
Patterns of variation and evolution at a given site in a genome can be strongly influenced by the effects of selection at genetically linked sites. In particular, the recombination rates of genomic regions correlate with their amount of within-population genetic variability, the degree to which the frequency distributions of DNA sequence variants differ from their neutral expectations, and the levels of adaptation of their functional components. We review the major population genetic processes that are thought to lead to these patterns, focusing on their effects on patterns of variability: selective sweeps, background selection, associative overdominance, and Hill–Robertson interference among deleterious mutations. We emphasize the difficulties in distinguishing among the footprints of these processes and disentangling them from the effects of purely demographic factors such as population size changes. We also discuss how interactions between selective and demographic processes can significantly affect patterns of variability within genomes.
Affiliation(s)
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Jeffrey D. Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona 85281, USA
23
Luqman H, Widmer A, Fior S, Wegmann D. Identifying loci under selection via explicit demographic models. Mol Ecol Resour 2021; 21:2719-2737. [PMID: 33964107 PMCID: PMC8596768 DOI: 10.1111/1755-0998.13415] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/20/2020] [Revised: 04/03/2021] [Accepted: 04/28/2021] [Indexed: 01/28/2023]
Abstract
Adaptive genetic variation is a function of both selective and neutral forces. To accurately identify adaptive loci, it is thus critical to account for demographic history. Theory suggests that signatures of selection can be inferred using the coalescent, following the premise that genealogies of selected loci deviate from neutral expectations. Here, we build on this theory to develop an analytical framework to identify loci under selection via explicit demographic models (LSD). Under this framework, signatures of selection are inferred through deviations in demographic parameters, rather than through summary statistics directly, and demographic history is accounted for explicitly. Leveraging the property of demographic models to incorporate directionality, we show that LSD can provide information on the environment in which selection acts on a population. This can prove useful in elucidating the selective processes underlying local adaptation, by characterizing genetic trade-offs and extending the concepts of antagonistic pleiotropy and conditional neutrality from ecological theory to practical application in genomic data. We implement LSD via approximate Bayesian computation and demonstrate, via simulations, that LSD (a) has high power to identify selected loci across a large range of demographic-selection regimes, (b) outperforms commonly applied genome-scan methods under complex demographies and (c) accurately infers the directionality of selection for identified candidates. Using the same simulations, we further characterize the behaviour of isolation-with-migration models conducive to the study of local adaptation under regimes of selection. Finally, we demonstrate an application of LSD by detecting loci and characterizing genetic trade-offs underlying flower colour in Antirrhinum majus.
Affiliation(s)
- Hirzi Luqman
- Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Alex Widmer
- Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Simone Fior
- Institute of Integrative Biology, ETH Zurich, Zürich, Switzerland
- Daniel Wegmann
- Department of Biology, University of Fribourg, Fribourg, Switzerland
- Swiss Institute of Bioinformatics, Fribourg, Switzerland
24
Johri P, Charlesworth B, Howell EK, Lynch M, Jensen JD. Revisiting the Notion of Deleterious Sweeps. Genetics 2021; 219:6298596. [PMID: 34125884 PMCID: PMC9101445 DOI: 10.1093/genetics/iyab094] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/29/2021] [Accepted: 06/08/2021] [Indexed: 11/14/2022] Open
Abstract
It has previously been shown that, conditional on its fixation, the time to fixation of a semi-dominant deleterious autosomal mutation in a randomly mating population is the same as that of an advantageous mutation. This result implies that deleterious mutations could generate selective sweep-like effects. Although their fixation probabilities greatly differ, the much larger input of deleterious relative to beneficial mutations suggests that this phenomenon could be important. We here examine how the fixation of mildly deleterious mutations affects levels and patterns of polymorphism at linked sites, both in the presence and absence of interference amongst deleterious mutations, and how this class of sites may contribute to divergence between populations and species. We find that, while deleterious fixations are unlikely to represent a significant proportion of outliers in polymorphism-based genomic scans within populations, minor shifts in the frequencies of deleterious mutations can influence the proportions of private variants and the value of FST after a recent population split. As sites subject to deleterious mutations are necessarily found in functional genomic regions, interpretations in terms of recurrent positive selection may require reconsideration.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
- Emma K Howell
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
- Michael Lynch
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States; Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, United States
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
25
Terasaki Hart DE, Bishop AP, Wang IJ. Geonomics: forward-time, spatially explicit, and arbitrarily complex landscape genomic simulations. Mol Biol Evol 2021; 38:4634-4646. [PMID: 34117771 PMCID: PMC8476160 DOI: 10.1093/molbev/msab175] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/01/2022] Open
Abstract
Understanding the drivers of spatial patterns of genomic diversity has emerged as a major goal of evolutionary genetics. The flexibility of forward-time simulation makes it especially valuable for these efforts, allowing for the simulation of arbitrarily complex scenarios in a way that mimics how real populations evolve. Here, we present Geonomics, a Python package for performing complex, spatially explicit, landscape genomic simulations with full spatial pedigrees that dramatically reduces user workload yet remains customizable and extensible because it is embedded within a popular, general-purpose language. We show that Geonomics results are consistent with expectations for a variety of validation tests based on classic models in population genetics and then demonstrate its utility and flexibility with a trio of more complex simulation scenarios that feature polygenic selection, selection on multiple traits, simulation on complex landscapes, and nonstationary environmental change. We then discuss runtime, which is primarily sensitive to landscape raster size, memory usage, which is primarily sensitive to maximum population size and recombination rate, and other caveats related to the model’s methods for approximating recombination and movement. Taken together, our tests and demonstrations show that Geonomics provides an efficient and robust platform for population genomic simulations that capture complex spatial and evolutionary dynamics.
Affiliation(s)
- Drew E Terasaki Hart
- Department of Environmental Science, Policy, and Management, College of Natural Resources, University of California, Berkeley, CA, 94720, USA
- Anusha P Bishop
- Department of Environmental Science, Policy, and Management, College of Natural Resources, University of California, Berkeley, CA, 94720, USA
- Ian J Wang
- Department of Environmental Science, Policy, and Management, College of Natural Resources, University of California, Berkeley, CA, 94720, USA
26
Ma F, Lau CY, Zheng C. Large genetic diversity and strong positive selection in F-box and GPCR genes among the wild isolates of Caenorhabditis elegans. Genome Biol Evol 2021; 13:6163285. [PMID: 33693740 PMCID: PMC8120010 DOI: 10.1093/gbe/evab048] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/30/2020] [Revised: 02/17/2021] [Accepted: 03/03/2021] [Indexed: 01/05/2023] Open
Abstract
The F-box and chemosensory GPCR (csGPCR) gene families are greatly expanded in nematodes, including the model organism Caenorhabditis elegans, compared with insects and vertebrates. However, the intraspecific evolution of these two gene families in nematodes remains unexamined. In this study, we analyzed the genomic sequences of 330 recently sequenced wild isolates of C. elegans using a range of population genetics approaches. We found that F-box and csGPCR genes, especially the Srw family csGPCRs, showed much more diversity than other gene families. Population structure analysis and phylogenetic analysis divided the wild strains into eight non-Hawaiian and three Hawaiian subpopulations. Some Hawaiian strains appeared to be more ancestral than all other strains. F-box and csGPCR genes maintained a great amount of the ancestral variants in the Hawaiian subpopulation and their divergence among the non-Hawaiian subpopulations contributed significantly to population structure. F-box genes are mostly located at the chromosomal arms and high recombination rate correlates with their large polymorphism. Moreover, using both neutrality tests and extended haplotype homozygosity analysis, we identified signatures of strong positive selection in the F-box and csGPCR genes among the wild isolates, especially in the non-Hawaiian population. Accumulation of high-frequency-derived alleles in these genes was found in the non-Hawaiian population, leading to divergence from the ancestral genotype. In summary, we found that F-box and csGPCR genes harbor a large pool of natural variants, which may be subjected to positive selection. These variants are mostly mapped to the substrate-recognition domains of F-box proteins and the extracellular and intracellular regions of csGPCRs, possibly resulting in advantages during adaptation by affecting protein degradation and the sensing of environmental cues, respectively.
Affiliation(s)
- Fuqiang Ma
- School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China
- Chun Yin Lau
- School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China
- Chaogu Zheng
- School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China
27
Abstract
Genealogical tree modeling is essential for estimating evolutionary parameters in population genetics and phylogenetics. Recent mathematical results concerning ranked genealogies without leaf labels unlock opportunities in the analysis of evolutionary trees. In particular, comparisons between ranked genealogies facilitate the study of evolutionary processes of different organisms sampled at multiple time periods. We propose metrics on ranked tree shapes and ranked genealogies for lineages isochronously and heterochronously sampled. Our proposed tree metrics make it possible to conduct statistical analyses of ranked tree shapes and timed ranked tree shapes or ranked genealogies. Such analyses allow us to assess differences in tree distributions, quantify estimation uncertainty, and summarize tree distributions. We show the utility of our metrics via simulations and an application in infectious diseases.
Affiliation(s)
- Jaehee Kim
- Department of Biology, Stanford University, Stanford, CA 94305
- Julia A Palacios
- Department of Statistics, Stanford University, Stanford, CA 94305
- Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA 94305
28
Genomic islands of differentiation in a rapid avian radiation have been driven by recent selective sweeps. Proc Natl Acad Sci U S A 2020; 117:30554-30565. [PMID: 33199636 DOI: 10.1073/pnas.2015987117] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 12/18/2022] Open
Abstract
Numerous studies of emerging species have identified genomic "islands" of elevated differentiation against a background of relative homogeneity. The causes of these islands remain unclear, however, with some signs pointing toward "speciation genes" that locally restrict gene flow and others suggesting selective sweeps that have occurred within nascent species after speciation. Here, we examine this question through the lens of genome sequence data for five species of southern capuchino seedeaters, finch-like birds from South America that have undergone a species radiation during the last ∼50,000 generations. By applying newly developed statistical methods for ancestral recombination graph inference and machine-learning methods for the prediction of selective sweeps, we show that previously identified islands of differentiation in these birds appear to be generally associated with relatively recent, species-specific selective sweeps, most of which are predicted to be soft sweeps acting on standing genetic variation. Many of these sweeps coincide with genes associated with melanin-based variation in plumage, suggesting a prominent role for sexual selection. At the same time, a few loci also exhibit indications of possible selection against gene flow. These observations shed light on the complex manner in which natural selection shapes genome sequences during speciation.
29
Cooke I, Ying H, Forêt S, Bongaerts P, Strugnell JM, Simakov O, Zhang J, Field MA, Rodriguez-Lanetty M, Bell SC, Bourne DG, van Oppen MJ, Ragan MA, Miller DJ. Genomic signatures in the coral holobiont reveal host adaptations driven by Holocene climate change and reef specific symbionts. Science Advances 2020; 6:6/48/eabc6318. [PMID: 33246955 PMCID: PMC7695477 DOI: 10.1126/sciadv.abc6318] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 05/09/2020] [Accepted: 10/15/2020] [Indexed: 05/24/2023]
Abstract
Genetic signatures caused by demographic and adaptive processes during past climatic shifts can inform predictions of species' responses to anthropogenic climate change. To identify these signatures in Acropora tenuis, a reef-building coral threatened by global warming, we first assembled the genome from long reads and then used shallow whole-genome resequencing of 150 colonies from the central inshore Great Barrier Reef to inform population genomic analyses. We identify population structure in the host that reflects a Pleistocene split, whereas photosymbiont differences between reefs most likely reflect contemporary (Holocene) conditions. Signatures of selection in the host were associated with genes linked to diverse processes including osmotic regulation, skeletal development, and the establishment and maintenance of symbiosis. Our results suggest that adaptation to post-glacial climate change in A. tenuis has involved selection on many genes, while differences in symbiont specificity between reefs appear to be unrelated to host population structure.
Affiliation(s)
- Ira Cooke
- College of Public Health, Medical and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia.
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- Hua Ying
- Research School of Biology, Australian National University, Canberra, ACT, Australia
- Sylvain Forêt
- Research School of Biology, Australian National University, Canberra, ACT, Australia
- ARC Centre of Excellence for Coral Reef Studies, Australian National University, Canberra, ACT, Australia
- Pim Bongaerts
- California Academy of Sciences, Golden Gate Park, San Francisco, CA, USA
- Jan M Strugnell
- Centre for Sustainable Tropical Fisheries and Aquaculture, James Cook University, Townsville, Queensland, Australia
- Department of Ecology, Environment and Evolution, School of Life Sciences, La Trobe University, Melbourne, Australia
- College of Science and Engineering, James Cook University, Townsville, Queensland, Australia
- Oleg Simakov
- Department of Molecular Evolution and Development, University of Vienna, Austria
- Jia Zhang
- College of Public Health, Medical and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Queensland, Australia
- Matt A Field
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, Queensland, Australia
- Mauricio Rodriguez-Lanetty
- Institute of Environment and Department of Biological Sciences, Florida International University, Miami, FL 33199, USA
- Sara C Bell
- Australian Institute of Marine Science, Townsville, Queensland, Australia
- David G Bourne
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- College of Science and Engineering, James Cook University, Townsville, Queensland, Australia
- Australian Institute of Marine Science, Townsville, Queensland, Australia
- Madeleine JH van Oppen
- Australian Institute of Marine Science, Townsville, Queensland, Australia
- School of BioSciences, University of Melbourne, Melbourne, Australia
- Mark A Ragan
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- David J Miller
- College of Public Health, Medical and Veterinary Sciences, James Cook University, Townsville, Queensland, Australia.
- Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Townsville, Queensland, Australia
- ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Queensland, Australia
30
Yamashita H, Uchida T, Tanaka Y, Katai H, Nagano AJ, Morita A, Ikka T. Genomic predictions and genome-wide association studies based on RAD-seq of quality-related metabolites for the genomics-assisted breeding of tea plants. Sci Rep 2020; 10:17480. [PMID: 33060786 PMCID: PMC7562905 DOI: 10.1038/s41598-020-74623-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/17/2020] [Accepted: 09/14/2020] [Indexed: 12/01/2022] Open
Abstract
Effectively using genomic information greatly accelerates conventional breeding and applying it to long-lived crops promotes the conversion to genomic breeding. Because tea plants are bred using conventional methods, we evaluated the potential of genomic predictions (GPs) and genome-wide association studies (GWASs) for the genetic breeding of tea quality-related metabolites using genome-wide single nucleotide polymorphisms (SNPs) detected from restriction site-associated DNA sequencing of 150 tea accessions. The present GP, based on genome-wide SNPs, and six models produced moderate prediction accuracy values (r) for the levels of most catechins, represented by (−)-epigallocatechin gallate (r = 0.32-0.41) and caffeine (r = 0.44-0.51), but low r values for free amino acids and chlorophylls. Integrated analysis of GWAS and GP detected potential candidate genes for each metabolite using 80-160 top-ranked SNPs that resulted in the maximum cumulative prediction value. Applying GPs and GWASs to tea accession traits will contribute to genomics-assisted tea breeding.
Affiliation(s)
- Hiroto Yamashita
- Faculty of Agriculture, Shizuoka University, 836 Ohya, Suruga-ku, Shizuoka, 422-8529, Japan
- United Graduate School of Agricultural Science, Gifu University, 1-1 Yanagito, Gifu, 501-1193, Japan
- Tomoki Uchida
- Faculty of Agriculture, Shizuoka University, 836 Ohya, Suruga-ku, Shizuoka, 422-8529, Japan
- Yasuno Tanaka
- Faculty of Agriculture, Shizuoka University, 836 Ohya, Suruga-ku, Shizuoka, 422-8529, Japan
- United Graduate School of Agricultural Science, Gifu University, 1-1 Yanagito, Gifu, 501-1193, Japan
- Hideyuki Katai
- Shizuoka Prefectural Research Institute of Agriculture and Forestry, Tea Research Center, 1706-11 Kurasawa, Kikugawa, Shizuoka, 439-0002, Japan
- Shizuoka Prefecture Chubu Agriculture and Forestry Office, 2-20 Ariake-cho, Suruga-ku, Shizuoka, 422-8031, Japan
- Atsushi J Nagano
- Faculty of Agriculture, Ryukoku University, 1-5 Yokotani, Seta Oe-cho, Otsu, Shiga, 520-2194, Japan
- Akio Morita
- Faculty of Agriculture, Shizuoka University, 836 Ohya, Suruga-ku, Shizuoka, 422-8529, Japan
- Institute for Tea Science, Shizuoka University, 836 Ohya, Shizuoka, 422-8529, Japan
- Takashi Ikka
- Faculty of Agriculture, Shizuoka University, 836 Ohya, Suruga-ku, Shizuoka, 422-8529, Japan.
- Institute for Tea Science, Shizuoka University, 836 Ohya, Shizuoka, 422-8529, Japan.
31
Marchi N, Excoffier L. Gene flow as a simple cause for an excess of high-frequency-derived alleles. Evol Appl 2020; 13:2254-2263. [PMID: 33005222 PMCID: PMC7513730 DOI: 10.1111/eva.12998] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/24/2020] [Revised: 04/30/2020] [Accepted: 05/04/2020] [Indexed: 01/19/2023] Open
Abstract
Most human populations exhibit an excess of high-frequency variants, leading to a U-shaped site-frequency spectrum (uSFS). This pattern has been generally interpreted as a signature of ongoing episodes of positive selection, or as evidence for a mis-assignment of ancestral/derived allelic states, but uSFS has also been observed in populations receiving gene flow from a ghost population, in structured populations, or after range expansions. In order to better explain the prevalence of high-frequency variants in humans and other populations, we describe here which patterns of gene flow and population demography can lead to uSFS by using extensive coalescent simulations. We find that uSFS can often be observed in a population if gene flow brings a few ancestral alleles from a well-differentiated population. Gene flow can consist of either single pulses of admixture or continuous immigration, but different demographic conditions are necessary to observe uSFS in these two scenarios. Indeed, an extremely low and recent gene flow is required in the case of single admixture events, while with continuous immigration, uSFS occurs only if gene flow started recently at a high rate or if it lasted for a long time at a low rate. Overall, we find that a neutral uSFS occurs under more restrictive conditions in populations having received single pulses of gene flow than in populations exposed to continuous gene flow. We also show that the uSFS observed in human populations from the 1000 Genomes Project can easily be explained by gene flow from surrounding populations without requiring past episodes of positive selection. These results imply that uSFS should be common in non-isolated populations, such as most wild or domesticated plants and animals.
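As a rough illustration of what a U-shaped site-frequency spectrum looks like in practice, the following Python sketch tabulates an unfolded SFS from per-site derived-allele counts. The function names, the toy data, and the U-shape check are assumptions made for illustration; the check is a crude heuristic, not the criterion used in the study.

```python
def unfolded_sfs(derived_counts, n):
    """Tabulate the unfolded SFS for a sample of n chromosomes.

    derived_counts holds the derived-allele count at each site.
    Entry k-1 of the result is the number of sites with count k, k = 1..n-1.
    """
    sfs = [0] * (n - 1)
    for k in derived_counts:
        if 0 < k < n:  # ignore sites that are not segregating in the sample
            sfs[k - 1] += 1
    return sfs


def is_u_shaped(sfs):
    """Hypothetical heuristic: both tails exceed the lowest interior class."""
    if len(sfs) < 3:
        raise ValueError("need at least three frequency classes")
    interior_min = min(sfs[1:-1])
    return sfs[0] > interior_min and sfs[-1] > interior_min


# Toy data (made up): 10 segregating sites in a sample of 6 chromosomes.
toy_counts = [1, 1, 1, 1, 2, 2, 3, 5, 5, 5]
toy_sfs = unfolded_sfs(toy_counts, n=6)
toy_is_u = is_u_shaped(toy_sfs)
```

Under the standard neutral model the expected count in class k falls off as 1/k, so an uptick in the highest-frequency class is what makes the spectrum U-shaped.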
Affiliation(s)
- Nina Marchi
- CMPG, Institute of Ecology and Evolution, University of Berne, Berne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Laurent Excoffier
- CMPG, Institute of Ecology and Evolution, University of Berne, Berne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
32
Werren EA, Garcia O, Bigham AW. Identifying adaptive alleles in the human genome: from selection mapping to functional validation. Hum Genet 2020; 140:241-276. [PMID: 32728809 DOI: 10.1007/s00439-020-02206-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/19/2020] [Accepted: 07/07/2020] [Indexed: 12/19/2022]
Abstract
The suite of phenotypic diversity across geographically distributed human populations is the outcome of genetic drift, gene flow, and natural selection throughout human evolution. Human genetic variation underlying local biological adaptations to selective pressures is incompletely characterized. With the emergence of population genetics modeling of large-scale genomic data derived from diverse populations, scientists are able to map signatures of natural selection in the genome in a process known as selection mapping. Inferred selection signals further can be used to identify candidate functional alleles that underlie putative adaptive phenotypes. Phenotypic association, fine mapping, and functional experiments facilitate the identification of candidate adaptive alleles. Functional investigation of candidate adaptive variation using novel techniques in molecular biology is slowly beginning to unravel how selection signals translate to changes in biology that underlie the phenotypic spectrum of our species. In addition to informing evolutionary hypotheses of adaptation, the discovery and functional annotation of adaptive alleles also may be of clinical significance. While selection mapping efforts in non-European populations are growing, there remains a stark under-representation of diverse human populations in current public genomic databases, of both clinical and non-clinical cohorts. This lack of inclusion limits the study of human biological variation. Identifying and functionally validating candidate adaptive alleles in more global populations is necessary for understanding basic human biology and human disease.
Affiliation(s)
- Elizabeth A Werren
- Department of Human Genetics, The University of Michigan, Ann Arbor, MI, USA
- Department of Anthropology, The University of Michigan, Ann Arbor, MI, USA
- Obed Garcia
- Department of Anthropology, The University of Michigan, Ann Arbor, MI, USA
- Abigail W Bigham
- Department of Anthropology, University of California Los Angeles, 341 Haines Hall, Los Angeles, CA, 90095, USA.
33
Iwasaki RL, Ishiya K, Kanzawa-Kiriyama H, Kawai Y, Gojobori J, Satta Y. Evolutionary History of the Risk of SNPs for Diffuse-Type Gastric Cancer in the Japanese Population. Genes (Basel) 2020; 11:genes11070775. [PMID: 32664326 PMCID: PMC7396988 DOI: 10.3390/genes11070775] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/11/2020] [Revised: 07/02/2020] [Accepted: 07/08/2020] [Indexed: 12/24/2022] Open
Abstract
A genome wide association study reported that the T allele of rs2294008 in a cancer-related gene, PSCA, is a risk allele for diffuse-type gastric cancer. This allele has the highest frequency (0.63) in Japanese in Tokyo (JPT) among 26 populations in the 1000 Genomes Project database. FST ≈ 0.26 at this single nucleotide polymorphism is one of the highest between JPT and the genetically close Han Chinese in Beijing (CHB). To understand the evolutionary history of the alleles in PSCA, we addressed: (i) whether the C non-risk allele at rs2294008 is under positive selection, and (ii) why the mainland Japanese population has a higher T allele frequency than other populations. We found that haplotypes harboring the C allele are composed of two subhaplotypes. We detected that positive selection on both subhaplotypes has occurred in the East Asian lineage. However, the selection on one of the subhaplotypes in JPT seems to have been relaxed or ceased after divergence from the continental population; this may have caused the elevation of T allele frequency. Based on simulations under the dual structure model (a specific demography for the Japanese) and phylogenetic analysis with ancient DNA, the T allele at rs2294008 might have had high frequency in the Jomon people (one of the ancestral populations of the modern Japanese); this may explain the high T allele frequency in the extant Japanese.
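To make the FST comparison concrete, here is a minimal Python sketch of Wright's FST at a single biallelic SNP for two equally weighted populations, computed from heterozygosities. The function name and the second allele frequency are hypothetical, and the estimator behind the reported FST ≈ 0.26 is not stated in the abstract and may differ from this one.

```python
def wright_fst(p1, p2):
    """Wright's FST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    FST = (HT - HS) / HT, where HT is the expected heterozygosity of the
    pooled population and HS is the mean within-population heterozygosity.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:  # allele fixed or absent in both populations
        return 0.0
    return (h_total - h_within) / h_total


# 0.63 is the T allele frequency in JPT given in the abstract;
# 0.25 is a made-up comparison frequency, not a value from the study.
example_fst = wright_fst(0.63, 0.25)
```

The value is 0 when the two populations share the same allele frequency and 1 when they are fixed for different alleles.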
Affiliation(s)
- Risa L. Iwasaki
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Kanagawa 240-0193, Japan
- Koji Ishiya
- Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Sapporo 062-8517, Japan
- Yosuke Kawai
- Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo 162-8655, Japan
- Jun Gojobori
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Kanagawa 240-0193, Japan
- Yoko Satta
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Kanagawa 240-0193, Japan
- Correspondence: Tel.: +81-46-858-1574
34
Osmond MM, Coop G. Genetic Signatures of Evolutionary Rescue by a Selective Sweep. Genetics 2020; 215:813-829. [PMID: 32398227 PMCID: PMC7337082 DOI: 10.1534/genetics.120.303173] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/08/2020] [Accepted: 05/06/2020] [Indexed: 12/31/2022] Open
Abstract
One of the most useful models in population genetics is that of a selective sweep and the consequent hitch-hiking of linked neutral alleles. While variations on this model typically assume constant population size, many instances of strong selection and rapid adaptation in nature may co-occur with complex demography. Here, we extend the hitch-hiking model to evolutionary rescue, where adaptation and demography not only co-occur but are intimately entwined. Our results show how this feedback between demography and evolution determines, and restricts, the genetic signatures of evolutionary rescue, and how these differ from the signatures of sweeps in populations of constant size. In particular, we find rescue to harden sweeps from standing variance or new mutation (but not from migration), reduce genetic diversity both at the selected site and genome-wide, and increase the range of observed Tajima's D values. For a given initial rate of population decline, the feedback between demography and evolution makes all of these differences more dramatic under weaker selection, where bottlenecks are prolonged. Nevertheless, it is likely difficult to infer the co-incident timing of the sweep and bottleneck from these simple signatures, never mind a feedback between them. Temporal samples spanning contemporary rescue events may offer one way forward.
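For readers unfamiliar with the statistic, Tajima's D contrasts two estimators of diversity: nucleotide diversity (π) and Watterson's estimator based on the number of segregating sites (S). The Python sketch below computes it from an unfolded site-frequency spectrum using Tajima's (1989) standard constants. The function name and the toy spectra are illustrative assumptions and do not come from the study.

```python
import math


def tajimas_d(sfs):
    """Tajima's D from an unfolded SFS.

    sfs[k-1] is the number of sites whose derived allele appears k times
    in a sample of n = len(sfs) + 1 chromosomes.
    """
    n = len(sfs) + 1
    s = sum(sfs)  # number of segregating sites
    if n < 4 or s == 0:
        raise ValueError("need n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    # Mean pairwise differences: a site with count k contributes 2k(n-k)/(n(n-1)).
    pi = sum(c * 2.0 * k * (n - k) / (n * (n - 1))
             for k, c in enumerate(sfs, start=1))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy spectra for n = 6 (made up for illustration).
d_neutral = tajimas_d([60, 30, 20, 15, 12])  # counts proportional to 1/k
d_rare = tajimas_d([20, 0, 0, 0, 0])         # only singletons
d_mid = tajimas_d([0, 0, 20, 0, 0])          # only intermediate frequencies
```

A spectrum proportional to 1/k gives D near zero, an excess of rare variants pushes D negative, and an excess of intermediate-frequency variants pushes it positive.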
Affiliation(s)
- Matthew M Osmond
- Center for Population Biology and Department of Evolution and Ecology, University of California, Davis, California 95616
- Graham Coop
- Center for Population Biology and Department of Evolution and Ecology, University of California, Davis, California 95616
35
Harris AM, DeGiorgio M. Identifying and Classifying Shared Selective Sweeps from Multilocus Data. Genetics 2020; 215:143-171. [PMID: 32152048 PMCID: PMC7198270 DOI: 10.1534/genetics.120.303137] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/22/2019] [Accepted: 02/29/2020] [Indexed: 11/18/2022] Open
Abstract
Positive selection causes beneficial alleles to rise to high frequency, resulting in a selective sweep of the diversity surrounding the selected sites. Accordingly, the signature of a selective sweep in an ancestral population may still remain in its descendants. Identifying signatures of selection in the ancestor that are shared among its descendants is important to contextualize the timing of a sweep, but few methods exist for this purpose. We introduce the statistic SS-H12, which can identify genomic regions under shared positive selection across populations and is based on the theory of the expected haplotype homozygosity statistic H12, which detects recent hard and soft sweeps from the presence of high-frequency haplotypes. SS-H12 is distinct from comparable statistics because it requires a minimum of only two populations, and properly identifies and differentiates between independent convergent sweeps and true ancestral sweeps, with high power and robustness to a variety of demographic models. Furthermore, we can apply SS-H12 in conjunction with the ratio of statistics we term [Formula: see text] and [Formula: see text] to further classify identified shared sweeps as hard or soft. Finally, we identified both previously reported and novel shared sweep candidates from human whole-genome sequences. Previously reported candidates include the well-characterized ancestral sweeps at LCT and SLC24A5 in Indo-Europeans, as well as GPHN worldwide. Novel candidates include an ancestral sweep at RGS18 in sub-Saharan Africans involved in regulating the platelet response and implicated in sudden cardiac death, and a convergent sweep at C2CD5 between European and East Asian populations that may explain their different insulin responses.
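As a rough illustration of the haplotype homozygosity statistics that SS-H12 builds on, the sketch below computes H1, H12, and H2/H1 for one window of phased haplotypes, using the commonly cited definitions (H1 is the sum of squared haplotype frequencies, H12 pools the two most frequent haplotypes, and H2 drops the most frequent one). The function name and the string encoding of haplotypes are assumptions made for this example. It does not reproduce SS-H12 itself, which compares populations.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H1, H12, H2/H1) for one window of phased haplotypes.

    `haplotypes` is a list of hashable haplotype labels (for example
    strings of alleles); this encoding is an assumption of the sketch.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    # Haplotype frequencies, most frequent first.
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)   # expected haplotype homozygosity
    h12 = h1 + 2.0 * p1 * p2         # equals (p1 + p2)^2 + sum of the rest
    h2 = h1 - p1 * p1                # homozygosity without the top haplotype
    return h1, h12, h2 / h1


# Hypothetical window: one dominant haplotype plus two rarer ones.
window = ["AAAA"] * 6 + ["AATA"] * 3 + ["TTAA"] * 1
h1, h12, ratio = haplotype_homozygosity(window)
print(round(h1, 3), round(h12, 3), round(ratio, 3))
```

A high H12 with a low H2/H1 is usually read as more consistent with a hard sweep, and a high H12 with a higher H2/H1 as more consistent with a soft sweep. Any thresholds would have to come from simulations matched to the data.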
Affiliation(s)
- Alexandre M Harris
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802
- Molecular, Cellular, and Integrative Biosciences at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida 33431
|
36
|
Abstract
Cognitive abilities can vary dramatically among species. The relative importance of social and ecological challenges in shaping cognitive evolution has been the subject of a long-running and recently renewed debate, but little work has sought to understand the selective dynamics underlying the evolution of cognitive abilities. Here, we investigate recent selection related to cognition in the paper wasp Polistes fuscatus, a wasp that has uniquely evolved visual individual recognition abilities. We generate high-quality de novo genome assemblies and population genomic resources for multiple species of paper wasps and use a population genomic framework to interrogate the probable mode and tempo of cognitive evolution. Recent, strong, hard selective sweeps in P. fuscatus contain loci annotated with functions in long-term memory formation, mushroom body development, and visual processing, traits which have recently evolved in association with individual recognition. The homologous pathways are not under selection in closely related wasps that lack individual recognition. Indeed, the prevalence of candidate cognition loci within the strongest selective sweeps suggests that the evolution of cognitive abilities has been among the strongest selection pressures in P. fuscatus' recent evolutionary history. Detailed analyses of selective sweeps containing candidate cognition loci reveal multiple cases of hard selective sweeps within the last few thousand years on de novo mutations, mainly in noncoding regions. These data provide unprecedented insight into some of the processes by which cognition evolves.
|
37
|
Thornton KR. Polygenic Adaptation to an Environmental Shift: Temporal Dynamics of Variation Under Gaussian Stabilizing Selection and Additive Effects on a Single Trait. Genetics 2019; 213:1513-1530. [PMID: 31653678 PMCID: PMC6893385 DOI: 10.1534/genetics.119.302662] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 08/26/2019] [Accepted: 10/21/2019] [Indexed: 11/26/2022] Open
Abstract
Predictions about the effect of natural selection on patterns of linked neutral variation are largely based on models involving the rapid fixation of unconditionally beneficial mutations. However, when phenotypes adapt to a new optimum trait value, the strength of selection on individual mutations decreases as the population adapts. Here, I use explicit forward simulations of a single trait with additive-effect mutations adapting to an "optimum shift." Detectable "hitchhiking" patterns are only apparent if (i) the optimum shifts are large with respect to equilibrium variation for the trait, (ii) mutation rates to large-effect mutations are low, and (iii) large-effect mutations rapidly increase in frequency and eventually reach fixation, which typically occurs after the population reaches the new optimum. For the parameters simulated here, partial sweeps do not appreciably affect patterns of linked variation, even when the mutations are strongly selected. The contribution of new mutations vs. standing variation to fixation depends on the mutation rate affecting trait values. Given the fixation of a strongly selected variant, patterns of hitchhiking are similar on average for the two classes of sweeps because sweeps from standing variation involving large-effect mutations are rare when the optimum shifts. The distribution of effect sizes of new mutations has little effect on the time to reach the new optimum, but reducing the mutational variance increases the magnitude of hitchhiking patterns. In general, populations reach the new optimum prior to the completion of any sweeps, and the times to fixation are longer for this model than for standard models of directional selection. The long fixation times are due to a combination of declining selection pressures during adaptation and the possibility of interference among weakly selected sites for traits with high mutation rates.
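The claim that selection on an individual mutation weakens as the population approaches the new optimum can be made concrete with a small numerical sketch. It assumes the standard Gaussian stabilizing selection fitness function, w(z) = exp(-(z - optimum)^2 / (2 VS)), and treats a mutation as simply shifting the mean phenotype by its additive effect. The names `gaussian_fitness`, `relative_advantage`, and `vs` are illustrative choices, and this is not the forward simulation used in the paper.

```python
import math


def gaussian_fitness(z, optimum, vs=1.0):
    """Gaussian stabilizing selection: fitness falls off with distance from the optimum."""
    return math.exp(-((z - optimum) ** 2) / (2.0 * vs))


def relative_advantage(mean_z, effect, optimum, vs=1.0):
    """Approximate selection coefficient of a mutation with additive `effect`
    on a background whose phenotype is `mean_z` (a simplifying assumption)."""
    carrier = gaussian_fitness(mean_z + effect, optimum, vs)
    background = gaussian_fitness(mean_z, optimum, vs)
    return carrier / background - 1.0


# The same mutation (effect 0.5) is favored far from the optimum,
# and becomes deleterious once the population sits on the optimum.
far_from_optimum = relative_advantage(mean_z=0.0, effect=0.5, optimum=1.0)
at_optimum = relative_advantage(mean_z=1.0, effect=0.5, optimum=1.0)
print(far_from_optimum > 0, at_optimum < 0)
```

Under these assumptions the advantage of a fixed-effect mutation shrinks, and eventually changes sign, as the mean phenotype moves toward the optimum. This is one way to see why fixation times can exceed those expected under constant directional selection.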
Affiliation(s)
- Kevin R Thornton
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
|
38
|
Llanos‐Garrido A, Pérez‐Tris J, Díaz JA. The combined use of raw and phylogenetically independent methods of outlier detection uncovers genome-wide dynamics of local adaptation in a lizard. Ecol Evol 2019; 9:14356-14367. [PMID: 31938524 PMCID: PMC6953648 DOI: 10.1002/ece3.5872] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/22/2019] [Revised: 10/04/2019] [Accepted: 10/10/2019] [Indexed: 02/06/2023] Open
Abstract
Local adaptation is a dynamic process by which different allele combinations are selected in different populations at different times, and whose genetic signature can be inferred by genome-wide outlier analyses. We combined gene flow estimates with two methods of outlier detection, one of them independent of population coancestry (CIOA) and the other one not (ROA), to identify genetic variants favored when ecology promotes phenotypic convergence. We analyzed genotyping-by-sequencing data from five populations of a lizard distributed over an environmentally heterogeneous range that has been changing since the split of eastern and western lineages ca. 3 mya. Overall, western lizards inhabit forest habitat and are unstriped, whereas eastern ones inhabit shrublands and are striped. However, one population (Lerma) has unstriped phenotype despite its eastern ancestry. The analysis of 73,291 SNPs confirmed the east-west division and identified nonoverlapping sets of outliers (12 identified by ROA and 9 by CIOA). ROA revealed ancestral adaptive variation in the uncovered outliers that were subject to divergent selection and differently fixed for eastern and western populations at the extremes of the environmental gradient. Interestingly, such variation was maintained in Lerma, where we found high levels of heterozygosity for ROA outliers, whereas CIOA uncovered innovative variants that were selected only there. Overall, it seems that both the maintenance of ancestral variation and asymmetric migration have counterbalanced adaptive lineage splitting in our model species. This scenario, which is likely promoted by a changing and heterogeneous environment, could hamper ecological speciation of locally adapted populations despite strong genetic structure between lineages.
Affiliation(s)
- Alejandro Llanos‐Garrido
- Informatics Group, Faculty of Arts and Sciences, Harvard University, Cambridge, MA, USA
- Departamento de Biodiversidad, Universidad Complutense de Madrid, Madrid, Spain
| | - Javier Pérez‐Tris
- Departamento de Biodiversidad, Universidad Complutense de Madrid, Madrid, Spain
| | - José A. Díaz
- Departamento de Biodiversidad, Universidad Complutense de Madrid, Madrid, Spain
|
39
|
Martin SL, Parent JS, Laforest M, Page E, Kreiner JM, James T. Population Genomic Approaches for Weed Science. Plants (Basel) 2019; 8:E354. [PMID: 31546893 PMCID: PMC6783936 DOI: 10.3390/plants8090354] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/16/2019] [Revised: 09/12/2019] [Accepted: 09/14/2019] [Indexed: 12/16/2022]
Abstract
Genomic approaches are opening avenues for understanding all aspects of biological life, especially as they begin to be applied to multiple individuals and populations. However, these approaches typically depend on the availability of a sequenced genome for the species of interest. While the number of genomes being sequenced is exploding, one group that has lagged behind are weeds. Although the power of genomic approaches for weed science has been recognized, what is needed to implement these approaches is unfamiliar to many weed scientists. In this review we attempt to address this problem by providing a primer on genome sequencing and provide examples of how genomics can help answer key questions in weed science such as: (1) Where do agricultural weeds come from; (2) what genes underlie herbicide resistance; and, more speculatively, (3) can we alter weed populations to make them easier to control? This review is intended as an introduction to orient weed scientists who are thinking about initiating genome sequencing projects to better understand weed populations, to highlight recent publications that illustrate the potential for these methods, and to provide direction to key tools and literature that will facilitate the development and execution of weed genomic projects.
Affiliation(s)
- Sara L Martin
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada.
| | - Jean-Sebastien Parent
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada.
| | - Martin Laforest
- Saint-Jean-sur-Richelieu Research and Development Centre, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, QC J3B 3E6, Canada.
| | - Eric Page
- Harrow Research and Development Centre, Agriculture and Agri-Food Canada, Harrow, ON N0R 1G0, Canada.
| | - Julia M Kreiner
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON M5S 3B2, Canada.
| | - Tracey James
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada.
|
40
|
Stern AJ, Wilton PR, Nielsen R. An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data. PLoS Genet 2019; 15:e1008384. [PMID: 31518343 PMCID: PMC6760815 DOI: 10.1371/journal.pgen.1008384] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 03/29/2019] [Revised: 09/25/2019] [Accepted: 08/26/2019] [Indexed: 12/24/2022] Open
Abstract
Most current methods for detecting natural selection from DNA sequence data are limited in that they are either based on summary statistics or a composite likelihood, and as a consequence, do not make full use of the information available in DNA sequence data. We here present a new importance sampling approach for approximating the full likelihood function for the selection coefficient. Our method CLUES treats the ancestral recombination graph (ARG) as a latent variable that is integrated out using previously published Markov Chain Monte Carlo (MCMC) methods. The method can be used for detecting selection, estimating selection coefficients, testing models of changes in the strength of selection, estimating the time of the start of a selective sweep, and for inferring the allele frequency trajectory of a selected or neutral allele. We perform extensive simulations to evaluate the method and show that it uniformly improves power to detect selection compared to current popular methods such as nSL and SDS, and can provide reliable inferences of allele frequency trajectories under many conditions. We also explore the potential of our method to detect extremely recent changes in the strength of selection. We use the method to infer the past allele frequency trajectory for a lactase persistence SNP (MCM6) in Europeans. We also infer the trajectory of a SNP (EDAR) in Han Chinese, finding evidence that this allele's age is much older than previously claimed. We also study a set of 11 pigmentation-associated variants. Several genes show evidence of strong selection particularly within the last 5,000 years, including ASIP, KITLG, and TYR. However, selection on OCA2/HERC2 seems to be much older and, in contrast to previous claims, we find no evidence of selection on TYRP1.
Affiliation(s)
- Aaron J. Stern
- Graduate Group in Computation Biology, University of California, Berkeley, Berkeley, California, United States of America
| | - Peter R. Wilton
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
|
41
|
Brennan RS, Garrett AD, Huber KE, Hargarten H, Pespeni MH. Rare genetic variation and balanced polymorphisms are important for survival in global change conditions. Proc Biol Sci 2019; 286:20190943. [PMID: 31185858 DOI: 10.1098/rspb.2019.0943] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/14/2022] Open
Abstract
Standing genetic variation is important for population persistence in extreme environmental conditions. While some species may have the capacity to adapt to predicted average future global change conditions, the ability to survive extreme events is largely unknown. We used single-generation selection experiments on hundreds of thousands of Strongylocentrotus purpuratus sea urchin larvae generated from wild-caught adults to identify adaptive genetic variation responsive to moderate (pH 8.0) and extreme (pH 7.5) low-pH conditions. Sequencing genomic DNA from pools of larvae, we identified consistent changes in allele frequencies across replicate cultures for each pH condition and observed increased linkage disequilibrium around selected loci, revealing selection on recombined standing genetic variation. We found that loci responding uniquely to either selection regime were at low starting allele frequencies while variants that responded to both pH conditions (11.6% of selected variants) started at high frequencies. Loci under selection performed functions related to energetics, pH tolerance, cell growth and actin/cytoskeleton dynamics. These results highlight that persistence in future conditions will require two classes of genetic variation: common, pH-responsive variants maintained by balancing selection in a heterogeneous environment, and rare variants, particularly for extreme conditions, that must be maintained by large population sizes.
Affiliation(s)
- Reid S Brennan
- Department of Biology, University of Vermont, Burlington, VT, USA
| | - April D Garrett
- Department of Biology, University of Vermont, Burlington, VT, USA
| | - Kaitlin E Huber
- Department of Biology, University of Vermont, Burlington, VT, USA
| | - Heidi Hargarten
- Department of Biology, University of Vermont, Burlington, VT, USA
|
42
|
Mokhber M, Shahrbabak MM, Sadeghi M, Shahrbabak HM, Stella A, Nicolzzi E, Williams JL. Study of whole genome linkage disequilibrium patterns of Iranian water buffalo breeds using the Axiom Buffalo Genotyping 90K Array. PLoS One 2019; 14:e0217687. [PMID: 31150486 PMCID: PMC6544294 DOI: 10.1371/journal.pone.0217687] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/07/2018] [Accepted: 05/16/2019] [Indexed: 01/21/2023] Open
Abstract
Accuracy of genome-wide association studies and the successful implementation of genomic selection depend on the level of linkage disequilibrium (LD) across the genome and also on the persistence of LD phase between populations. In the present study, LD between adjacent SNPs and LD decay between SNPs were calculated in three Iranian water buffalo populations. Persistence of LD phase was evaluated across these populations and effective population size (Ne) was estimated from corrected r² information. A set of 404 individuals from three Iranian buffalo populations was genotyped with the Axiom Buffalo Genotyping 90K Array. Average r² and |D'| between adjacent SNP pairs across all chromosomes were 0.27 and 0.66 for AZI, 0.29 and 0.68 for KHU, and 0.32 and 0.72 for MAZ. The LD between the SNPs decreased with increasing physical distance from 100 kb to 1 Mb between markers, from 0.234 to 0.018 for AZI, 0.254 to 0.034 for KHU, and 0.297 to 0.119 for MAZ, respectively. These results indicate that a density of 90K SNPs is sufficient for genomic analyses relying on long-range LD (e.g. GWAS and genomic selection). The persistence of LD phase decreased with increasing marker distances across all the populations, but remained above 0.8 for AZI and KHU for marker distances up to 100 kb. For multi-breed genomic evaluation, the 90K SNP panel is suitable for the AZI and KHU buffalo breeds. Estimated effective population sizes for AZI, KHU and MAZ were 477, 212 and 32, respectively, for recent generations. The estimated effective population sizes indicate that the MAZ is at risk and requires careful management.
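For readers unfamiliar with the LD measures reported here, the sketch below computes r² and |D'| for a single pair of biallelic SNPs from phased haplotypes, and converts an r² value into an effective population size using Sved's approximation, E[r²] ≈ 1 / (1 + 4·Ne·c). Several things are assumptions of this example and not details taken from the study: the input is phased (array genotypes are normally unphased), the helper names are hypothetical, and the abstract does not specify which correction was applied to r² before estimating Ne.

```python
def pairwise_ld(haplotypes):
    """Return (r2, |D'|) for two biallelic SNPs.

    `haplotypes` is a list of (a, b) tuples, one per chromosome, where
    1 marks the reference allele at SNP A or SNP B and 0 the alternative.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both SNPs must be polymorphic")
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r2, abs(d) / d_max


def ne_from_r2(r2, c):
    """Effective size from Sved's approximation; `c` is the recombination
    distance in Morgans between the markers (an assumed input)."""
    if not 0.0 < r2 <= 1.0 or c <= 0.0:
        raise ValueError("require 0 < r2 <= 1 and c > 0")
    return (1.0 / r2 - 1.0) / (4.0 * c)


# Hypothetical sample of 10 chromosomes.
sample = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
r2, d_prime = pairwise_ld(sample)
print(round(r2, 3), round(d_prime, 3), round(ne_from_r2(0.2, 0.01)))
```

Because r² depends on allele frequencies while |D'| is scaled by its maximum possible value, the two measures can rank marker pairs differently, which is one reason studies of this kind tend to report both.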
Affiliation(s)
- Mahdi Mokhber
- Department of Animal Science, Faculty of Agriculture, Urmia University, Urmia, Iran
| | - Mohammad Moradi Shahrbabak
- Department of Animal Science, Faculty of Agricultural Science and Engineering, University College of Agriculture and Natural Resources (UTCAN), University of Tehran, Karaj, Iran
| | - Mostafa Sadeghi
- Department of Animal Science, Faculty of Agricultural Science and Engineering, University College of Agriculture and Natural Resources (UTCAN), University of Tehran, Karaj, Iran
| | - Hossein Moradi Shahrbabak
- Department of Animal Science, Faculty of Agricultural Science and Engineering, University College of Agriculture and Natural Resources (UTCAN), University of Tehran, Karaj, Iran
| | | | | | - John L. Williams
- Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, South Australia, Australia
|
43
|
Nadachowska-Brzyska K, Burri R, Ellegren H. Footprints of adaptive evolution revealed by whole Z chromosomes haplotypes in flycatchers. Mol Ecol 2019; 28:2290-2304. [PMID: 30653779 PMCID: PMC6852393 DOI: 10.1111/mec.15021] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/04/2018] [Revised: 09/13/2018] [Accepted: 09/14/2018] [Indexed: 01/19/2023]
Abstract
Detecting positive selection using genomic data is critical to understanding the role of adaptive evolution. Of particular interest in this context are sex chromosomes, since they are thought to play a special role in local adaptation and speciation. We sought to circumvent the challenges associated with statistical phasing when using haplotype-based statistics in sweep scans by exploiting the fact that whole-chromosome haplotypes of the sex chromosomes can be obtained by resequencing individuals of the hemizygous sex. We analyzed whole Z chromosome haplotypes from 100 females from several populations of four black and white flycatcher species (in birds, females are ZW and males ZZ). Based on integrated haplotype score (iHS) and number of segregating sites by length (nSL) statistics, we found strong and frequent haplotype structure in several regions of the Z chromosome in each species. Most of these sweep signals were population-specific, with essentially no evidence for regions under selection shared among species. Some completed sweeps were revealed by the cross-population extended haplotype homozygosity (XP-EHH) statistic. Importantly, by using statistically phased Z chromosome data from resequencing of males, we failed to recover the signals of selection detected in analyses based on whole chromosome haplotypes from females; instead, what likely represent false signals of selection were frequently seen. This highlights the power issues in statistical phasing and cautions against conclusions from selection scans using such data. The detection of frequent selective sweeps on the avian Z chromosome supports a large role of sex chromosomes in adaptive evolution.
Affiliation(s)
| | - Reto Burri
- Department of Evolutionary Biology, University of Uppsala, Uppsala, Sweden; Department of Population Ecology, Friedrich Schiller University Jena, Jena, Germany
| | - Hans Ellegren
- Department of Evolutionary Biology, University of Uppsala, Uppsala, Sweden
|
44
|
Abstract
For almost 20 years, many inference methods have been developed to detect selective sweeps and localize the targets of directional selection in the genome. These methods are based on population genetic models that describe the effect of a beneficial allele (e.g., a new mutation) on linked neutral variation (driven by directional selection from a single copy to fixation). Here, I discuss these models, ranging from selective sweeps in a panmictic population of constant size to evolutionary traffic when simultaneous sweeps at multiple loci interfere, and emphasize the important role of demography and population structure in data analysis. In the past 10 years, soft sweeps that may arise after an environmental change from directional selection on standing variation have become a focus of population genetic research. In contrast to selective sweeps, they are caused by beneficial alleles that were neutrally segregating in a population before the environmental change or were present at a mutation-selection balance in appreciable frequency.
|
45
|
Abstract
It is a tenet of modern biology that species adapt through natural selection to cope with the ever-changing environment. By comparing genetic variants between the island and mainland populations of a passerine, we inferred the related age of genetic variants across its entire genome and suggest that preexisting standing variants played the predominant role in local adaptation. Our findings not only resolve a long-standing fundamental problem in biology regarding the genetic sources of adaptation, but imply that the evolutionary potential of a population is highly associated with its preexisting genetic variation. What kind of genetic variation contributes the most to adaptation is a fundamental question in evolutionary biology. By resequencing genomes of 80 individuals, we inferred the origin of genomic variants associated with a complex adaptive syndrome involving multiple quantitative traits, namely, adaptation between high and low altitudes, in the vinous-throated parrotbill (Sinosuthora webbiana) in Taiwan. By comparing these variants with those in the Asian mainland population, we revealed standing variation in 24 noncoding genomic regions to be the predominant genetic source of adaptation. Parrotbills at both high and low altitudes exhibited signatures of recent selection, suggesting that not only the front but also the trailing edges of postglacial expanding populations could be subjected to environmental stresses. This study verifies and quantifies the importance of standing variation in adaptation in a cohort of genes, illustrating that the evolutionary potential of a population depends significantly on its preexisting genetic diversity. These findings provide important context for understanding adaptation and conservation of species in the Anthropocene.
|
46
|
Mateo L, Rech GE, González J. Genome-wide patterns of local adaptation in Western European Drosophila melanogaster natural populations. Sci Rep 2018; 8:16143. [PMID: 30385770 PMCID: PMC6212444 DOI: 10.1038/s41598-018-34267-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 05/17/2018] [Accepted: 10/12/2018] [Indexed: 12/21/2022] Open
Abstract
Signatures of spatially varying selection have been investigated both at the genomic and transcriptomic level in several organisms. In Drosophila melanogaster, the majority of these studies have analyzed North American and Australian populations, leading to the identification of several loci and traits under selection. However, several studies based mainly in North American populations showed evidence of admixture that likely contributed to the observed population differentiation patterns. Thus, disentangling demography from selection might be challenging when analyzing these populations. European populations could help identify loci under spatially varying selection provided that no recent admixture from African populations would have occurred. In this work, we individually sequence the genome of 42 European strains collected in populations from contrasting environments: Stockholm (Sweden) and Castellana Grotte (Southern Italy). We found low levels of population structure and no evidence of recent African admixture in these two populations. We thus look for patterns of spatially varying selection affecting individual genes and gene sets. Besides single nucleotide polymorphisms, we also investigated the role of transposable elements in local adaptation. We concluded that European populations are a good dataset to identify candidate loci under spatially varying selection. The analysis of the two populations sequenced in this work in the context of all the available D. melanogaster data allowed us to pinpoint genes and biological processes likely to be relevant for local adaptation. Identifying and analyzing populations with low levels of population structure and admixture should help to disentangle selective from non-selective forces underlying patterns of population differentiation in other species as well.
Affiliation(s)
- Lidia Mateo
- Institute of Evolutionary Biology. CSIC-Universitat Pompeu Fabra. Passeig Maritim de la Barceloneta, 37-49. 08003, Barcelona, Spain
| | - Gabriel E Rech
- Institute of Evolutionary Biology. CSIC-Universitat Pompeu Fabra. Passeig Maritim de la Barceloneta, 37-49. 08003, Barcelona, Spain
| | - Josefa González
- Institute of Evolutionary Biology. CSIC-Universitat Pompeu Fabra. Passeig Maritim de la Barceloneta, 37-49. 08003, Barcelona, Spain.
|
47
|
Detection and Classification of Hard and Soft Sweeps from Unphased Genotypes by Multilocus Genotype Identity. Genetics 2018; 210:1429-1452. [PMID: 30315068 DOI: 10.1534/genetics.118.301502] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 03/03/2018] [Accepted: 10/08/2018] [Indexed: 11/18/2022] Open
Abstract
Positive natural selection can lead to a decrease in genomic diversity at the selected site and at linked sites, producing a characteristic signature of elevated expected haplotype homozygosity. These selective sweeps can be hard or soft. In the case of a hard selective sweep, a single adaptive haplotype rises to high population frequency, whereas multiple adaptive haplotypes sweep through the population simultaneously in a soft sweep, producing distinct patterns of genetic variation in the vicinity of the selected site. Measures of expected haplotype homozygosity have previously been used to detect sweeps in multiple study systems. However, these methods are formulated for phased haplotype data, typically unavailable for nonmodel organisms, and some may have reduced power to detect soft sweeps due to their increased genetic diversity relative to hard sweeps. To address these limitations, we applied the H12 and H2/H1 statistics proposed in 2015 by Garud et al., which have power to detect both hard and soft sweeps, to unphased multilocus genotypes, denoting them as G12 and G2/G1. G12 (and the more direct expected homozygosity analog to H12, denoted G123) has comparable power to H12 for detecting both hard and soft sweeps. G2/G1 can be used to classify hard and soft sweeps analogously to H2/H1, conditional on a genomic region having high G12 or G123 values. The reason for this power is that, under random mating, the most frequent haplotypes will yield the most frequent multilocus genotypes. Simulations based on parameters compatible with our recent understanding of human demographic history suggest that expected homozygosity methods are best suited for detecting recent sweeps, and increase in power under recent population expansions. Finally, we find candidates for selective sweeps within the 1000 Genomes CEU, YRI, GIH, and CHB populations, which corroborate and complement existing studies.
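The move from phased haplotypes to unphased multilocus genotypes (MLGs) can be sketched as follows. Each individual's two haplotypes in a window are collapsed into one string of per-site allele counts (0, 1, or 2), and the identity statistics are then computed on MLG frequencies in the same way H12 and H2/H1 are computed on haplotype frequencies. The 0/1 haplotype encoding and the function names are assumptions made for this illustration, and the formulas follow the abstract's description of G12, G123, and G2/G1 as direct analogs of the haplotype statistics.

```python
from collections import Counter


def to_mlg(hap_a, hap_b):
    """Collapse two 0/1 haplotype strings into one unphased genotype string,
    where each character is the count (0, 1, or 2) of the '1' allele."""
    return "".join(str(int(a) + int(b)) for a, b in zip(hap_a, hap_b))


def genotype_identity(mlgs):
    """Return (G12, G123, G2/G1) for the multilocus genotypes in one window."""
    n = len(mlgs)
    if n == 0:
        raise ValueError("need at least one multilocus genotype")
    # MLG frequencies, most frequent first, padded so q[0..2] always exist.
    q = sorted((c / n for c in Counter(mlgs).values()), reverse=True)
    q += [0.0] * (3 - len(q))
    g1 = sum(x * x for x in q)
    g12 = (q[0] + q[1]) ** 2 + sum(x * x for x in q[2:])
    g123 = (q[0] + q[1] + q[2]) ** 2 + sum(x * x for x in q[3:])
    g2 = g1 - q[0] ** 2
    return g12, g123, g2 / g1


# Hypothetical window of 10 diploid individuals, already collapsed to MLGs.
print(to_mlg("0101", "0011"))
window = ["0112"] * 5 + ["0000"] * 3 + ["2222"] + ["1111"]
g12, g123, ratio = genotype_identity(window)
print(round(g12, 2), round(g123, 2), round(ratio, 3))
```

Working on MLG strings avoids statistical phasing altogether, which is the practical appeal for nonmodel organisms. The cost is that heterozygous sites merge distinct haplotype pairs into a single class.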
|
48
|
Consistent signatures of selection from genomic analysis of pairs of temporal and spatial Plasmodium falciparum populations from The Gambia. Sci Rep 2018; 8:9687. [PMID: 29946063 PMCID: PMC6018809 DOI: 10.1038/s41598-018-28017-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/26/2018] [Accepted: 06/14/2018] [Indexed: 11/16/2022] Open
Abstract
Genome sequences of 247 Plasmodium falciparum isolates collected in The Gambia in 2008 and 2014 were analysed to identify changes possibly related to the scale-up of antimalarial interventions that occurred during this period. Overall, there were 15 regions across the genomes with signatures of positive selection. Five of these were sweeps around known drug resistance and antigenic loci. Signatures at antigenic loci such as thrombospondin-related adhesive protein (Pftrap) were most frequent in eastern Gambia, where parasite prevalence and transmission remain high. There was a strong temporal differentiation at a non-synonymous SNP in a cysteine desulfurase (Pfnfs) involved in iron-sulphur complex biogenesis. During the 7-year period, the frequency of the lysine variant at codon 65 (Pfnfs-Q65K) increased by 22% (10% to 32%) in the Greater Banjul area. Between 2014 and 2015, the frequency of this variant increased by 6% (20% to 26%) in eastern Gambia. IC50 for lumefantrine was significantly higher in Pfnfs-65K isolates. This is probably the first evidence of directional selection on Pfnfs or linked loci by lumefantrine. Given the declining malaria transmission, the consequent loss of population immunity, and sustained drug pressure, it is important to monitor Gambian P. falciparum populations for further signs of adaptation.
49
Satta Y, Fujito NT, Takahata N. Nonequilibrium Neutral Theory for Hitchhikers. Mol Biol Evol 2018; 35:1362-1365. [PMID: 29722819 DOI: 10.1093/molbev/msy093] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
A selective sweep is a reduction in variation at presumably neutrally evolving sites (hitchhikers) in the genome, caused by the spread of a selected allele at a linked focal site, and is widely used to test for the action of positive selection. Nonetheless, a selective sweep may also provide an unprecedented opportunity for studying nonequilibrium properties of the neutral variation itself. We have demonstrated this possibility in relation to ancient selective sweeps for modern human-specific changes and ongoing selective sweeps for local population-specific changes.
Affiliation(s)
- Yoko Satta
- Department of Evolutionary Studies of Biosystems, School of Advanced Sciences, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Kanagawa, Japan
- Naoko T Fujito
- Department of Evolutionary Studies of Biosystems, School of Advanced Sciences, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Kanagawa, Japan
- Naoyuki Takahata
- Department of Evolutionary Studies of Biosystems, School of Advanced Sciences, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Kanagawa, Japan
50
Detecting Recent Positive Selection with a Single Locus Test Bipartitioning the Coalescent Tree. Genetics 2017; 208:791-805. [PMID: 29217523 DOI: 10.1534/genetics.117.300401] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/14/2017] [Accepted: 12/01/2017] [Indexed: 01/09/2023]
Abstract
Many population genomic studies have been conducted in the past to search for traces of recent events of positive selection. These traces, however, can be obscured by temporal variation of population size or other demographic factors. To reduce the confounding impact of demography, the coalescent tree topology has been used as an additional source of information for detecting recent positive selection in a population or a species. Based on the branching pattern at the root, we partition the hypothetical coalescent tree, inferred from a sequence sample, into two subtrees. The reasoning is that positive selection could impose a strong impact on branch length in one of the two subtrees while demography has the same effect on average on both subtrees. Thus, positive selection should be detectable by comparing statistics calculated for the two subtrees. Simulations demonstrate that the proposed test based on these principles has high power to detect recent positive selection even when DNA polymorphism data from only one locus is available, and that it is robust to the confounding effect of demography. One feature is that all components in the summary statistics ([Formula: see text]) can be computed analytically. Moreover, misinference of derived and ancestral alleles is seen to have only a limited effect on the test, and it therefore avoids a notorious problem when searching for traces of recent positive selection.
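The partitioning step described in this abstract can be sketched as follows. The summary statistics of the actual test are elided in this listing ("[Formula: see text]"), so the sketch uses leaf count and total branch length per subtree purely as placeholder summaries. The tree encoding, the function names, and the decision to exclude the two root branches from each subtree's length are all assumptions made for illustration.

```python
# A node is either a leaf label (str) or a list of (child, branch_length)
# pairs. This encoding is a hypothetical choice for the sketch.


def subtree_summary(node):
    """Return (number of leaves, total branch length) below `node`."""
    if isinstance(node, str):
        return 1, 0.0
    leaves, length = 0, 0.0
    for child, branch in node:
        n, sub_length = subtree_summary(child)
        leaves += n
        length += sub_length + branch
    return leaves, length


def bipartition_at_root(root):
    """Split a bifurcating tree at its root and summarise both subtrees.

    The branches joining the root to each subtree are left out, so each
    summary describes the subtree on its own.
    """
    if isinstance(root, str) or len(root) != 2:
        raise ValueError("root must have exactly two children")
    return [subtree_summary(child) for child, _branch in root]


# Toy tree: a shallow three-leaf subtree and a deeper two-leaf subtree.
left = [([("a", 0.1), ("b", 0.1)], 0.05), ("c", 0.15)]
right = [("d", 0.5), ("e", 0.5)]
tree = [(left, 0.85), (right, 0.5)]

(left_leaves, left_length), (right_leaves, right_length) = bipartition_at_root(tree)
```

A strong asymmetry between the two summaries, such as one subtree carrying many leaves on very short branches, is the kind of contrast the abstract describes comparing. Turning that contrast into a test requires the paper's own statistics and null distribution, which are not reproduced here.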