51
Enard D, Petrov DA. Ancient RNA virus epidemics through the lens of recent adaptation in human genomes. Philos Trans R Soc Lond B Biol Sci 2020; 375:20190575. [PMID: 33012231 PMCID: PMC7702803 DOI: 10.1098/rstb.2019.0575] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/12/2022]
Abstract
Over the course of the last several million years of evolution, humans probably have been plagued by hundreds or perhaps thousands of epidemics. Little is known about such ancient epidemics and a deep evolutionary perspective on current pathogenic threats is lacking. The study of past epidemics has typically been limited in temporal scope to recorded history, and in physical scope to pathogens that left sufficient DNA behind, such as Yersinia pestis during the Great Plague. Host genomes, however, offer an indirect way to detect ancient epidemics beyond the current temporal and physical limits. Arms races with pathogens have shaped the genomes of the hosts by driving a large number of adaptations at many genes, and these signals can be used to detect and further characterize ancient epidemics. Here, we detect the genomic footprints left by ancient viral epidemics that took place in the past approximately 50 000 years in the 26 human populations represented in the 1000 Genomes Project. By using the enrichment in signals of adaptation at approximately 4500 host loci that interact with specific types of viruses, we provide evidence that RNA viruses have driven a particularly large number of adaptive events across diverse human populations. These results suggest that different types of viruses may have exerted different selective pressures during human evolution. Knowledge of these past selective pressures will provide a deeper evolutionary perspective on current pathogenic threats. This article is part of the theme issue ‘Insights into health and disease from ancient biomolecules’.
Affiliation(s)
- David Enard
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Dmitri A Petrov
  - Department of Biology, Stanford University, Stanford, CA, USA

52
Schrider DR. Background Selection Does Not Mimic the Patterns of Genetic Diversity Produced by Selective Sweeps. Genetics 2020; 216:499-519. [PMID: 32847814 PMCID: PMC7536861 DOI: 10.1534/genetics.120.303469] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/13/2019] [Accepted: 08/04/2020] [Indexed: 12/28/2022]
Abstract
It is increasingly evident that natural selection plays a prominent role in shaping patterns of diversity across the genome. The most commonly studied modes of natural selection are positive selection and negative selection, which refer to directional selection for and against derived mutations, respectively. Positive selection can result in hitchhiking events, in which a beneficial allele rapidly replaces all others in the population, creating a valley of diversity around the selected site along with characteristic skews in allele frequencies and linkage disequilibrium among linked neutral polymorphisms. Similarly, negative selection reduces variation not only at selected sites but also at linked sites, a phenomenon called background selection (BGS). Thus, discriminating between these two forces may be difficult, and one might expect efforts to detect hitchhiking to produce an excess of false positives in regions affected by BGS. Here, we examine the similarity between BGS and hitchhiking models via simulation. First, we show that BGS may somewhat resemble hitchhiking in simplistic scenarios in which a region constrained by negative selection is flanked by large stretches of unconstrained sites, echoing previous results. However, this scenario does not mirror the actual spatial arrangement of selected sites across the genome. By performing forward simulations under more realistic scenarios of BGS, modeling the locations of protein-coding and conserved noncoding DNA in real genomes, we show that the spatial patterns of variation produced by BGS rarely mimic those of hitchhiking events. Indeed, BGS is not substantially more likely than neutrality to produce false signatures of hitchhiking. This holds for simulations modeled after both humans and Drosophila, and for several different demographic histories. These results demonstrate that appropriately designed scans for hitchhiking need not consider BGS's impact on false-positive rates. 
However, we do find evidence that BGS increases the false-negative rate for hitchhiking, an observation that demands further investigation.
Affiliation(s)
- Daniel R Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27514

53
Horscroft C, Ennis S, Pengelly RJ, Sluckin TJ, Collins A. Sequencing era methods for identifying signatures of selection in the genome. Brief Bioinform 2020; 20:1997-2008. [PMID: 30053138 DOI: 10.1093/bib/bby064] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/26/2018] [Revised: 05/16/2018] [Indexed: 12/12/2022]
Abstract
Insights into genetic loci which are under selection and their functional roles contribute to increased understanding of the patterns of phenotypic variation we observe today. The availability of whole-genome sequence data, for humans and other species, provides opportunities to investigate adaptation and evolution at unprecedented resolution. Many analytical methods have been developed to interrogate these large data sets and characterize signatures of selection in the genome. We review here recently developed methods and consider the impact of increased computing power and data availability on the detection of selection signatures. Consideration of demography, recombination and other confounding factors is important, and use of a range of methods in combination is a powerful route to resolving different forms of selection in genome sequence data. Overall, a substantial improvement in methods for application to whole-genome sequencing is evident, although further work is required to develop robust and computationally efficient approaches which may increase reproducibility across studies.
Affiliation(s)
- Clare Horscroft
  - Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
- Sarah Ennis
  - Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
- Reuben J Pengelly
  - Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
- Timothy J Sluckin
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK
  - Mathematical Sciences, University of Southampton, Highfield, Southampton, UK
- Andrew Collins
  - Genetic Epidemiology and Bioinformatics, Faculty of Medicine, University of Southampton, Duthie Building (808), Tremona Road, Southampton, UK
  - Institute for Life Sciences, University of Southampton, Life Sciences Building (85), Highfield, Southampton, UK

54
Lewis JJ, Van Belleghem SM, Papa R, Danko CG, Reed RD. Many functionally connected loci foster adaptive diversification along a neotropical hybrid zone. Sci Adv 2020; 6(39):eabb8617. [PMID: 32978147 PMCID: PMC7518860 DOI: 10.1126/sciadv.abb8617] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/23/2020] [Accepted: 08/11/2020] [Indexed: 05/02/2023]
Abstract
Characterizing the genetic complexity of adaptation and trait evolution is a major emphasis of evolutionary biology and genetics. Incongruent findings from genetic studies have resulted in conceptual models ranging from a few large-effect loci to massively polygenic architectures. Here, we combine chromatin immunoprecipitation sequencing, Hi-C, RNA sequencing, and 40 whole-genome sequences from Heliconius butterflies to show that red color pattern diversification occurred via many genomic loci. We find that the red wing pattern master regulatory transcription factor Optix binds dozens of loci also under selection, which frequently form three-dimensional adaptive hubs with selection acting on multiple physically interacting genes. Many Optix-bound genes under selection are tied to pigmentation and wing development, and these loci collectively maintain separation between adaptive red color pattern phenotypes in natural populations. We propose a model of trait evolution where functional connections between loci may resolve much of the disparity between large-effect and polygenic evolutionary models.
Affiliation(s)
- James J Lewis
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA
  - Baker Institute for Animal Health, Cornell University, Ithaca, NY, USA
- Riccardo Papa
  - Department of Biology, University of Puerto Rico-Rio Piedras, San Juan, Puerto Rico
  - Molecular Sciences and Research Center, University of Puerto Rico, San Juan, Puerto Rico
- Charles G Danko
  - Baker Institute for Animal Health, Cornell University, Ithaca, NY, USA
- Robert D Reed
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, USA

55
Mughal MR, Koch H, Huang J, Chiaromonte F, DeGiorgio M. Learning the properties of adaptive regions with functional data analysis. PLoS Genet 2020; 16:e1008896. [PMID: 32853200 PMCID: PMC7480868 DOI: 10.1371/journal.pgen.1008896] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/13/2019] [Revised: 09/09/2020] [Accepted: 05/29/2020] [Indexed: 12/12/2022]
Abstract
Identifying regions of positive selection in genomic data remains a challenge in population genetics. Most current approaches rely on comparing values of summary statistics calculated in windows. We present an approach termed SURFDAWave, which translates measures of genetic diversity calculated in genomic windows to functional data. By transforming our discrete data points to be outputs of continuous functions defined over genomic space, we are able to learn the features of these functions that signify selection. This enables us to confidently identify complex modes of natural selection, including adaptive introgression. We are also able to predict important selection parameters that are responsible for shaping the inferred selection events. By applying our model to human population-genomic data, we recapitulate previously identified regions of selective sweeps, such as OCA2 in Europeans, and predict that its beneficial mutation reached a frequency of 0.02 before it swept 1,802 generations ago, a time when humans were relatively new to Europe. In addition, we identify BNC2 in Europeans as a target of adaptive introgression, and predict that it harbors a beneficial mutation that arose in an archaic human population that split from modern humans within the hypothesized modern human-Neanderthal divergence range.
Affiliation(s)
- Mehreen R. Mughal
  - Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Hillary Koch
  - Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Jinguo Huang
  - Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Francesca Chiaromonte
  - Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America

56
Rees JS, Castellano S, Andrés AM. The Genomics of Human Local Adaptation. Trends Genet 2020; 36:415-428. [DOI: 10.1016/j.tig.2020.03.006] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 01/27/2020] [Revised: 03/16/2020] [Accepted: 03/18/2020] [Indexed: 01/23/2023]

57
Mathieson I. Human adaptation over the past 40,000 years. Curr Opin Genet Dev 2020; 62:97-104. [PMID: 32745952 PMCID: PMC7484260 DOI: 10.1016/j.gde.2020.06.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/16/2020] [Revised: 05/10/2020] [Accepted: 06/01/2020] [Indexed: 02/07/2023]
Abstract
Over the past few years several methodological and data-driven advances have greatly improved our ability to robustly detect genomic signatures of selection in humans. New methods applied to large samples of present-day genomes provide increased power, while ancient DNA allows precise estimation of timing and tempo. However, despite these advances, we are still limited in our ability to translate these signatures into understanding about which traits were actually under selection, and why. Combining information from different populations and timescales may allow interpretation of selective sweeps. Other modes of selection have proved more difficult to detect. In particular, despite strong evidence of the polygenicity of most human traits, evidence for polygenic selection is weak, and its importance in recent human evolution remains unclear. Balancing selection and archaic introgression seem important for the maintenance of potentially adaptive immune diversity, but perhaps less so for other traits.
Affiliation(s)
- Iain Mathieson
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, United States

58
Sazzini M, Abondio P, Sarno S, Gnecchi-Ruscone GA, Ragno M, Giuliani C, De Fanti S, Ojeda-Granados C, Boattini A, Marquis J, Valsesia A, Carayol J, Raymond F, Pirazzini C, Marasco E, Ferrarini A, Xumerle L, Collino S, Mari D, Arosio B, Monti D, Passarino G, D'Aquila P, Pettener D, Luiselli D, Castellani G, Delledonne M, Descombes P, Franceschi C, Garagnani P. Genomic history of the Italian population recapitulates key evolutionary dynamics of both Continental and Southern Europeans. BMC Biol 2020; 18:51. [PMID: 32438927 PMCID: PMC7243322 DOI: 10.1186/s12915-020-00778-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/01/2019] [Accepted: 04/01/2020] [Indexed: 12/30/2022]
Abstract
BACKGROUND: The cline of human genetic diversity observable across Europe is recapitulated at a micro-geographic scale by variation within the Italian population. Besides resulting from extensive gene flow, this might be ascribable also to local adaptations to diverse ecological contexts evolved by people who anciently spread along the Italian Peninsula. Dissecting the evolutionary history of the ancestors of present-day Italians may thus improve the understanding of demographic and biological processes that contributed to shape the gene pool of European populations. However, previous SNP array-based studies failed to investigate the full spectrum of Italian variation, generally neglecting low-frequency genetic variants and examining a limited set of small effect size alleles, which may represent important determinants of population structure and complex adaptive traits. To overcome these issues, we analyzed 38 high-coverage whole-genome sequences representative of population clusters at the opposite ends of the cline of Italian variation, along with a large panel of modern and ancient Euro-Mediterranean genomes.
RESULTS: We provided evidence for the early divergence of Italian groups dating back to the Late Glacial and for Neolithic and distinct Bronze Age migrations having further differentiated their gene pools. We inferred adaptive evolution at insulin-related loci in people from Italian regions with a temperate climate, while possible adaptations to pathogens and ultraviolet radiation were observed in Mediterranean Italians. Some of these adaptive events may also have secondarily modulated population disease or longevity predisposition.
CONCLUSIONS: We disentangled the contribution of multiple migratory and adaptive events in shaping the heterogeneous Italian genomic background, which exemplify population dynamics and gene-environment interactions that played significant roles also in the formation of the Continental and Southern European genomic landscapes.
Affiliation(s)
- Marco Sazzini
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
  - Interdepartmental Centre Alma Mater Research Institute on Global Challenges and Climate Change, University of Bologna, Bologna, Italy
- Paolo Abondio
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Stefania Sarno
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Matteo Ragno
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Cristina Giuliani
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Sara De Fanti
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Claudia Ojeda-Granados
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
  - Department of Molecular Biology in Medicine, Civil Hospital of Guadalajara "Fray Antonio Alcalde" and Health Sciences Center, University of Guadalajara, Guadalajara, Jalisco, Mexico
- Alessio Boattini
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Julien Marquis
  - Nestlé Research, EPFL Innovation Park, Lausanne, Switzerland
  - Current Address: Lausanne Genomic Technologies Facility, University of Lausanne, Lausanne, Switzerland
- Armand Valsesia
  - Nestlé Research, EPFL Innovation Park, Lausanne, Switzerland
- Jerome Carayol
  - Nestlé Research, EPFL Innovation Park, Lausanne, Switzerland
- Chiara Pirazzini
  - IRCCS Bologna Institute of Neurological Sciences, Bologna, Italy
- Elena Marasco
  - Department of Experimental, Diagnostic, and Specialty Medicine, University of Bologna, Bologna, Italy
  - Applied Biomedical Research Center (CRBA), S. Orsola-Malpighi Polyclinic, Bologna, Italy
- Alberto Ferrarini
  - Functional Genomics Laboratory, Department of Biotechnology, University of Verona, Verona, Italy
  - Current Address: Menarini Silicon Biosystems SpA, Castel Maggiore, Bologna, Italy
- Luciano Xumerle
  - Functional Genomics Laboratory, Department of Biotechnology, University of Verona, Verona, Italy
- Daniela Mari
  - Geriatric Unit, Fondazione Ca' Granda, IRCCS Ospedale Maggiore Policlinico, Milan, Italy
- Beatrice Arosio
  - Geriatric Unit, Fondazione Ca' Granda, IRCCS Ospedale Maggiore Policlinico, Milan, Italy
- Daniela Monti
  - Department of Experimental and Clinical Biomedical Sciences "Mario Serio", University of Florence, Florence, Italy
- Giuseppe Passarino
  - Department of Biology, Ecology and Earth Sciences, University of Calabria, Rende, Italy
- Patrizia D'Aquila
  - Department of Biology, Ecology and Earth Sciences, University of Calabria, Rende, Italy
- Davide Pettener
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Donata Luiselli
  - Department of Cultural Heritage, University of Bologna, Ravenna, Italy
- Gastone Castellani
  - Interdepartmental Centre Alma Mater Research Institute on Global Challenges and Climate Change, University of Bologna, Bologna, Italy
  - Department of Experimental, Diagnostic, and Specialty Medicine, University of Bologna, Bologna, Italy
- Massimo Delledonne
  - Functional Genomics Laboratory, Department of Biotechnology, University of Verona, Verona, Italy
- Claudio Franceschi
  - Department of Applied Mathematics, Institute of Information Technology, Lobachevsky University of Nizhny Novgorod, Nizhny Novgorod, Russia
- Paolo Garagnani
  - Interdepartmental Centre Alma Mater Research Institute on Global Challenges and Climate Change, University of Bologna, Bologna, Italy
  - Department of Experimental, Diagnostic, and Specialty Medicine, University of Bologna, Bologna, Italy
  - Clinical Chemistry, Department of Laboratory Medicine, Karolinska Institutet at Huddinge University Hospital, Stockholm, Sweden

59
Harris AM, DeGiorgio M. Identifying and Classifying Shared Selective Sweeps from Multilocus Data. Genetics 2020; 215:143-171. [PMID: 32152048 PMCID: PMC7198270 DOI: 10.1534/genetics.120.303137] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/22/2019] [Accepted: 02/29/2020] [Indexed: 11/18/2022]
Abstract
Positive selection causes beneficial alleles to rise to high frequency, resulting in a selective sweep of the diversity surrounding the selected sites. Accordingly, the signature of a selective sweep in an ancestral population may still remain in its descendants. Identifying signatures of selection in the ancestor that are shared among its descendants is important to contextualize the timing of a sweep, but few methods exist for this purpose. We introduce the statistic SS-H12, which can identify genomic regions under shared positive selection across populations and is based on the theory of the expected haplotype homozygosity statistic H12, which detects recent hard and soft sweeps from the presence of high-frequency haplotypes. SS-H12 is distinct from comparable statistics because it requires a minimum of only two populations, and properly identifies and differentiates between independent convergent sweeps and true ancestral sweeps, with high power and robustness to a variety of demographic models. Furthermore, we can apply SS-H12 in conjunction with the ratio of statistics we term [Formula: see text] and [Formula: see text] to further classify identified shared sweeps as hard or soft. Finally, we identified both previously reported and novel shared sweep candidates from human whole-genome sequences. Previously reported candidates include the well-characterized ancestral sweeps at LCT and SLC24A5 in Indo-Europeans, as well as GPHN worldwide. Novel candidates include an ancestral sweep at RGS18 in sub-Saharan Africans involved in regulating the platelet response and implicated in sudden cardiac death, and a convergent sweep at C2CD5 between European and East Asian populations that may explain their different insulin responses.
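As an illustration of the haplotype homozygosity idea this abstract builds on, the following is a minimal sketch of the single-population H12 statistic, assuming its usual definition: with haplotype frequencies sorted so that p1 >= p2 >= ..., H1 is the sum of squared frequencies and H12 = (p1 + p2)^2 plus the sum of the remaining squared frequencies. It does not implement SS-H12 itself. The function names and the toy window of haplotype strings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Return haplotype frequencies in a window, sorted most to least common."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)


def h1(freqs):
    """Expected haplotype homozygosity: sum of squared haplotype frequencies."""
    return sum(p * p for p in freqs)


def h12(freqs):
    """H12: pool the two most frequent haplotypes before squaring.

    Assumes freqs is sorted in descending order, as returned by
    haplotype_frequencies().
    """
    if len(freqs) < 2:
        return h1(freqs)
    pooled = freqs[0] + freqs[1]
    return pooled * pooled + sum(p * p for p in freqs[2:])


# Toy window (hypothetical data): two common haplotypes and one rare one.
window = ["AAT"] * 5 + ["GAT"] * 4 + ["GCT"] * 1
freqs = haplotype_frequencies(window)
h1_value = h1(freqs)
h12_value = h12(freqs)
```

In this toy window the two common haplotypes together dominate, so H12 is much larger than H1; pooling the top two haplotypes is what lets the statistic stay sensitive to soft sweeps, where more than one haplotype carries the beneficial allele.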
Affiliation(s)
- Alexandre M Harris
  - Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802
  - Molecular, Cellular, and Integrative Biosciences at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida 33431

60
Abstract
Nervous systems allow animals to acutely respond and behaviorally adapt to changes and recurring patterns in their environment at multiple timescales, from milliseconds to years. Behavior is further shaped at intergenerational timescales by genetic variation, drift, and selection. This sophistication and flexibility of behavior makes it challenging to measure behavior consistently in individual subjects and to compare it across individuals. In spite of these challenges, careful behavioral observations in nature and controlled measurements in the laboratory, combined with modern technologies and powerful genetic approaches, have led to important discoveries about the way genetic variation shapes behavior. A critical mass of genes whose variation is known to modulate behavior in nature is finally accumulating, allowing us to recognize emerging patterns. In this review, we first discuss genetic mapping approaches useful for studying behavior. We then survey how variation acts at different levels (in environmental sensation, in internal neuronal circuits, and outside the nervous system altogether) and then discuss the sources and types of molecular variation linked to behavior and the mechanisms that shape such variation. We end by discussing remaining questions in the field.
Affiliation(s)
- Natalie Niepoth
  - Zuckerman Mind Brain Behavior Institute and Department of Ecology, Evolution, and Environmental Biology, Columbia University, New York, NY 10027, USA
- Andres Bendesky
  - Zuckerman Mind Brain Behavior Institute and Department of Ecology, Evolution, and Environmental Biology, Columbia University, New York, NY 10027, USA

61
Derbyshire MC. Bioinformatic Detection of Positive Selection Pressure in Plant Pathogens: The Neutral Theory of Molecular Sequence Evolution in Action. Front Microbiol 2020; 11:644. [PMID: 32328056 PMCID: PMC7160247 DOI: 10.3389/fmicb.2020.00644] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/08/2019] [Accepted: 03/20/2020] [Indexed: 11/13/2022]
Abstract
The genomes of plant pathogenic fungi and oomycetes are often exposed to strong positive selection pressure. During speciation, shifts in host range and preference can lead to major adaptive changes. Furthermore, evolution of total host resistance to most isolates can force rapid evolutionary changes in host-specific pathogens. Crop pathogens are subjected to particularly intense selective pressures from monocultures and fungicides. Detection of the footprints of positive selection in plant pathogen genomes is a worthwhile endeavor as it aids understanding of the fundamental biology of these important organisms. There are two main classes of test for detection of positively selected alleles. Tests based on the ratio of non-synonymous to synonymous substitutions per site detect the footprints of multiple fixation events between divergent lineages. Thus, they are well-suited to the study of ancient adaptation events spanning speciations. On the other hand, tests that scan genomes for local fluctuations in allelic diversity within populations are suitable for detection of recent positive selection in populations. In this review, I briefly describe some of the more widely used tests of positive selection and the theory underlying them. I then discuss various examples of their application to plant pathogen genomes, emphasizing the types of genes that are associated with signatures of positive selection. I conclude with a discussion of the practicality of such tests for identification of pathogen genes of interest and the important features of pathogen ecology that must be taken into account for accurate interpretation.
Affiliation(s)
- Mark C Derbyshire
  - Centre for Crop and Disease Management, School of Molecular and Life Sciences, Curtin University, Perth, WA, Australia

62
Hartfield M, Bataillon T. Selective Sweeps Under Dominance and Inbreeding. G3 (Bethesda) 2020; 10:1063-1075. [PMID: 31974096 PMCID: PMC7056974 DOI: 10.1534/g3.119.400919] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/18/2019] [Accepted: 01/18/2020] [Indexed: 12/26/2022]
Abstract
A major research goal in evolutionary genetics is to uncover loci experiencing positive selection. One approach involves finding 'selective sweeps' patterns, which can either be 'hard sweeps' formed by de novo mutation, or 'soft sweeps' arising from recurrent mutation or existing standing variation. Existing theory generally assumes outcrossing populations, and it is unclear how dominance affects soft sweeps. We consider how arbitrary dominance and inbreeding via self-fertilization affect hard and soft sweep signatures. With increased self-fertilization, they are maintained over longer map distances due to reduced effective recombination and faster beneficial allele fixation times. Dominance can affect sweep patterns in outcrossers if the derived variant originates from either a single novel allele, or from recurrent mutation. These models highlight the challenges in distinguishing hard and soft sweeps, and propose methods to differentiate between scenarios.
Affiliation(s)
- Matthew Hartfield
  - Department of Ecology and Evolutionary Biology, University of Toronto, Ontario M5S 3B2, Canada
  - Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
  - Institute of Evolutionary Biology, The University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Thomas Bataillon
  - Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark

63
Scossa F, Fernie AR. The evolution of metabolism: How to test evolutionary hypotheses at the genomic level. Comput Struct Biotechnol J 2020; 18:482-500. [PMID: 32180906 PMCID: PMC7063335 DOI: 10.1016/j.csbj.2020.02.009] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/01/2019] [Revised: 02/12/2020] [Accepted: 02/13/2020] [Indexed: 01/21/2023]
Abstract
The origin of primordial metabolism and its expansion to form the metabolic networks extant today represent excellent systems to study the impact of natural selection and the potential adaptive role of novel compounds. Here we present the current hypotheses made on the origin of life and ancestral metabolism and present the theories and mechanisms by which the large chemical diversity of plants might have emerged along evolution. In particular, we provide a survey of statistical methods that can be used to detect signatures of selection at the gene and population level, and discuss potential and limits of these methods for investigating patterns of molecular adaptation in plant metabolism.
Affiliation(s)
- Federico Scossa
  - Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476 Potsdam-Golm, Germany
  - Council for Agricultural Research and Economics (CREA), Research Centre for Genomics and Bioinformatics (CREA-GB), Via Ardeatina 546, 00178 Rome, Italy
- Alisdair R. Fernie
  - Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476 Potsdam-Golm, Germany
  - Center of Plant Systems Biology and Biotechnology (CPSBB), Plovdiv, Bulgaria
64
Woerner AE, Veeramah KR, Watkins JC, Hammer MF. The Role of Phylogenetically Conserved Elements in Shaping Patterns of Human Genomic Diversity. Mol Biol Evol 2020; 35:2284-2295. [PMID: 30113695 DOI: 10.1093/molbev/msy145] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/19/2023]
Abstract
Evolutionary genetic studies have shown a positive correlation between levels of nucleotide diversity and either rates of recombination or genetic distance to genes. Both positive-directional and purifying selection have been offered as the source of these correlations via genetic hitchhiking and background selection, respectively. Phylogenetically conserved elements (CEs) are short (∼100 bp), widely distributed (comprising ∼5% of genome), sequences that are often found far from genes. While the function of many CEs is unknown, CEs also are associated with reduced diversity at linked sites. Using high coverage (>80×) whole genome data from two human populations, the Yoruba and the CEU, we perform fine scale evaluations of diversity, rates of recombination, and linkage to genes. We find that the local rate of recombination has a stronger effect on levels of diversity than linkage to genes, and that these effects of recombination persist even in regions far from genes. Our whole genome modeling demonstrates that, rather than recombination or GC-biased gene conversion, selection on sites within or linked to CEs better explains the observed genomic diversity patterns. A major implication is that very few sites in the human genome are predicted to be free of the effects of selection. These sites, which we refer to as the human "neutralome," comprise only 1.2% of the autosomes and 5.1% of the X chromosome. Demographic analysis of the neutralome reveals larger population sizes and lower rates of growth for ancestral human populations than inferred by previous analyses.
Affiliation(s)
- August E Woerner
  - ARL Division of Biotechnology, University of Arizona, Tucson, AZ
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX
- Krishna R Veeramah
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY
- Michael F Hammer
  - ARL Division of Biotechnology, University of Arizona, Tucson, AZ
65
Abstract
Threespine stickleback populations provide a striking example of local adaptation to divergent habitats in populations that are connected by recurrent gene flow. These small fish occur in marine and freshwater habitats throughout the Northern Hemisphere, and in numerous cases the smaller freshwater populations have been established “de novo” from marine colonists. Independently evolved freshwater populations exhibit similar phenotypes that have been shown to derive largely from the same standing genetic variants. Geographic isolation prevents direct migration between the freshwater populations, strongly suggesting that these shared locally adaptive alleles are transported through the marine population. However, it is still largely unknown how gene flow, recombination, and selection jointly impact the standing variation that might fuel this adaptation. Here we use individual-based, spatially explicit simulations to determine the levels of gene flow that best match observed patterns of allele sharing among habitats in stickleback. We aim to better understand how gene flow and local adaptation in large metapopulations determine the speed of adaptation and re-use of standing genetic variation. In our simulations we find that repeated adaptation uses a shared set of alleles that are maintained at low frequency by migration-selection balance in oceanic populations. This process occurs over a realistic range of intermediate levels of gene flow that match previous empirical population genomic studies in stickleback. Examining these simulations more deeply reveals how lower levels of gene flow lead to slow, independent adaptation to different habitats, whereas higher levels of gene flow lead to significant mutation load but an increased probability of successful population genomic scans for locally adapted alleles. Surprisingly, we find that the genealogical origins of most freshwater adapted alleles can be traced back to the original generation of marine individuals that colonized the lakes, as opposed to subsequent migrants. These simulations provide deeper context for existing studies of stickleback evolutionary genomics, and guidance for future empirical studies in this model. More broadly, our results support existing theory of local adaptation but extend it by more completely documenting the genealogical history of adaptive alleles in a metapopulation.
66
Kinney N, Kang L, Eckstrand L, Pulenthiran A, Samuel P, Anandakrishnan R, Varghese RT, Michalak P, Garner HR. Abundance of ethnically biased microsatellites in human gene regions. PLoS One 2019; 14:e0225216. [PMID: 31830051 PMCID: PMC6907796 DOI: 10.1371/journal.pone.0225216] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/30/2019] [Accepted: 10/29/2019] [Indexed: 12/16/2022]
Abstract
Microsatellites, a type of short tandem repeat (STR), have been used for decades as putatively neutral markers to study the genetic structure of diverse human populations. However, recent studies have demonstrated that some microsatellites contribute to gene expression, cis heritability, and phenotype. As a corollary, some microsatellites may contribute to differential gene expression and RNA/protein structure stability in distinct human populations. To test this hypothesis, we investigate genotype frequencies, functional relevance, and adaptive potential of microsatellites in five super-populations (ethnicities) drawn from the 1000 Genomes Project. We discover 3,984 ethnically-biased microsatellite loci (EBML); for each EBML at least one ethnicity has genotype frequencies statistically different from the remaining four. South Asian, East Asian, European, and American EBML show significant overlap; in contrast, the set of African EBML is mostly unique. We cross-reference the 3,984 EBML with 2,060 previously identified expression STRs (eSTRs); repeats known to affect gene expression (64 total) are over-represented. The most significant pathway enrichments are those associated with the matrisome: a broad collection of genes encoding the extracellular matrix and its associated proteins. At least 14 of the EBML have established links to human disease. Analysis of the 3,984 EBML with respect to known selective sweep regions in the genome shows that allelic variation in some of them is likely associated with adaptive evolution.
Affiliation(s)
- Nick Kinney
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
  - Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America
- Lin Kang
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
  - Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America
- Laurel Eckstrand
  - Virginia-Maryland College of Veterinary Medicine, Blacksburg, VA, United States of America
- Arichanah Pulenthiran
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Peter Samuel
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Ramu Anandakrishnan
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- Robin T. Varghese
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
- P. Michalak
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
  - Virginia-Maryland College of Veterinary Medicine, Blacksburg, VA, United States of America
  - Institute of Evolution, University of Haifa, Haifa, Israel
- Harold R. Garner
  - Edward Via College of Osteopathic Medicine, Blacksburg, VA, United States of America
  - Gibbs Cancer Center & Research Institute, Spartanburg, SC, United States of America
67
Schmidt JM, de Manuel M, Marques-Bonet T, Castellano S, Andrés AM. The impact of genetic adaptation on chimpanzee subspecies differentiation. PLoS Genet 2019; 15:e1008485. [PMID: 31765391 PMCID: PMC6901233 DOI: 10.1371/journal.pgen.1008485] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/10/2019] [Revised: 12/09/2019] [Accepted: 10/17/2019] [Indexed: 12/25/2022]
Abstract
Chimpanzees, humans' closest relatives, are in danger of extinction. Aside from direct human impacts such as hunting and habitat destruction, a key threat is transmissible disease. As humans continue to encroach upon their habitats, which shrink in size and grow in density, the risk of inter-population and cross-species viral transmission increases, a point dramatically made in the reverse with the global HIV/AIDS pandemic. Inhabiting central Africa, the four subspecies of chimpanzees differ in demographic history and geographical range, and are likely differentially adapted to their particular local environments. To quantitatively explore genetic adaptation, we investigated the genic enrichment for SNPs highly differentiated between chimpanzee subspecies. Previous analyses of such patterns in human populations exhibited limited evidence of adaptation. In contrast, chimpanzees show evidence of recent positive selection, with differences among subspecies. Specifically, we observe strong evidence of recent selection in eastern chimpanzees, with highly differentiated SNPs being uniquely enriched in genic sites in a way that is expected under recent adaptation but not under neutral evolution or background selection. These sites are enriched for genes involved in immune responses to pathogens, and for genes inferred to differentiate the immune response to infection by simian immunodeficiency virus (SIV) in natural vs. non-natural host species. Conversely, central chimpanzees exhibit an enrichment of signatures of positive selection only at cytokine receptors, due to selective sweeps in CCR3, CCR9 and CXCR6, paralogs of CCR5 and CXCR4, the two major receptors utilized by HIV to enter human cells. Thus, our results suggest that positive selection has contributed to the genetic and phenotypic differentiation of chimpanzee subspecies, and that viruses likely play a predominant role in this differentiation, with SIV being a likely selective agent. Interestingly, our results suggest that SIV has elicited distinctive adaptive responses in these two chimpanzee subspecies.
MESH Headings
- Adaptation, Physiological/genetics
- Adaptation, Physiological/immunology
- Animals
- Demography
- Genetic Drift
- Genetic Speciation
- HIV/genetics
- HIV/immunology
- HIV/pathogenicity
- Humans
- Immunity, Innate/genetics
- Pan troglodytes/genetics
- Pan troglodytes/immunology
- Pan troglodytes/virology
- Polymorphism, Single Nucleotide/genetics
- Receptors, CCR/genetics
- Receptors, CCR3/genetics
- Receptors, CCR5/genetics
- Receptors, CXCR4/genetics
- Receptors, CXCR6/immunology
- Selection, Genetic/genetics
- Simian Immunodeficiency Virus/genetics
- Simian Immunodeficiency Virus/immunology
- Simian Immunodeficiency Virus/pathogenicity
Affiliation(s)
- Joshua M. Schmidt
  - UCL Genetics Institute, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
  - Max Planck Institute for Evolutionary Anthropology, Department of Evolutionary Genetics, Leipzig, Germany
- Marc de Manuel
  - Institut de Biologia Evolutiva (Consejo Superior de Investigaciones Científicas–Universitat Pompeu Fabra), Barcelona, Spain
- Tomas Marques-Bonet
  - Institut de Biologia Evolutiva (Consejo Superior de Investigaciones Científicas–Universitat Pompeu Fabra), Barcelona, Spain
  - National Centre for Genomic Analysis–Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Sergi Castellano
  - Max Planck Institute for Evolutionary Anthropology, Department of Evolutionary Genetics, Leipzig, Germany
  - Genetics and Genomic Medicine Programme, Great Ormond Street Institute of Child Health, University College London (UCL), London, United Kingdom
  - UCL Genomics, London, United Kingdom
- Aida M. Andrés
  - UCL Genetics Institute, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
  - Max Planck Institute for Evolutionary Anthropology, Department of Evolutionary Genetics, Leipzig, Germany
68
Adaptation in structured populations and fuzzy boundaries between hard and soft sweeps. PLoS Comput Biol 2019; 15:e1007426. [PMID: 31710623 PMCID: PMC6872172 DOI: 10.1371/journal.pcbi.1007426] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/23/2019] [Revised: 11/21/2019] [Accepted: 09/20/2019] [Indexed: 11/19/2022]
Abstract
Selective sweeps, the genetic footprint of positive selection, have been extensively studied in the past decades, with dozens of methods developed to identify swept regions. However, these methods suffer from both false positive and false negative reports, and the candidates identified with different methods are often inconsistent with each other. We propose that a biological cause of this problem can be population subdivision, and a technical cause can be incomplete, or inaccurate, modeling of the dynamic process associated with sweeps. Here we used simulations to show how these effects interact and potentially cause bias. In particular, we show that sweeps may be misclassified as either hard or soft when the true time stage of a sweep and that implied, or pre-supposed, by the model do not match. We call this "temporal misclassification". Similarly, "spatial misclassification (softening)" can occur when hard sweeps, which are imported by migration into a new subpopulation, are falsely identified as soft. This can easily happen in the case of local adaptation, i.e. when the sweeping allele is not under positive selection in the new subpopulation, and the underlying model assumes panmixis instead of substructure. The claim that most sweeps in the evolutionary history of humans were soft may have to be reconsidered in the light of these findings.
69
Conservation Genomics in a Changing Arctic. Trends Ecol Evol 2019; 35:149-162. [PMID: 31699414 DOI: 10.1016/j.tree.2019.09.008] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/24/2019] [Revised: 09/13/2019] [Accepted: 09/17/2019] [Indexed: 12/25/2022]
Abstract
Although logistically challenging to study, the Arctic is a bellwether for global change and is becoming a model for questions pertinent to the persistence of biodiversity. Disruption of Arctic ecosystems is accelerating, with impacts ranging from mixing of biotic communities to individual behavioral responses. Understanding these changes is crucial for conservation and sustainable economic development. Genomic approaches are providing transformative insights into biotic responses to environmental change, but have seen limited application in the Arctic due to a series of constraints. To meet the promise of genome analyses, we urge rigorous development of biorepositories from high latitudes to provide essential libraries to improve the conservation, monitoring, and management of Arctic ecosystems through genomic approaches.
70
Rougeux C, Gagnaire P, Praebel K, Seehausen O, Bernatchez L. Polygenic selection drives the evolution of convergent transcriptomic landscapes across continents within a Nearctic sister species complex. Mol Ecol 2019; 28:4388-4403. [DOI: 10.1111/mec.15226] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/09/2019] [Revised: 08/06/2019] [Accepted: 08/08/2019] [Indexed: 12/22/2022]
Affiliation(s)
- Clément Rougeux
  - Département de biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
- Kim Praebel
  - Norwegian College of Fishery Science, UiT The Arctic University of Norway, Tromsø, Norway
- Ole Seehausen
  - Aquatic Ecology and Evolution, Institute of Ecology & Evolution, University of Bern, Bern, Switzerland
- Louis Bernatchez
  - Département de biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
71
Nelson TC, Crandall JG, Ituarte CM, Catchen JM, Cresko WA. Selection, Linkage, and Population Structure Interact To Shape Genetic Variation Among Threespine Stickleback Genomes. Genetics 2019; 212:1367-1382. [PMID: 31213503 PMCID: PMC6707445 DOI: 10.1534/genetics.119.302261] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/30/2019] [Accepted: 06/11/2019] [Indexed: 11/18/2022]
Abstract
The outcome of selection on genetic variation depends on the geographic organization of individuals and populations as well as the organization of loci within the genome. Spatially variable selection between marine and freshwater habitats has had a significant and heterogeneous impact on patterns of genetic variation across the genome of threespine stickleback fish. When marine stickleback invade freshwater habitats, more than a quarter of the genome can respond to divergent selection, even in as little as 50 years. This process largely uses standing genetic variation that can be found ubiquitously at low frequency in marine populations, can be millions of years old, and is likely maintained by significant bidirectional gene flow. Here, we combine population genomic data of marine and freshwater stickleback from Cook Inlet, Alaska, with genetic maps of stickleback fish derived from those same populations to examine how linkage to loci under selection affects genetic variation across the stickleback genome. Divergent selection has had opposing effects on linked genetic variation on chromosomes from marine and freshwater stickleback populations: near loci under selection, marine chromosomes are depauperate of variation, while these same regions among freshwater genomes are the most genetically diverse. Forward genetic simulations recapitulate this pattern when different selective environments also differ in population structure. Lastly, dense genetic maps demonstrate that the interaction between selection and population structure may impact large stretches of the stickleback genome. These findings advance our understanding of how the structuring of populations across geography influences the outcomes of selection, and how the recombination landscape broadens the genomic reach of selection.
Affiliation(s)
- Thomas C Nelson
  - Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
  - Division of Biological Sciences, University of Montana, Missoula, Montana 59812
- Catherine M Ituarte
  - Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
- Julian M Catchen
  - Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
  - Department of Animal Biology, University of Illinois at Urbana-Champaign, Illinois 61801
- William A Cresko
  - Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
72
Guillen-Guio B, Lorenzo-Salazar JM, González-Montelongo R, Díaz-de Usera A, Marcelino-Rodríguez I, Corrales A, Cabrera de León A, Alonso S, Flores C. Genomic Analyses of Human European Diversity at the Southwestern Edge: Isolation, African Influence and Disease Associations in the Canary Islands. Mol Biol Evol 2019; 35:3010-3026. [PMID: 30289472 PMCID: PMC6278859 DOI: 10.1093/molbev/msy190] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/30/2022]
Abstract
Despite the genetic resemblance of Canary Islanders to other southern European populations, their geographical isolation and the historical admixture of aborigines (from North Africa) with sub-Saharan Africans and Europeans have shaped a distinctive genetic makeup that likely affects disease susceptibility and health disparities. Based on single nucleotide polymorphism array data and whole genome sequencing (30×), we inferred that the last African admixture took place ∼14 generations ago and estimated that up to 34% of the Canary Islander genome is of recent African descent. The length of regions in homozygosis and the ancestry-related mosaic organization of the Canary Islander genome support the view that isolation has been strongest on the two smallest islands. Furthermore, several genomic regions showed significant and large deviations in African or European ancestry and were significantly enriched in genes involved in prevalent diseases in this community, such as diabetes, asthma, and allergy. The most prominent of these regions were located near LCT and the HLA, two well-known targets of selection, at which 40‒50% of the Canarian genome is of recent African descent according to our estimates. Putative selective signals were also identified in these regions near the SLC6A11-SLC6A1, KCNMB2, and PCDH20-PCDH9 genes. Taken together, our findings provide solid evidence of a significant recent African admixture, population isolation, and adaptation in this part of Europe, with the favoring of African alleles in some chromosome regions. These findings may have medical implications for populations of recent African ancestry.
Affiliation(s)
- Beatriz Guillen-Guio
  - Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
- Jose M Lorenzo-Salazar
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Ana Díaz-de Usera
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Itahisa Marcelino-Rodríguez
  - Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
- Almudena Corrales
  - Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
  - CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Antonio Cabrera de León
  - Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
- Santos Alonso
  - Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country UPV/EHU, Leioa, Bizkaia, Spain
- Carlos Flores
  - Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
  - CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
73
Booker TR, Keightley PD. Understanding the Factors That Shape Patterns of Nucleotide Diversity in the House Mouse Genome. Mol Biol Evol 2019; 35:2971-2988. [PMID: 30295866 PMCID: PMC6278861 DOI: 10.1093/molbev/msy188] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/14/2022]
Abstract
A major goal of population genetics has been to determine the extent to which selection at linked sites influences patterns of neutral nucleotide diversity in the genome. Multiple lines of evidence suggest that diversity is influenced by both positive and negative selection. For example, in many species there are troughs in diversity surrounding functional genomic elements, consistent with the action of either background selection (BGS) or selective sweeps. In this study, we investigated the causes of the diversity troughs that are observed in the wild house mouse genome. Using the unfolded site frequency spectrum, we estimated the strength and frequencies of deleterious and advantageous mutations occurring in different functional elements in the genome. We then used these estimates to parameterize forward-in-time simulations of chromosomes, using realistic distributions of functional elements and recombination rate variation in order to determine whether selection at linked sites can explain the observed patterns of nucleotide diversity. The simulations suggest that BGS alone cannot explain the dips in diversity around either exons or conserved noncoding elements. A combination of BGS and selective sweeps produces deeper dips in diversity than BGS alone, but the inferred parameters of selection cannot fully explain the patterns observed in the genome. Our results provide evidence of sweeps shaping patterns of nucleotide diversity across the mouse genome and also suggest that infrequent, strongly advantageous mutations play an important role in this. The limitations of using the unfolded site frequency spectrum for inferring the frequency and effects of advantageous mutations are discussed.
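As a minimal illustration of the summary statistic this abstract relies on, the sketch below tabulates an unfolded site frequency spectrum (SFS) from per-site derived-allele counts. The function name `unfolded_sfs` and the input counts are hypothetical and chosen only for illustration; the authors' actual estimation pipeline is not described here.

```python
from collections import Counter


def unfolded_sfs(derived_counts, n):
    """Tabulate the unfolded SFS for a sample of n chromosomes.

    derived_counts holds, for each site, the number of sampled chromosomes
    carrying the derived allele. Entry i-1 of the result is the number of
    segregating sites with exactly i derived copies (i = 1 .. n-1).
    Monomorphic sites (0 or n derived copies) are excluded.
    """
    counts = Counter(c for c in derived_counts if 0 < c < n)
    return [counts.get(i, 0) for i in range(1, n)]


# Hypothetical derived-allele counts at eight sites in a sample of six chromosomes.
sfs = unfolded_sfs([1, 1, 2, 5, 1, 3, 0, 6], n=6)
print(sfs)
```

"Unfolded" means the ancestral state is assumed known at every site, so derived counts can be distinguished from ancestral ones; misinferring the ancestral state is one source of the limitations the abstract mentions.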
Affiliation(s)
- Tom R Booker
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
  - Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, Canada
- Peter D Keightley
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
74
Adams RH, Schield DR, Castoe TA. Recent Advances in the Inference of Gene Flow from Population Genomic Data. Curr Mol Biol Rep 2019. [DOI: 10.1007/s40610-019-00120-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/21/2022]
75
Exploiting selection at linked sites to infer the rate and strength of adaptation. Nat Ecol Evol 2019; 3:977-984. [PMID: 31061475 PMCID: PMC6693860 DOI: 10.1038/s41559-019-0890-6] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 09/25/2018] [Accepted: 03/28/2019] [Indexed: 12/18/2022]
Abstract
Genomic data encode past evolutionary events and have the potential to reveal the strength, rate, and biological drivers of adaptation. However, jointly estimating adaptation rate (α) and adaptation strength remains challenging because evolutionary processes such as demography, linkage, and non-neutral polymorphism can confound inference. Here, we exploit the influence of background selection to reduce the fixation rate of weakly-beneficial alleles to jointly infer the strength and rate of adaptation. We develop an MK-based method (ABC-MK) to infer adaptation rate and strength, and estimate α = 0.135 in human protein-coding sequences, 72% of which is contributed by weakly-adaptive variants. We show that in this adaptation regime α is reduced ≈ 25% by linkage genome-wide. Moreover, we show that virus-interacting proteins (VIPs) undergo adaptation that is both stronger and nearly twice as frequent as the genome average (α = 0.224, 56% due to strongly-beneficial alleles). Our results suggest that while most adaptation in human proteins is weakly-beneficial, adaptation to viruses is often strongly-beneficial. Our method provides a robust framework for estimating adaptation rate and strength across species.
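For orientation, the classical McDonald-Kreitman (MK) estimator of α, which MK-based methods build on, can be sketched as follows. This is not the ABC-MK method described in the abstract, which additionally accounts for background selection and weakly beneficial alleles; the function name `mk_alpha` and all counts below are hypothetical and serve only as an illustration.

```python
def mk_alpha(dn, ds, pn, ps):
    """Classical MK estimate of the proportion of adaptive substitutions.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence).
    pn, ps: nonsynonymous and synonymous polymorphisms.
    alpha = 1 - (ds * pn) / (dn * ps)
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for alpha to be defined")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, chosen only to show the arithmetic.
alpha = mk_alpha(dn=60, ds=100, pn=40, ps=80)
print(round(alpha, 4))
```

The simple estimator is known to be biased downward when slightly deleterious polymorphisms segregate, which is one reason model-based extensions such as the one in the abstract exist.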
76
Williams MJ, Sottoriva A, Graham TA. Measuring Clonal Evolution in Cancer with Genomics. Annu Rev Genomics Hum Genet 2019; 20:309-329. [PMID: 31059289 DOI: 10.1146/annurev-genom-083117-021712] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Indexed: 11/09/2022]
Abstract
Cancers originate from somatic cells in the human body that have accumulated genetic alterations. These mutations modify the phenotype of the cells, allowing them to escape the homeostatic regulation that maintains normal cell number. Viewed through the lens of evolutionary biology, the transformation of normal cells into malignant cells is evolution in action. Evolution continues throughout cancer growth, progression, treatment resistance, and disease relapse, driven by adaptation to changes in the cancer's environment, and intratumor heterogeneity is an inevitable consequence of this evolutionary process. Genomics provides a powerful means to characterize tumor evolution, enabling quantitative measurement of evolving clones across space and time. In this review, we discuss concepts and approaches to quantify and measure this evolutionary process in cancer using genomics.
Affiliation(s)
- Marc J Williams
  - Evolution and Cancer Laboratory, Barts Cancer Institute, Queen Mary University of London, London EC1M 6BQ, United Kingdom
- Andrea Sottoriva
  - Evolutionary Genomics and Modelling Lab, Centre for Evolution and Cancer, The Institute of Cancer Research, London SM2 5NG, United Kingdom
- Trevor A Graham
  - Evolution and Cancer Laboratory, Barts Cancer Institute, Queen Mary University of London, London EC1M 6BQ, United Kingdom
77
Bidirectional Selection for Body Weight on Standing Genetic Variation in a Chicken Model. G3 (Bethesda) 2019; 9:1165-1173. [PMID: 30737239 PMCID: PMC6469407 DOI: 10.1534/g3.119.400038] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
Experimental populations of model organisms provide valuable opportunities to unravel the genomic impact of selection in a controlled system. The Virginia body weight chicken lines represent a unique resource to investigate signatures of selection in a system where long-term, single-trait, bidirectional selection has been carried out for more than 60 generations. At 55 generations of divergent selection, earlier analyses of pooled genome resequencing data from these lines revealed that 14.2% of the genome showed extreme differentiation between the selected lines, contained within 395 genomic regions. Here, we report more detailed analyses of these data exploring the regions displaying within- and between-line genomic signatures of the bidirectional selection applied in these lines. Despite the strict selection regime for opposite extremes in body weight, this did not result in opposite genomic signatures between the lines. The lines often displayed a duality of the sweep signatures, where an extended region of homozygosity in one line contrasted with a mosaic pattern of heterozygosity in the other line. These haplotype mosaics consisted of short, distinct haploblocks of variable between-line divergence, likely the result of a complex demographic history involving bottlenecks, introgressions and moderate inbreeding. We demonstrate this using the example of complex haplotype mosaicism in the growth1 QTL. These mosaics represent the standing genetic variation available at the onset of selection in the founder population. Selection on standing genetic variation can thus result in different signatures depending on the intensity and direction of selection.
|
78
|
Abstract
In this perspective, we evaluate the explanatory power of the neutral theory of molecular evolution, 50 years after its introduction by Kimura. We argue that the neutral theory was supported by unreliable theoretical and empirical evidence from the beginning, and that in light of modern, genome-scale data, we can firmly reject its universality. The ubiquity of adaptive variation both within and between species means that a more comprehensive theory of molecular evolution must be sought.
Affiliation(s)
- Andrew D Kern
  - Department of Genetics, Rutgers University, Piscataway, NJ
- Matthew W Hahn
  - Department of Biology and Department of Computer Science, Indiana University Bloomington, IN
|
79
|
Uricchio LH, Kitano HC, Gusev A, Zaitlen NA. An evolutionary compass for detecting signals of polygenic selection and mutational bias. Evol Lett 2019; 3:69-79. [PMID: 30788143 PMCID: PMC6369964 DOI: 10.1002/evl3.97] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/11/2018] [Revised: 12/03/2018] [Accepted: 12/10/2018] [Indexed: 12/17/2022] Open
Abstract
Selection and mutation shape the genetic variation underlying human traits, but the specific evolutionary mechanisms driving complex trait variation are largely unknown. We developed a statistical method that uses polarized genome-wide association study (GWAS) summary statistics from a single population to detect signals of mutational bias and selection. We found evidence for nonneutral signals on variation underlying several traits (body mass index [BMI], schizophrenia, Crohn's disease, educational attainment, and height). We then used simulations that incorporate simultaneous negative and positive selection to show that these signals are consistent with mutational bias and shifts in the fitness-phenotype relationship, but not stabilizing selection or mutational bias alone. We additionally replicate two of our top three signals (BMI and educational attainment) in an external cohort, and show that population stratification may have confounded GWAS summary statistics for height in the GIANT cohort. Our results provide a flexible and powerful framework for evolutionary analysis of complex phenotypes in humans and other species, and offer insights into the evolutionary mechanisms driving variation in human polygenic traits.
Affiliation(s)
- Hugo C. Kitano
  - Department of Computer Science, Stanford University, Stanford, CA
- Noah A. Zaitlen
  - Department of Medicine, University of California, San Francisco, CA
  - Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA
|
80
|
Flagel L, Brandvain Y, Schrider DR. The Unreasonable Effectiveness of Convolutional Neural Networks in Population Genetic Inference. Mol Biol Evol 2019; 36:220-238. [PMID: 30517664 PMCID: PMC6367976 DOI: 10.1093/molbev/msy224] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Indexed: 12/28/2022] Open
Abstract
Population-scale genomic data sets have given researchers incredible amounts of information from which to infer evolutionary histories. Concomitant with this flood of data, theoretical and methodological advances have sought to extract information from genomic sequences to infer demographic events such as population size changes and gene flow among closely related populations/species, construct recombination maps, and uncover loci underlying recent adaptation. To date, most methods make use of only one or a few summaries of the input sequences and therefore ignore potentially useful information encoded in the data. The most sophisticated of these approaches involve likelihood calculations, which require theoretical advances for each new problem, and often focus on a single aspect of the data (e.g., only allele frequency information) in the interest of mathematical and computational tractability. Directly interrogating the entirety of the input sequence data in a likelihood-free manner would thus offer a fruitful alternative. Here, we accomplish this by representing DNA sequence alignments as images and using a class of deep learning methods called convolutional neural networks (CNNs) to make population genetic inferences from these images. We apply CNNs to a number of evolutionary questions and find that they frequently match or exceed the accuracy of current methods. Importantly, we show that CNNs perform accurate evolutionary model selection and parameter estimation, even on problems that have not received detailed theoretical treatments. Thus, when applied to population genetic alignments, CNNs are capable of outperforming expert-derived statistical methods and offer a new path forward in cases where no likelihood approach exists.
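The idea of treating an alignment as an image can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' pipeline: it keeps segregating sites and codes the most common allele at each site as 0 and any other allele as 1 (the paper's own encoding, row ordering, and padding may differ). The function name `alignment_to_matrix` and the toy sequences are hypothetical.

```python
from collections import Counter


def alignment_to_matrix(seqs):
    """Convert aligned sequences into a 0/1 matrix (rows = sequences,
    columns = segregating sites). Assumption: the major allele at each
    site is coded 0 and every other allele is coded 1."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("all sequences must have the same length")
    columns = []
    for column in zip(*seqs):
        alleles = Counter(column)
        if len(alleles) < 2:
            continue  # monomorphic site: carries no information, skip it
        major = alleles.most_common(1)[0][0]
        columns.append([0 if base == major else 1 for base in column])
    # Transpose so that each row corresponds to one input sequence.
    return [list(row) for row in zip(*columns)]


toy_alignment = ["ACGT", "ACGA", "TCGA"]
image = alignment_to_matrix(toy_alignment)
```

A matrix of this kind is what a convolutional network would take as input; in practice it would be converted to a numeric array and padded or truncated to a fixed width.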
Affiliation(s)
- Lex Flagel
  - Monsanto Company, Chesterfield, MO
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN
- Yaniv Brandvain
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN
- Daniel R Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, NC
|
81
|
Abstract
Identifying genomic locations of natural selection from sequence data is an ongoing challenge in population genetics. Current methods utilizing information combined from several summary statistics typically assume no correlation of summary statistics regardless of the genomic location from which they are calculated. However, due to linkage disequilibrium, summary statistics calculated at nearby genomic positions are highly correlated. We introduce an approach termed Trendsetter that accounts for the similarity of statistics calculated from adjacent genomic regions through trend filtering, while reducing the effects of multicollinearity through regularization. Our penalized regression framework has high power to detect sweeps, is capable of classifying sweep regions as either hard or soft, and can be applied to other selection scenarios as well. We find that Trendsetter is robust to both extensive missing data and strong background selection, and has comparable power to similar current approaches. Moreover, the model learned by Trendsetter can be viewed as a set of curves modeling the spatial distribution of summary statistics in the genome. Application to human genomic data revealed positively selected regions previously discovered such as LCT in Europeans and EDAR in East Asians. We also identified a number of novel candidates and show that populations with greater relatedness share more sweep signals.
Affiliation(s)
- Mehreen R Mughal
  - Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
- Michael DeGiorgio
  - Departments of Biology and Statistics, Pennsylvania State University, University Park, PA
  - Institute for CyberScience, Pennsylvania State University, University Park, PA
|
82
|
Jensen JD, Payseur BA, Stephan W, Aquadro CF, Lynch M, Charlesworth D, Charlesworth B. The importance of the Neutral Theory in 1968 and 50 years on: A response to Kern and Hahn 2018. Evolution 2019; 73:111-114. [PMID: 30460993 PMCID: PMC6496948 DOI: 10.1111/evo.13650] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Received: 10/30/2018] [Accepted: 11/09/2018] [Indexed: 01/31/2023]
Abstract
A recent article reassessing the Neutral Theory of Molecular Evolution claims that it is no longer as important as is widely believed. The authors argue that "the neutral theory was supported by unreliable theoretical and empirical evidence from the beginning, and that in light of modern, genome-scale data, we can firmly reject its universality." Claiming that "the neutral theory has been overwhelmingly rejected," they propose instead that natural selection is the major force shaping both between-species divergence and within-species variation. Although this is probably a minority view, it is important to evaluate such claims carefully in the context of current knowledge, as inaccuracies can sometimes morph into an accepted narrative for those not familiar with the underlying science. We here critically examine and ultimately reject Kern and Hahn's arguments and assessment, and instead propose that it is now abundantly clear that the foundational ideas presented five decades ago by Kimura and Ohta are indeed correct.
Affiliation(s)
- Bret A. Payseur
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin
- Wolfgang Stephan
  - Leibniz-Institute for Evolution and Biodiversity Science, Berlin, Germany
- Charles F. Aquadro
  - Department of Molecular Biology & Genetics, Cornell University, Ithaca, New York
- Michael Lynch
  - Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona
- Deborah Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
|
83
|
Harris RB, Sackman A, Jensen JD. On the unfounded enthusiasm for soft selective sweeps II: Examining recent evidence from humans, flies, and viruses. PLoS Genet 2018; 14:e1007859. [PMID: 30592709 PMCID: PMC6336318 DOI: 10.1371/journal.pgen.1007859] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 06/30/2018] [Revised: 01/17/2019] [Accepted: 11/28/2018] [Indexed: 12/13/2022] Open
Abstract
Since the initial description of the genomic patterns expected under models of positive selection acting on standing genetic variation and on multiple beneficial mutations—so-called soft selective sweeps—researchers have sought to identify these patterns in natural population data. Indeed, over the past two years, large-scale data analyses have argued that soft sweeps are pervasive across organisms of very different effective population size and mutation rate—humans, Drosophila, and HIV. Yet, others have evaluated the relevance of these models to natural populations, as well as the identifiability of the models relative to other known population-level processes, arguing that soft sweeps are likely to be rare. Here, we look to reconcile these opposing results by carefully evaluating three recent studies and their underlying methodologies. Using population genetic theory, as well as extensive simulation, we find that all three examples are prone to extremely high false-positive rates, incorrectly identifying soft sweeps under both hard sweep and neutral models. Furthermore, we demonstrate that well-fit demographic histories combined with rare hard sweeps serve as the more parsimonious explanation. These findings represent a necessary response to the growing tendency of invoking parameter-heavy, assumption-laden models of pervasive positive selection, and neglecting best practices regarding the construction of proper demographic null models.
A long-standing debate in evolutionary biology revolves around the role of selective vs. stochastic processes in driving molecular evolution and shaping genetic variation. With the advent of genomics, genome-wide polymorphism data have been utilized to characterize these processes, with a major interest in describing the fraction of genomic variation shaped by positive selection. These genomic scans were initially focused around a hard sweep model, in which selection acts upon rare, newly arising beneficial mutations. Recent years have seen the description of sweeps occurring from both standing and rapidly recurring beneficial mutations, collectively known as soft sweeps. However, common to both hard and soft sweeps is the difficulty in distinguishing these effects from neutral demographic patterns, and disentangling these processes has remained an important field of study within population genetics. Despite this, there is a recent and troubling tendency to neglect these demographic considerations, and to naively fit sweep models to genomic data. Recent realizations of such efforts have resulted in the claim that soft sweeps play a dominant role in shaping genomic variation and in driving adaptation across diverse branches of the tree of life. Here, we reanalyze these findings and demonstrate that a more careful consideration of neutral processes results in highly differing conclusions.
Affiliation(s)
- Rebecca B. Harris
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
- Andrew Sackman
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
- Jeffrey D. Jensen
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
|
84
|
|
85
|
Gnecchi-Ruscone GA, Abondio P, De Fanti S, Sarno S, Sherpa MG, Sherpa PT, Marinelli G, Natali L, Di Marcello M, Peluzzi D, Luiselli D, Pettener D, Sazzini M. Evidence of Polygenic Adaptation to High Altitude from Tibetan and Sherpa Genomes. Genome Biol Evol 2018; 10:2919-2930. [PMID: 30335146 PMCID: PMC6239493 DOI: 10.1093/gbe/evy233] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Accepted: 10/10/2018] [Indexed: 12/13/2022] Open
Abstract
Although Tibetans and Sherpa present several physiological adjustments evolved to cope with selective pressures imposed by the high-altitude environment, especially hypobaric hypoxia, few selective sweeps at a limited number of hypoxia-related genes were confirmed by multiple genomic studies. Nevertheless, variants at these loci were found to be associated only with downregulation of the erythropoietic cascade, which represents an indirect aspect of the considered adaptive phenotype. Accordingly, the genetic basis of Tibetan/Sherpa adaptive traits remains to be fully elucidated, in part due to limitations of the selection scans implemented so far, which mostly rely on the hard sweep model. In order to overcome this issue, we used whole-genome sequence data and several selection statistics as input for gene network analyses aimed at testing for the occurrence of polygenic adaptation in these high-altitude Himalayan populations. Being able to also detect subtle genomic signatures ascribable to weak positive selection at multiple genes of the same functional subnetwork, this approach allowed us to infer adaptive evolution at loci individually showing small effect sizes, but belonging to highly interconnected biological pathways overall involved in angiogenetic processes. Therefore, these findings pinpointed a series of selective events neglected so far, which likely contributed to the augmented tissue blood perfusion observed in Tibetans and Sherpa, thus uncovering the genetic determinants of a key biological mechanism that underlies their adaptation to high altitude.
Affiliation(s)
- Guido A Gnecchi-Ruscone
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Paolo Abondio
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Sara De Fanti
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Stefania Sarno
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Luca Natali
  - Explora Nunaat International, Montorio al Vomano, Teramo, Italy; Italian Institute of Human Paleontology, Rome, Italy
- Davide Peluzzi
  - Explora Nunaat International, Montorio al Vomano, Teramo, Italy
- Donata Luiselli
  - Department of Cultural Heritage, University of Bologna, Ravenna, Italy
- Davide Pettener
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Marco Sazzini
  - Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
|
86
|
Detection and Classification of Hard and Soft Sweeps from Unphased Genotypes by Multilocus Genotype Identity. Genetics 2018; 210:1429-1452. [PMID: 30315068 DOI: 10.1534/genetics.118.301502] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 03/03/2018] [Accepted: 10/08/2018] [Indexed: 11/18/2022] Open
Abstract
Positive natural selection can lead to a decrease in genomic diversity at the selected site and at linked sites, producing a characteristic signature of elevated expected haplotype homozygosity. These selective sweeps can be hard or soft. In the case of a hard selective sweep, a single adaptive haplotype rises to high population frequency, whereas multiple adaptive haplotypes sweep through the population simultaneously in a soft sweep, producing distinct patterns of genetic variation in the vicinity of the selected site. Measures of expected haplotype homozygosity have previously been used to detect sweeps in multiple study systems. However, these methods are formulated for phased haplotype data, typically unavailable for nonmodel organisms, and some may have reduced power to detect soft sweeps due to their increased genetic diversity relative to hard sweeps. To address these limitations, we applied the H12 and H2/H1 statistics proposed in 2015 by Garud et al., which have power to detect both hard and soft sweeps, to unphased multilocus genotypes, denoting them as G12 and G2/G1. G12 (and the more direct expected homozygosity analog to H12, denoted G123) has comparable power to H12 for detecting both hard and soft sweeps. G2/G1 can be used to classify hard and soft sweeps analogously to H2/H1, conditional on a genomic region having high G12 or G123 values. The reason for this power is that, under random mating, the most frequent haplotypes will yield the most frequent multilocus genotypes. Simulations based on parameters compatible with our recent understanding of human demographic history suggest that expected homozygosity methods are best suited for detecting recent sweeps, and increase in power under recent population expansions. Finally, we find candidates for selective sweeps within the 1000 Genomes CEU, YRI, GIH, and CHB populations, which corroborate and complement existing studies.
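The statistics named in this abstract can be illustrated with a short sketch. It assumes the usual definitions from Garud et al.: with frequencies sorted in decreasing order, H1 is the sum of squared frequencies, H12 pools the two most frequent classes before squaring, H123 pools the top three, and H2 is H1 minus the squared frequency of the most common class. Passing phased haplotypes gives H12 and H2/H1; passing unphased multilocus genotype strings gives the G12, G123 and G2/G1 analogs. The function names and toy data are hypothetical and are not taken from the paper's software.

```python
from collections import Counter


def sorted_frequencies(items):
    """Relative frequency of each distinct item, largest first."""
    counts = Counter(items)
    total = len(items)
    return sorted((c / total for c in counts.values()), reverse=True)


def pooled_homozygosity(freqs, k):
    """Pool the k most frequent classes into one, then sum squared frequencies."""
    pooled = sum(freqs[:k])
    return pooled ** 2 + sum(f ** 2 for f in freqs[k:])


def sweep_statistics(items):
    """Return H1, H12, H123 and H2/H1 (or G1, G12, G123 and G2/G1 when the
    items are unphased multilocus genotypes rather than haplotypes)."""
    freqs = sorted_frequencies(items)
    h1 = pooled_homozygosity(freqs, 1)
    h2 = h1 - freqs[0] ** 2
    return {
        "1": h1,
        "12": pooled_homozygosity(freqs, 2),
        "123": pooled_homozygosity(freqs, 3),
        "2/1": h2 / h1,
    }


# Toy window: each string is one individual's genotype at four sites,
# coded as the number of derived alleles (0, 1 or 2) per site.
toy_genotypes = ["2200"] * 6 + ["0022"] * 3 + ["1111"]
stats = sweep_statistics(toy_genotypes)
```

In a scan, these values would be computed in sliding windows along the genome; a high pooled value flags a candidate sweep, and the ratio is then used to distinguish hard from soft sweeps, as the abstract describes.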
|
87
|
Complex Haplotypes of GSTM1 Gene Deletions Harbor Signatures of a Selective Sweep in East Asian Populations. G3 (Bethesda) 2018; 8:2953-2966. [PMID: 30061374 PMCID: PMC6118300 DOI: 10.1534/g3.118.200462] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/10/2023]
Abstract
The deletion of the metabolizing Glutathione S-transferase Mu 1 (GSTM1) gene has been associated with multiple cancers, metabolic and autoimmune disorders, as well as drug response. It is unusually common, with allele frequency reaching up to 75% in some human populations. Such high allele frequency of a derived allele with apparent impact on an otherwise conserved gene is a rare phenomenon. To investigate the evolutionary history of this locus, we analyzed 310 genomes using population genetics tools. Our analysis revealed a surprising lack of linkage disequilibrium between the deletion and the flanking single nucleotide variants in this locus. Tests that measure extended homozygosity and rapid change in allele frequency revealed signatures of an incomplete sweep in the locus. Using empirical approaches, we identified the Tanuki haplogroup, which carries the GSTM1 deletion and is found in approximately 70% of East Asian chromosomes. This haplogroup has rapidly increased in frequency in East Asian populations, contributing to a high population differentiation among continental human groups. We showed that extended homozygosity and population differentiation for this haplogroup is incompatible with simulated neutral expectations in East Asian populations. In parallel, we found that the Tanuki haplogroup is significantly associated with the expression levels of other GSTM genes. Collectively, our results suggest that standing variation in this locus has likely undergone an incomplete sweep in East Asia with regulatory impact on multiple GSTM genes. Our study provides the necessary framework for further studies to elucidate the evolutionary reasons that maintain disease-susceptibility variants in the GSTM1 locus.
|
88
|
Yurchenko AA, Daetwyler HD, Yudin N, Schnabel RD, Vander Jagt CJ, Soloshenko V, Lhasaranov B, Popov R, Taylor JF, Larkin DM. Scans for signatures of selection in Russian cattle breed genomes reveal new candidate genes for environmental adaptation and acclimation. Sci Rep 2018; 8:12984. [PMID: 30154520 PMCID: PMC6113280 DOI: 10.1038/s41598-018-31304-w] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 04/06/2018] [Accepted: 08/16/2018] [Indexed: 01/08/2023] Open
Abstract
Domestication and selective breeding have resulted in over 1000 extant cattle breeds. Many of these breeds do not excel in important traits but are adapted to local environments. These adaptations are a valuable source of genetic material for efforts to improve commercial breeds. As a step toward this goal we identified candidate regions under selection in genomes of nine Russian native cattle breeds adapted to survive in harsh climates. After comparing our data to other breeds of European and Asian origins we found known and novel candidate genes that could potentially be related to domestication, economically important traits and environmental adaptations in cattle. The Russian cattle breed genomes contained regions under putative selection with genes that may be related to adaptations to harsh environments (e.g., AQP5, RAD50, and RETREG1). We found genomic signatures of selective sweeps near key genes related to economically important traits, such as milk production (e.g., DGAT1, ABCG2), growth (e.g., XKR4), and reproduction (e.g., CSF2). Our data point to candidate genes which should be included in future studies attempting to identify genes to improve the extant breeds and facilitate generation of commercial breeds that fit better into the environments of Russia and other countries with similar climates.
Affiliation(s)
- Andrey A Yurchenko
  - The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences (ICG SB RAS), 630090, Novosibirsk, Russia
  - Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK
- Hans D Daetwyler
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, 3083, Victoria, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, 3083, Victoria, Australia
- Nikolay Yudin
  - The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences (ICG SB RAS), 630090, Novosibirsk, Russia
- Robert D Schnabel
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211-5300, USA
- Christy J Vander Jagt
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, 3083, Victoria, Australia
- Ruslan Popov
  - Yakutian Research Institute of Agriculture, 677001, Yakutsk, Russia
- Jeremy F Taylor
  - Division of Animal Sciences, University of Missouri, Columbia, MO, 65211-5300, USA
- Denis M Larkin
  - The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences (ICG SB RAS), 630090, Novosibirsk, Russia
  - Royal Veterinary College, University of London, NW01 0TU, London, UK
|
89
|
Lange JD, Pool JE. Impacts of Recurrent Hitchhiking on Divergence and Demographic Inference in Drosophila. Genome Biol Evol 2018; 10:1882-1891. [PMID: 30010915 PMCID: PMC6075209 DOI: 10.1093/gbe/evy142] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Accepted: 07/11/2018] [Indexed: 12/14/2022] Open
Abstract
In species with large population sizes such as Drosophila, natural selection may have substantial effects on genetic diversity and divergence. However, the implications of this widespread nonneutrality for standard population genetic assumptions and practices remain poorly resolved. Here, we assess the consequences of recurrent hitchhiking (RHH), in which selective sweeps occur at a given rate randomly across the genome. We use forward simulations to examine two published RHH models for D. melanogaster, reflecting relatively common/weak and rare/strong selection. We find that unlike the rare/strong RHH model, the common/weak model entails a slight degree of Hill-Robertson interference in high recombination regions. We also find that the common/weak RHH model is more consistent with our genome-wide estimate of the proportion of substitutions fixed by natural selection between D. melanogaster and D. simulans (19%). Finally, we examine how these models of RHH might bias demographic inference. We find that these RHH scenarios can bias demographic parameter estimation, but such biases are weaker for parameters relating recently diverged populations, and for the common/weak RHH model in general. Thus, even for species with important genome-wide impacts of selective sweeps, neutralist demographic inference can have some utility in understanding the histories of recently diverged populations.
Affiliation(s)
- Jeremy D Lange
  - Laboratory of Genetics, University of Wisconsin–Madison, Madison
- John E Pool
  - Laboratory of Genetics, University of Wisconsin–Madison, Madison
|
90
|
Fujito NT, Satta Y, Hayakawa T, Takahata N. A new inference method for detecting an ongoing selective sweep. Genes Genet Syst 2018; 93:149-161. [DOI: 10.1266/ggs.18-00008] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/23/2022] Open
Affiliation(s)
- Naoko T. Fujito
  - School of Advanced Sciences, SOKENDAI (The Graduate University for Advanced Studies)
- Yoko Satta
  - School of Advanced Sciences, SOKENDAI (The Graduate University for Advanced Studies)
- Toshiyuki Hayakawa
  - Graduate School of Systems Life Sciences, Kyushu University
  - Faculty of Arts and Science, Kyushu University
- Naoyuki Takahata
  - School of Advanced Sciences, SOKENDAI (The Graduate University for Advanced Studies)
91
|
Patel R, Scheinfeldt LB, Sanderford MD, Lanham TR, Tamura K, Platt A, Glicksberg BS, Xu K, Dudley JT, Kumar S. Adaptive Landscape of Protein Variation in Human Exomes. Mol Biol Evol 2018; 35:2015-2025. [PMID: 29846678 PMCID: PMC6063297 DOI: 10.1093/molbev/msy107] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022] Open
Abstract
The human genome contains hundreds of thousands of missense mutations. However, only a handful of these variants are known to be adaptive, which implies that adaptation through protein sequence change is an extremely rare phenomenon in human evolution. Alternatively, existing methods may lack the power to pinpoint adaptive variation. We have developed and applied an Evolutionary Probability Approach (EPA) to discover candidate adaptive polymorphisms (CAPs) through the discordance between allelic evolutionary probabilities and their observed frequencies in human populations. EPA reveals thousands of missense CAPs, which suggest that a large number of previously optimal alleles experienced a reversal of fortune in the human lineage. We explored nonadaptive mechanisms to explain CAPs, including the effects of demography, mutation rate variability, and negative and positive selective pressures in modern humans. Many nonadaptive hypotheses were tested, but failed to explain the data, which suggests that a large proportion of CAP alleles have increased in frequency due to beneficial selection. This suggestion is supported by the fact that a vast majority of adaptive missense variants discovered previously in humans are CAPs, and hundreds of CAP alleles are protective in genotype-phenotype association data. Our integrated phylogenomic and population genetic EPA approach predicts the existence of thousands of nonneutral candidate variants in the human proteome. We expect this collection to be enriched in beneficial variation. The EPA approach can be applied to discover candidate adaptive variation in any protein, population, or species for which allele frequency data and reliable multispecies alignments are available.
Affiliation(s)
- Ravi Patel
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
  - Department of Biology, Temple University, Philadelphia, PA
- Laura B Scheinfeldt
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
  - Department of Biology, Temple University, Philadelphia, PA
  - Coriell Institute for Medical Research, Camden, NJ
- Maxwell D Sanderford
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Tamera R Lanham
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Koichiro Tamura
  - Department of Biology, Tokyo Metropolitan University, Tokyo, Japan
- Alexander Platt
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
  - Department of Biology, Temple University, Philadelphia, PA
  - Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA
- Benjamin S Glicksberg
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Ke Xu
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Joel T Dudley
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY
- Sudhir Kumar
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
  - Department of Biology, Temple University, Philadelphia, PA
  - Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
|
92
|
Kern AD, Schrider DR. diploS/HIC: An Updated Approach to Classifying Selective Sweeps. G3 (BETHESDA, MD.) 2018; 8:1959-1970. [PMID: 29626082 PMCID: PMC5982824 DOI: 10.1534/g3.118.200262] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/26/2018] [Accepted: 04/04/2018] [Indexed: 11/18/2022]
Abstract
Identifying selective sweeps in populations that have complex demographic histories remains a difficult problem in population genetics. We previously introduced a supervised machine learning approach, S/HIC, for finding both hard and soft selective sweeps in genomes on the basis of patterns of genetic variation surrounding a window of the genome. While S/HIC was shown to be both powerful and precise, the utility of S/HIC was limited by the use of phased genomic data as input. In this report we describe a deep learning variant of our method, diploS/HIC, that uses unphased genotypes to accurately classify genomic windows. diploS/HIC is shown to be quite powerful even at moderate to small sample sizes.
Affiliation(s)
- Andrew D Kern
  - Department of Genetics, Rutgers University, Piscataway, NJ 08854
|
93
|
Llopart A. Faster‐X evolution of gene expression is driven by recessive adaptive cis‐regulatory variation in Drosophila. Mol Ecol 2018; 27:3811-3821. [DOI: 10.1111/mec.14708] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2017] [Revised: 03/28/2018] [Accepted: 04/05/2018] [Indexed: 12/30/2022]
Affiliation(s)
- Ana Llopart
  - Department of Biology, The University of Iowa, Iowa City, Iowa
  - Interdisciplinary Graduate Program in Genetics, The University of Iowa, Iowa City, Iowa
|
94
|
Gokcumen O. The Year In Genetic Anthropology: New Lands, New Technologies, New Questions. AMERICAN ANTHROPOLOGIST 2018. [DOI: 10.1111/aman.13032] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Affiliation(s)
- Omer Gokcumen
  - Department of Biological Sciences, University of Buffalo, NY 14260, USA
|
95
|
Schrider DR, Ayroles J, Matute DR, Kern AD. Supervised machine learning reveals introgressed loci in the genomes of Drosophila simulans and D. sechellia. PLoS Genet 2018; 14:e1007341. [PMID: 29684059 PMCID: PMC5933812 DOI: 10.1371/journal.pgen.1007341] [Citation(s) in RCA: 69] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2017] [Revised: 05/03/2018] [Accepted: 03/28/2018] [Indexed: 12/30/2022] Open
Abstract
Hybridization and gene flow between species appears to be common. Even though it is clear that hybridization is widespread across all surveyed taxonomic groups, the magnitude and consequences of introgression are still largely unknown. Thus it is crucial to develop the statistical machinery required to uncover which genomic regions have recently acquired haplotypes via introgression from a sister population. We developed a novel machine learning framework, called FILET (Finding Introgressed Loci via Extra-Trees) capable of revealing genomic introgression with far greater power than competing methods. FILET works by combining information from a number of population genetic summary statistics, including several new statistics that we introduce, that capture patterns of variation across two populations. We show that FILET is able to identify loci that have experienced gene flow between related species with high accuracy, and in most situations can correctly infer which population was the donor and which was the recipient. Here we describe a data set of outbred diploid Drosophila sechellia genomes, and combine them with data from D. simulans to examine recent introgression between these species using FILET. Although we find that these populations may have split more recently than previously appreciated, FILET confirms that there has indeed been appreciable recent introgression (some of which might have been adaptive) between these species, and reveals that this gene flow is primarily in the direction of D. simulans to D. sechellia.
Understanding the extent to which species or diverged populations hybridize in nature is crucially important if we are to understand the speciation process. Accordingly numerous research groups have developed methodology for finding the genetic evidence of such introgression. In this report we develop a supervised machine learning approach for uncovering loci which have introgressed across species boundaries. We show that our method, FILET, has greater accuracy and power than competing methods in discovering introgression, and in addition can detect the directionality associated with the gene flow between species. Using whole genome sequences from Drosophila simulans and Drosophila sechellia we show that FILET discovers quite extensive introgression between these species that has occurred mostly from D. simulans to D. sechellia. Our work highlights the complex process of speciation even within a well-studied system and points to the growing importance of supervised machine learning in population genetics.
Affiliation(s)
- Daniel R. Schrider
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, United States of America
- Julien Ayroles
  - Ecology and Evolutionary Biology Department, Princeton University, Princeton, New Jersey, United States of America
  - Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Daniel R. Matute
  - Biology Department, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Andrew D. Kern
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, United States of America
|
96
|
Schrider DR, Kern AD. Supervised Machine Learning for Population Genetics: A New Paradigm. Trends Genet 2018; 34:301-312. [PMID: 29331490 PMCID: PMC5905713 DOI: 10.1016/j.tig.2017.12.005] [Citation(s) in RCA: 201] [Impact Index Per Article: 33.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2017] [Revised: 11/29/2017] [Accepted: 12/08/2017] [Indexed: 01/21/2023]
Abstract
As population genomic datasets grow in size, researchers are faced with the daunting task of making sense of a flood of information. To keep pace with this explosion of data, computational methodologies for population genetic inference are rapidly being developed to best utilize genomic sequence data. In this review we discuss a new paradigm that has emerged in computational population genomics: that of supervised machine learning (ML). We review the fundamentals of ML, discuss recent applications of supervised ML to population genetics that outperform competing methods, and describe promising future directions in this area. Ultimately, we argue that supervised ML is an important and underutilized tool that has considerable potential for the world of evolutionary genomics.
Affiliation(s)
- Daniel R Schrider
  - Department of Genetics, and Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 08554, USA
- Andrew D Kern
  - Department of Genetics, and Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 08554, USA
|
97
|
Sugden LA, Atkinson EG, Fischer AP, Rong S, Henn BM, Ramachandran S. Localization of adaptive variants in human genomes using averaged one-dependence estimation. Nat Commun 2018; 9:703. [PMID: 29459739 PMCID: PMC5818606 DOI: 10.1038/s41467-018-03100-7] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2017] [Accepted: 01/19/2018] [Indexed: 12/19/2022] Open
Abstract
Statistical methods for identifying adaptive mutations from population genetic data face several obstacles: assessing the significance of genomic outliers, integrating correlated measures of selection into one analytic framework, and distinguishing adaptive variants from hitchhiking neutral variants. Here, we introduce SWIF(r), a probabilistic method that detects selective sweeps by learning the distributions of multiple selection statistics under different evolutionary scenarios and calculating the posterior probability of a sweep at each genomic site. SWIF(r) is trained using simulations from a user-specified demographic model and explicitly models the joint distributions of selection statistics, thereby increasing its power to both identify regions undergoing sweeps and localize adaptive mutations. Using array and exome data from 45 ǂKhomani San hunter-gatherers of southern Africa, we identify an enrichment of adaptive signals in genes associated with metabolism and obesity. SWIF(r) provides a transparent probabilistic framework for localizing beneficial mutations that is extensible to a variety of evolutionary scenarios.
Affiliation(s)
- Lauren Alpert Sugden
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
- Elizabeth G Atkinson
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, 11794, USA
- Annie P Fischer
  - Division of Applied Mathematics, Brown University, Providence, RI, 02912, USA
- Stephen Rong
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI, 02912, USA
- Brenna M Henn
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, 11794, USA
- Sohini Ramachandran
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
|
98
|
Nelson TC, Cresko WA. Ancient genomic variation underlies repeated ecological adaptation in young stickleback populations. Evol Lett 2018; 2:9-21. [PMID: 30283661 PMCID: PMC6121857 DOI: 10.1002/evl3.37] [Citation(s) in RCA: 109] [Impact Index Per Article: 18.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2017] [Revised: 12/16/2017] [Accepted: 12/19/2017] [Indexed: 12/17/2022] Open
Abstract
Adaptation in the wild often involves standing genetic variation (SGV), which allows rapid responses to selection on ecological timescales. However, we still know little about how the evolutionary histories and genomic distributions of SGV influence local adaptation in natural populations. Here, we address this knowledge gap using the threespine stickleback fish (Gasterosteus aculeatus) as a model. We extend restriction site-associated DNA sequencing (RAD-seq) to produce phased haplotypes approaching 700 base pairs (bp) in length at each of over 50,000 loci across the stickleback genome. Parallel adaptation in two geographically isolated freshwater pond populations consistently involved fixation of haplotypes that are identical-by-descent. In these same genomic regions, sequence divergence between marine and freshwater stickleback, as measured by dXY, reaches tenfold higher than background levels and genomic variation is structured into distinct marine and freshwater haplogroups. By combining this dataset with a de novo genome assembly of a related species, the ninespine stickleback (Pungitius pungitius), we find that this habitat-associated divergent variation averages six million years old, nearly twice the genome-wide average. The genomic variation that is involved in recent and rapid local adaptation in stickleback has therefore been evolving throughout the 15-million-year history since the two species lineages split. This long history of genomic divergence has maintained large genomic regions of ancient ancestry that include multiple chromosomal inversions and extensive linked variation. These discoveries of ancient genetic variation spread broadly across the genome in stickleback demonstrate how selection on ecological timescales is a result of genome evolution over geological timescales, and vice versa.
Affiliation(s)
- Thomas C Nelson
  - Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
  - Current address: Division of Biological Sciences, University of Montana, Missoula, Montana 59812
- William A Cresko
  - Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
|
99
|
Luikart G, Kardos M, Hand BK, Rajora OP, Aitken SN, Hohenlohe PA. Population Genomics: Advancing Understanding of Nature. POPULATION GENOMICS 2018. [DOI: 10.1007/13836_2018_60] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
|
100
|
Cheng X, Xu C, DeGiorgio M. Fast and robust detection of ancestral selective sweeps. Mol Ecol 2017; 26:6871-6891. [DOI: 10.1111/mec.14416] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2017] [Revised: 10/16/2017] [Accepted: 10/23/2017] [Indexed: 01/01/2023]
Affiliation(s)
- Xiaoheng Cheng
  - Huck Institutes of Life Sciences, Pennsylvania State University, University Park, PA, USA
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- Cheng Xu
  - Huck Institutes of Life Sciences, Pennsylvania State University, University Park, PA, USA
- Michael DeGiorgio
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
  - Department of Statistics, Pennsylvania State University, University Park, PA, USA
  - Institute for CyberScience, Pennsylvania State University, University Park, PA, USA
|