1
Di Santo LN, Quilodrán CS, Currat M. Temporal Variation in Introgressed Segments' Length Statistics Computed from a Limited Number of Ancient Genomes Sheds Light on Past Admixture Pulses. Mol Biol Evol 2023; 40:msad252. [PMID: 37992125 PMCID: PMC10715198 DOI: 10.1093/molbev/msad252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 10/16/2023] [Accepted: 11/09/2023] [Indexed: 11/24/2023] Open
Abstract
Hybridization is recognized as an important evolutionary force, but identifying and timing admixture events between divergent lineages remain a major aim of evolutionary biology. While this has traditionally been done using inferential tools on contemporary genomes, the latest advances in paleogenomics have provided a growing wealth of temporally distributed genomic data. Here, we used individual-based simulations to generate chromosome-level genomic data for a 2-population system and described temporal neutral introgression patterns under a single- and 2-pulse admixture model. We computed 6 summary statistics aiming to inform the timing and number of admixture pulses between interbreeding entities: lengths of introgressed sequences and their variance within genomes, as well as genome-wide introgression proportions and related measures. The first 2 statistics could confidently be used to infer interlineage hybridization history, peaking at the beginning and shortly after an admixture pulse. Temporal variation in introgression proportions and related statistics provided more limited insights, particularly when considering their application to ancient genomes still scant in number. Lastly, we computed these statistics on Homo sapiens paleogenomes and successfully inferred the hybridization pulse from Neanderthal that occurred approximately 40 to 60 kya. The scarce number of genomes dating from this period prevented more precise inferences, but the accumulation of paleogenomic data opens promising perspectives as our approach only requires a limited number of ancient genomes.
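As a rough illustration of the kind of per-genome summaries the abstract names (introgressed segment lengths, their variance within a genome, and the genome-wide introgression proportion), the sketch below computes them from a list of segment coordinates. The function name `segment_length_stats`, the half-open `(start, end)` representation, and the example coordinates are assumptions made for illustration; this is not the authors' implementation, and the remaining statistics of the study are not reproduced here.

```python
from statistics import mean, pvariance


def segment_length_stats(segments, genome_length):
    """Summarize introgressed segments within one genome.

    segments: non-overlapping (start, end) half-open intervals in base pairs
              (a hypothetical input format chosen for this sketch).
    genome_length: total length of the genome or chromosome in base pairs.
    """
    lengths = [end - start for start, end in segments]
    if not lengths:
        # No introgressed material detected in this genome.
        return {"mean_length": 0.0, "length_variance": 0.0,
                "introgression_proportion": 0.0}
    return {
        "mean_length": mean(lengths),
        # Population variance of segment lengths within the genome.
        "length_variance": pvariance(lengths),
        # Fraction of the genome covered by introgressed segments.
        "introgression_proportion": sum(lengths) / genome_length,
    }


# Made-up coordinates, purely for demonstration.
example_segments = [(0, 100), (500, 800), (1000, 1200)]
stats = segment_length_stats(example_segments, genome_length=10_000)
```

Tracking such values across genomes of different ages is what would give the temporal curves discussed in the abstract; the sketch only covers the per-genome step.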
Affiliation(s)
- Lionel N Di Santo
- Department of Genetics and Evolution, University of Geneva, Geneva CH-1205
- Mathias Currat
- Department of Genetics and Evolution, University of Geneva, Geneva CH-1205
- Institute of Genetics and Genomics in Geneva (IGE3), University of Geneva, Geneva CH-1205
2
Nugent CM, Kess T, Brachmann MK, Langille BL, Holborn MK, Beck SV, Smith N, Duffy SJ, Lehnert SJ, Wringe BF, Bentzen P, Bradbury IR. Genomic and machine learning-based screening of aquaculture-associated introgression into at-risk wild North American Atlantic salmon (Salmo salar) populations. Mol Ecol Resour 2023. [PMID: 37246351 DOI: 10.1111/1755-0998.13811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 04/27/2023] [Accepted: 05/02/2023] [Indexed: 05/30/2023]
Abstract
The negative genetic impacts of gene flow from domestic to wild populations can be dependent on the degree of domestication and exacerbated by the magnitude of pre-existing genetic differences between wild populations and the domestication source. Recent evidence of European ancestry within North American aquaculture Atlantic salmon (Salmo salar) has elevated the potential impact of escaped farmed salmon on often at-risk wild North American salmon populations. Here, we compare the ability of single nucleotide polymorphism (SNP) and microsatellite (SSR) marker panels of different sizes (7-SSR, 100-SSR and 220K-SNP) to detect introgression of European genetic information into North American wild and aquaculture populations. Linear regression comparing admixture predictions for a set of individuals common to the three datasets showed that the 100-SSR and 7-SSR panels replicated the full 220K-SNP-based admixture estimates with low accuracy (r² of 0.64 and 0.49, respectively). Additional tests explored the effects of individual sample size and marker number, which revealed that ~300 randomly selected SNPs could replicate the 220K-SNP admixture predictions with greater than 95% fidelity. We designed a custom SNP panel (301-SNP) for European admixture detection in future monitoring work and then developed and tested a Python package, salmoneuadmix (https://github.com/CNuge/SalmonEuAdmix), which uses a deep neural network to make de novo estimates of individuals' European admixture proportion without the need to conduct complete admixture analysis utilizing baseline samples. The results demonstrate the mobilization of targeted SNP panels and machine learning in support of at-risk species conservation and management.
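The panel comparison in this abstract rests on the coefficient of determination from a simple linear regression of one set of admixture estimates on another. A minimal sketch of that calculation follows, using the fact that for a one-predictor least-squares fit r² equals the squared Pearson correlation. The function name `r_squared` and the toy values are assumptions for illustration only; they are not data from the study and this does not use the salmoneuadmix package.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    For a single predictor this equals the squared Pearson correlation.
    x, y: equal-length sequences of per-individual admixture estimates
          (hypothetical inputs, e.g. reduced-panel vs. full-panel values).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r squared is undefined when either input is constant")
    return (s_xy * s_xy) / (s_xx * s_yy)


# Toy values, purely for demonstration.
full_panel = [0.0, 1.0, 2.0, 3.0]
reduced_panel = [0.0, 1.0, 0.0, 1.0]
fit_quality = r_squared(full_panel, reduced_panel)
```

A value near 1 would indicate that the reduced panel tracks the full-panel estimates closely, which is the sense in which the abstract describes the SSR panels as replicating them with low accuracy.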
Affiliation(s)
- Cameron M Nugent
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
- Tony Kess
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
- Matthew K Brachmann
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
- Barbara L Langille
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
- Melissa K Holborn
- Fisheries and Oceans Canada, Bedford Institute of Oceanography, Dartmouth, Nova Scotia, Canada
- Samantha V Beck
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
- Biology Department, Dalhousie University, Halifax, Nova Scotia, Canada
- Institute for Biodiversity and Freshwater Conservation, University of the Highlands and Islands, Inverness, UK
- Nicole Smith
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
- Steven J Duffy
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
- Sarah J Lehnert
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
- Brendan F Wringe
- Fisheries and Oceans Canada, Bedford Institute of Oceanography, Dartmouth, Nova Scotia, Canada
- Paul Bentzen
- Biology Department, Dalhousie University, Halifax, Nova Scotia, Canada
- Ian R Bradbury
- Fisheries and Oceans Canada, Northwest Atlantic Fisheries Centre, St. John's, Newfoundland and Labrador, Canada
3
Gopalan S, Smith SP, Korunes K, Hamid I, Ramachandran S, Goldberg A. Human genetic admixture through the lens of population genomics. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200410. [PMID: 35430881 PMCID: PMC9014191 DOI: 10.1098/rstb.2020.0410] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022] Open
Abstract
Over the past 50 years, geneticists have made great strides in understanding how our species' evolutionary history gave rise to current patterns of human genetic diversity classically summarized by Lewontin in his 1972 paper, ‘The Apportionment of Human Diversity’. One evolutionary process that requires special attention in both population genetics and statistical genetics is admixture: gene flow between two or more previously separated source populations to form a new admixed population. The admixture process introduces ancestry-based structure into patterns of genetic variation within and between populations, which in turn influences the inference of demographic histories, identification of genetic targets of selection and prediction of complex traits. In this review, we outline some challenges for admixture population genetics, including limitations of applying methods designed for populations without recent admixture to the study of admixed populations. We highlight recent studies and methodological advances that aim to overcome such challenges, leveraging genomic signatures of admixture that occurred in the past tens of generations to gain insights into human history, natural selection and complex trait architecture. This article is part of the theme issue ‘Celebrating 50 years since Lewontin's apportionment of human diversity’.
Affiliation(s)
- Shyamalika Gopalan
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Samuel Pattillo Smith
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
- Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912, USA
- Katharine Korunes
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Iman Hamid
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
- Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912, USA
- Data Science Initiative, Brown University, Providence, RI 02912, USA
- Amy Goldberg
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
4
Cuadros-Espinoza S, Laval G, Quintana-Murci L, Patin E. The genomic signatures of natural selection in admixed human populations. Am J Hum Genet 2022; 109:710-726. [PMID: 35259336 DOI: 10.1016/j.ajhg.2022.02.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/15/2021] [Accepted: 02/14/2022] [Indexed: 12/15/2022] Open
Abstract
Admixture has been a pervasive phenomenon in human history, extensively shaping the patterns of population genetic diversity. There is increasing evidence to suggest that admixture can also facilitate genetic adaptation to local environments, i.e., admixed populations acquire beneficial mutations from source populations, a process that we refer to as "adaptive admixture." However, the role of adaptive admixture in human evolution and the power to detect it remain poorly characterized. Here, we use extensive computer simulations to evaluate the power of several neutrality statistics to detect natural selection in the admixed population, assuming multiple admixture scenarios. We show that statistics based on admixture proportions, Fadm and LAD, show high power to detect mutations that are beneficial in the admixed population, whereas other statistics, including iHS and FST, falsely detect neutral mutations that have been selected in the source populations only. By combining Fadm and LAD into a single, powerful statistic, we scanned the genomes of 15 worldwide, admixed populations for signatures of adaptive admixture. We confirm that lactase persistence and resistance to malaria have been under adaptive admixture in West Africans and in Malagasy, North Africans, and South Asians, respectively. Our approach also uncovers other cases of adaptive admixture, including APOL1 in Fulani nomads and PKN2 in East Indonesians, involved in resistance to infection and metabolism, respectively. Collectively, our study provides evidence that adaptive admixture has occurred in human populations whose genetic history is characterized by periods of isolation and spatial expansions resulting in increased gene flow.
5
Fountain-Jones NM, Smith ML, Austerlitz F. Machine learning in molecular ecology. Mol Ecol Resour 2021; 21:2589-2597. [PMID: 34738721 DOI: 10.1111/1755-0998.13532] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/13/2021] [Revised: 10/15/2021] [Accepted: 10/18/2021] [Indexed: 12/26/2022]
Affiliation(s)
- Megan L Smith
- Department of Biology, Indiana University, Bloomington, Indiana, USA
6
Choin J, Mendoza-Revilla J, Arauna LR, Cuadros-Espinoza S, Cassar O, Larena M, Ko AMS, Harmant C, Laurent R, Verdu P, Laval G, Boland A, Olaso R, Deleuze JF, Valentin F, Ko YC, Jakobsson M, Gessain A, Excoffier L, Stoneking M, Patin E, Quintana-Murci L. Genomic insights into population history and biological adaptation in Oceania. Nature 2021; 592:583-589. [PMID: 33854233 DOI: 10.1038/s41586-021-03236-5] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 05/20/2020] [Accepted: 01/13/2021] [Indexed: 12/27/2022]
Abstract
The Pacific region is of major importance for addressing questions regarding human dispersals, interactions with archaic hominins and natural selection processes [1]. However, the demographic and adaptive history of Oceanian populations remains largely uncharacterized. Here we report high-coverage genomes of 317 individuals from 20 populations from the Pacific region. We find that the ancestors of Papuan-related ('Near Oceanian') groups underwent a strong bottleneck before the settlement of the region, and separated around 20,000-40,000 years ago. We infer that the East Asian ancestors of Pacific populations may have diverged from Taiwanese Indigenous peoples before the Neolithic expansion, which is thought to have started from Taiwan around 5,000 years ago [2-4]. Additionally, this dispersal was not followed by an immediate, single admixture event with Near Oceanian populations, but involved recurrent episodes of genetic interactions. Our analyses reveal marked differences in the proportion and nature of Denisovan heritage among Pacific groups, suggesting that independent interbreeding with highly structured archaic populations occurred. Furthermore, whereas introgression of Neanderthal genetic information facilitated the adaptation of modern humans related to multiple phenotypes (for example, metabolism, pigmentation and neuronal development), Denisovan introgression was primarily beneficial for immune-related functions. Finally, we report evidence of selective sweeps and polygenic adaptation associated with pathogen exposure and lipid metabolism in the Pacific region, increasing our understanding of the mechanisms of biological adaptation to island environments.
Affiliation(s)
- Jeremy Choin
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris, France
- Université Paris Diderot, Sorbonne Paris Cité, Paris, France
- Lara R Arauna
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris, France
- Sebastian Cuadros-Espinoza
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris, France
- Sorbonne Université, Collège doctoral, Paris, France
- Olivier Cassar
- Oncogenic Virus Epidemiology and Pathophysiology, Institut Pasteur, UMR 3569, CNRS, Paris, France
- Maximilian Larena
- Human Evolution, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Albert Min-Shan Ko
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, China
- Christine Harmant
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris, France
- Romain Laurent
- Muséum National d'Histoire Naturelle, UMR7206, CNRS, Université de Paris, Paris, France
- Paul Verdu
- Muséum National d'Histoire Naturelle, UMR7206, CNRS, Université de Paris, Paris, France
- Guillaume Laval
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris, France
- Anne Boland
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, France
- Robert Olaso
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, France
- Jean-François Deleuze
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, France
- Frédérique Valentin
- Maison de l'Archéologie et de l'Ethnologie, UMR 7041, CNRS, Nanterre, France
- Ying-Chin Ko
- Environment-Omics-Disease Research Center, China Medical University and Hospital, Taichung, Taiwan
- Mattias Jakobsson
- Human Evolution, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Antoine Gessain
- Oncogenic Virus Epidemiology and Pathophysiology, Institut Pasteur, UMR 3569, CNRS, Paris, France
- Laurent Excoffier
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Mark Stoneking
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Etienne Patin
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris, France
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris, France
- Collège de France, Paris, France
7
Yelmen B, Decelle A, Ongaro L, Marnetto D, Tallec C, Montinaro F, Furtlehner C, Pagani L, Jay F. Creating artificial human genomes using generative neural networks. PLoS Genet 2021; 17:e1009303. [PMID: 33539374 PMCID: PMC7861435 DOI: 10.1371/journal.pgen.1009303] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/23/2020] [Accepted: 12/08/2020] [Indexed: 12/13/2022] Open
Abstract
Generative models have shown breakthroughs in a wide spectrum of domains due to recent advancements in machine learning algorithms and increased computational power. Despite these impressive achievements, the ability of generative models to create realistic synthetic data is still under-exploited in genetics and absent from population genetics. Yet a known limitation in the field is the reduced access to many genetic databases due to concerns about violations of individual privacy, although they would provide a rich resource for data mining and integration towards advancing genetic studies. In this study, we demonstrated that deep generative adversarial networks (GANs) and restricted Boltzmann machines (RBMs) can be trained to learn the complex distributions of real genomic datasets and generate novel high-quality artificial genomes (AGs) with little to no privacy loss. We show that our generated AGs replicate characteristics of the source dataset such as allele frequencies, linkage disequilibrium, pairwise haplotype distances and population structure. Moreover, they can also inherit complex features such as signals of selection. To illustrate the promising outcomes of our method, we showed that imputation quality for low frequency alleles can be improved by data augmentation to reference panels with AGs and that the RBM latent space provides a relevant encoding of the data, hence allowing further exploration of the reference dataset and features for solving supervised tasks. Generative models and AGs have the potential to become valuable assets in genetic studies by providing a rich yet compact representation of existing genomes and high-quality, easy-access and anonymous alternatives for private databases.
Affiliation(s)
- Burak Yelmen
- Institute of Genomics, University of Tartu, Tartu, Estonia
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Sud, Université Paris-Saclay, Paris, France
- Aurélien Decelle
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Sud, Université Paris-Saclay, Paris, France
- Departamento de Física Teórica I, Universidad Complutense, Madrid, Spain
- Linda Ongaro
- Institute of Genomics, University of Tartu, Tartu, Estonia
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Corentin Tallec
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Sud, Université Paris-Saclay, Paris, France
- Francesco Montinaro
- Institute of Genomics, University of Tartu, Tartu, Estonia
- Department of Biology-Genetics, University of Bari, Bari, Italy
- Cyril Furtlehner
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Sud, Université Paris-Saclay, Paris, France
- Luca Pagani
- Institute of Genomics, University of Tartu, Tartu, Estonia
- APE Lab, Department of Biology, University of Padova, Padova, Italy
- Flora Jay
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Sud, Université Paris-Saclay, Paris, France
8
Hamid I, Korunes KL, Beleza S, Goldberg A. Rapid adaptation to malaria facilitated by admixture in the human population of Cabo Verde. eLife 2021; 10:e63177. [PMID: 33393457 PMCID: PMC7815310 DOI: 10.7554/elife.63177] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 09/16/2020] [Accepted: 01/04/2021] [Indexed: 12/20/2022] Open
Abstract
Humans have undergone large migrations over the past hundreds to thousands of years, exposing ourselves to new environments and selective pressures. Yet, evidence of ongoing or recent selection in humans is difficult to detect. Many of these migrations also resulted in gene flow between previously separated populations. These recently admixed populations provide unique opportunities to study rapid evolution in humans. Developing methods based on distributions of local ancestry, we demonstrate that this sort of genetic exchange has facilitated detectable adaptation to a malaria parasite in the admixed population of Cabo Verde within the last ~20 generations. We estimate that the selection coefficient is approximately 0.08, one of the highest inferred in humans. Notably, we show that this strong selection at a single locus has likely affected patterns of ancestry genome-wide, potentially biasing demographic inference. Our study provides evidence of adaptation in a human population on historical timescales.
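To give a feel for what a selection coefficient of about 0.08 acting over roughly 20 generations can do, the sketch below iterates a textbook deterministic recursion for a favored variant. The haploid-style (genic) selection model, the starting frequency of 0.5, and the function names are all assumptions chosen for illustration; the study's own inference is based on local-ancestry distributions and may assume a different selection model, so this is not a reconstruction of its method or results.

```python
def next_frequency(p, s):
    """One generation of deterministic genic selection (assumed model).

    The favored variant has relative fitness 1 + s, the alternative has 1.
    """
    return p * (1 + s) / (1 + s * p)


def trajectory(p0, s, generations):
    """Frequencies of the favored variant from generation 0 onward."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


# s and the generation count echo the abstract; p0 is a hypothetical value.
freqs = trajectory(p0=0.5, s=0.08, generations=20)
```

Under this assumed model the odds p/(1-p) grow by a factor of (1+s) each generation, so 20 generations multiply the starting odds by (1.08)^20, which conveys why such a coefficient can leave a detectable signal on historical timescales.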
Affiliation(s)
- Iman Hamid
- Department of Evolutionary Anthropology, Duke University, Durham, United States
- Sandra Beleza
- Department of Genetics and Genome Biology, University of Leicester, Leicester, United Kingdom
- Amy Goldberg
- Department of Evolutionary Anthropology, Duke University, Durham, United States
9
Fortes-Lima C, Verdu P. Anthropological genetics perspectives on the transatlantic slave trade. Hum Mol Genet 2020; 30:R79-R87. [PMID: 33331897 DOI: 10.1093/hmg/ddaa271] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/18/2020] [Revised: 12/07/2020] [Accepted: 12/11/2020] [Indexed: 01/07/2023] Open
Abstract
During the Trans-Atlantic Slave Trade (TAST), around twelve million Africans were enslaved and forcibly moved from Africa to the Americas and Europe, durably influencing the genetic and cultural landscape of a large part of humanity since the 15th century. Following historians, archaeologists, and anthropologists, population geneticists have, mainly since the 1950s, extensively investigated the genetic diversity of populations on both sides of the Atlantic. These studies shed new light on the largely unknown genetic origins of numerous enslaved-African descendant communities in the Americas, by inferring their genetic relationships with extant African, European, and Native American populations. Furthermore, exploring genome-wide data with novel statistical and bioinformatics methods, population geneticists have been increasingly able to infer the last 500 years of admixture histories of these populations. These inferences have highlighted the diversity of histories experienced by enslaved-African descendants, and the complex influences of socioeconomic, political, and historical contexts on human genetic diversity patterns during and after the slave trade. Finally, the recent advances of paleogenomics unveiled crucial aspects of the life and health of the first generation of enslaved-Africans in the Americas. Altogether, human population genetics approaches in the genomic and paleogenomic era need to be coupled with history, archaeology, anthropology, and demography in interdisciplinary research, to reconstruct the multifaceted and largely unknown history of the TAST and its influence on human biological and cultural diversities today. Here, we review anthropological genomics studies published over the past 15 years that focus on the history of enslaved-African descendant populations in the Americas.
Affiliation(s)
- Cesar Fortes-Lima
- Sub-department of Human Evolution, Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, 75236, Sweden
- Paul Verdu
- Unité Mixte de Recherche 7206 Eco-Anthropology, CNRS-MNHN-Université de Paris, Musée de l'Homme, Paris, 75016, France