1
Lasky JR, Takou M, Gamba D, Keitt TH. Estimating scale-specific and localized spatial patterns in allele frequency. Genetics 2024; 227:iyae082. [PMID: 38758968 DOI: 10.1093/genetics/iyae082] [Received: 09/07/2023] [Revised: 09/07/2023] [Accepted: 04/28/2024] [Indexed: 05/19/2024]
Abstract
Characterizing spatial patterns in allele frequencies is fundamental to evolutionary biology because these patterns contain evidence of underlying processes. However, the spatial scales at which gene flow, changing selection, and drift act are often unknown. Many of these processes can operate inconsistently across space, causing nonstationary patterns. We present a wavelet approach to characterize spatial pattern in allele frequency that helps solve these problems. We show how our approach can characterize spatial patterns in relatedness at multiple spatial scales, i.e. a multilocus wavelet genetic dissimilarity. We also develop wavelet tests of spatial differentiation in allele frequency and quantitative trait loci (QTL). With simulation, we illustrate these methods under different scenarios. We also apply our approach to natural populations of Arabidopsis thaliana to characterize population structure and identify locally adapted loci across scales. We find, for example, that Arabidopsis flowering time QTL show significantly elevated genetic differentiation at 300-1,300 km scales. Wavelet transforms of allele frequencies offer a flexible way to reveal geographic patterns and underlying evolutionary processes.
Affiliation(s)
- Jesse R Lasky
  - Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Margarita Takou
  - Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Diana Gamba
  - Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Timothy H Keitt
  - Department of Integrative Biology, University of Texas at Austin, Austin, TX 78712, USA
2
Groh JS, Coop G. The temporal and genomic scale of selection following hybridization. Proc Natl Acad Sci U S A 2024; 121:e2309168121. [PMID: 38489387 PMCID: PMC10962946 DOI: 10.1073/pnas.2309168121] [Received: 05/31/2023] [Accepted: 01/30/2024] [Indexed: 03/17/2024]
Abstract
Genomic evidence supports an important role for selection in shaping patterns of introgression along the genome, but frameworks for understanding the evolutionary dynamics within hybrid populations that underlie these patterns have been lacking. Due to the clock-like effect of recombination in hybrids breaking up parental haplotypes, drift and selection produce predictable patterns of ancestry variation at varying spatial genomic scales through time. Here, we develop methods based on the Discrete Wavelet Transform to study the genomic scale of local ancestry variation and its association with recombination rates and show that these methods capture temporal dynamics of drift and genome-wide selection after hybridization. We apply these methods to published datasets from hybrid populations of swordtail fish (Xiphophorus) and baboons (Papio) and to inferred Neanderthal introgression in modern humans. Across systems, upward of 20% of variation in local ancestry at the broadest genomic scales can be attributed to systematic selection against introgressed alleles, consistent with strong selection acting on early-generation hybrids. Signatures of selection at fine genomic scales suggest selection over longer time scales; however, we suggest that our ability to confidently infer selection at fine scales is likely limited by inherent biases in current methods for estimating local ancestry from contiguous segments of genomic similarity. Wavelet approaches will become widely applicable as genomic data from systems with introgression become increasingly available and can help shed light on generalities of the genomic consequences of interspecific hybridization.
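The idea of splitting ancestry variance by genomic scale can be illustrated with a minimal sketch. It assumes an orthonormal Haar wavelet, equally spaced windows, and a signal whose length is a power of two; the function name `haar_variance_by_scale` and the toy `ancestry` signal are hypothetical and are not taken from the authors' software.

```python
import numpy as np

def haar_variance_by_scale(signal):
    """Split the (population) variance of a signal into per-scale parts
    using an orthonormal Haar discrete wavelet transform.

    Entry j of the result is the share of variance carried by detail
    coefficients at scale 2**j windows. Assumes len(signal) is a power of two.
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two (>= 2)")
    contributions = []
    approx = x
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)   # differences at this scale
        approx = (even + odd) / np.sqrt(2.0)   # smoothed signal for the next scale
        contributions.append(float(np.sum(detail ** 2) / n))
    return contributions

# Toy local-ancestry signal (hypothetical): 64 windows in blocks of 8
# identical values, so all variation sits at scales of 8 windows or more.
rng = np.random.default_rng(0)
ancestry = np.repeat(rng.random(8), 8)
by_scale = haar_variance_by_scale(ancestry)
```

Because the transform is orthonormal, the per-scale contributions sum to the total variance of the signal, which is what makes a scale-by-scale decomposition interpretable. In the toy signal the three finest scales contribute nothing, since variation only occurs between blocks.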
Affiliation(s)
- Jeffrey S. Groh
  - Department of Evolution and Ecology and Center for Population Biology, University of California, Davis, CA 95616
- Graham Coop
  - Department of Evolution and Ecology and Center for Population Biology, University of California, Davis, CA 95616
3
Zhang S, Zhang R, Yuan K, Yang L, Liu C, Liu Y, Ni X, Xu S. Reconstructing complex admixture history using a hierarchical model. Brief Bioinform 2024; 25:bbad540. [PMID: 38261339 PMCID: PMC10805183 DOI: 10.1093/bib/bbad540] [Received: 10/20/2023] [Revised: 12/04/2023] [Accepted: 12/22/2023] [Indexed: 01/24/2024]
Abstract
Various methods have been proposed to reconstruct admixture histories by analyzing the length of ancestral chromosomal tracts, such as estimating the admixture time and number of admixture events. However, available methods do not explicitly consider the complex admixture structure, which characterizes the joining and mixing patterns of different ancestral populations during the admixture process, and instead assume a simplified one-by-one sequential admixture model. In this study, we proposed a novel approach that considers the non-sequential admixture structure to reconstruct admixture histories. Specifically, we introduced a hierarchical admixture model that incorporated four ancestral populations and developed a new method, called HierarchyMix, which uses the length of ancestral tracts and the number of ancestry switches along genomes to reconstruct the four-way admixture history. By automatically selecting the optimal admixture model using the Bayesian information criterion principles, HierarchyMix effectively estimates the corresponding admixture parameters. Simulation studies confirmed the effectiveness and robustness of HierarchyMix. We also applied HierarchyMix to Uyghurs and Kazakhs, enabling us to reconstruct the admixture histories of Central Asians. Our results highlight the importance of considering complex admixture structures and demonstrate that HierarchyMix is a useful tool for analyzing complex admixture events.
Affiliation(s)
- Shi Zhang
  - School of Mathematics and Statistics, Beijing Jiaotong University, Beijing, 100044, China
- Rui Zhang
  - Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Kai Yuan
  - Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Lu Yang
  - School of Mathematics and Statistics, Beijing Jiaotong University, Beijing, 100044, China
- Chang Liu
  - Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yuting Liu
  - School of Mathematics and Statistics, Beijing Jiaotong University, Beijing, 100044, China
- Xumin Ni
  - School of Mathematics and Statistics, Beijing Jiaotong University, Beijing, 100044, China
- Shuhua Xu
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, Zhangjiang Fudan International Innovation Center, Center for Evolutionary Biology, School of Life Sciences, Department of Liver Surgery and Transplantation Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China
  - Ministry of Education Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai 201203, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
4
Oliveira S, Fehn AM, Amorim B, Stoneking M, Rocha J. Genome-wide variation in the Angolan Namib Desert reveals unique pre-Bantu ancestry. Sci Adv 2023; 9:eadh3822. [PMID: 37738339 PMCID: PMC10516492 DOI: 10.1126/sciadv.adh3822] [Received: 03/02/2023] [Accepted: 08/18/2023] [Indexed: 09/24/2023]
Abstract
Ancient DNA studies reveal the genetic structure of Africa before the expansion of Bantu-speaking agriculturalists; however, the impact of now extinct hunter-gatherer and herder societies on the genetic makeup of present-day African groups remains elusive. Here, we uncover the genetic legacy of pre-Bantu populations from the Angolan Namib Desert, where we located small-scale groups associated with enigmatic forager traditions, as well as the last speakers of the Khoe-Kwadi family's Kwadi branch. By applying an ancestry decomposition approach to genome-wide data from these and other African populations, we reconstructed the fine-scale histories of contact emerging from the migration of Khoe-Kwadi-speaking pastoralists and identified a deeply divergent ancestry, which is exclusively shared between groups from the Angolan Namib and adjacent areas of Namibia. The unique genetic heritage of the Namib peoples shows how modern DNA research targeting understudied regions of high ethnolinguistic diversity can complement ancient DNA studies in probing the deep genetic structure of the African continent.
Affiliation(s)
- Sandra Oliveira
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661 Vairão, Portugal
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
  - Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, 3012 Bern, Switzerland
- Anne-Maria Fehn
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661 Vairão, Portugal
  - Biopolis Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, 4485-661 Vairão, Portugal
- Beatriz Amorim
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661 Vairão, Portugal
  - Biopolis Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, 4485-661 Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4099-002 Porto, Portugal
- Mark Stoneking
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
  - Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive, UMR 5558 Villeurbanne, France
- Jorge Rocha
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661 Vairão, Portugal
  - Biopolis Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, 4485-661 Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4099-002 Porto, Portugal
5
Moorjani P, Hellenthal G. Methods for Assessing Population Relationships and History Using Genomic Data. Annu Rev Genomics Hum Genet 2023; 24:305-332. [PMID: 37220313 PMCID: PMC11040641 DOI: 10.1146/annurev-genom-111422-025117] [Indexed: 05/25/2023]
Abstract
Genetic data contain a record of our evolutionary history. The availability of large-scale datasets of human populations from various geographic areas and timescales, coupled with advances in the computational methods to analyze these data, has transformed our ability to use genetic data to learn about our evolutionary past. Here, we review some of the widely used statistical methods to explore and characterize population relationships and history using genomic data. We describe the intuition behind commonly used approaches, their interpretation, and important limitations. For illustration, we apply some of these techniques to genome-wide autosomal data from 929 individuals representing 53 worldwide populations that are part of the Human Genome Diversity Project. Finally, we discuss the new frontiers in genomic methods to learn about population history. In sum, this review highlights the power (and limitations) of DNA to infer features of human evolutionary history, complementing the knowledge gleaned from other disciplines, such as archaeology, anthropology, and linguistics.
Affiliation(s)
- Priya Moorjani
  - Department of Molecular and Cell Biology and Center for Computational Biology, University of California, Berkeley, California, USA
- Garrett Hellenthal
  - UCL Genetics Institute and Research Department of Genetics, Evolution, and Environment, University College London, London, United Kingdom
6
Tan T, Atkinson EG. Strategies for the Genomic Analysis of Admixed Populations. Annu Rev Biomed Data Sci 2023; 6:105-127. [PMID: 37127050 PMCID: PMC10871708 DOI: 10.1146/annurev-biodatasci-020722-014310] [Indexed: 05/03/2023]
Abstract
Admixed populations constitute a large portion of global human genetic diversity, yet they are often left out of genomics analyses. This exclusion is problematic, as it leads to disparities in the understanding of the genetic structure and history of diverse cohorts and the performance of genomic medicine across populations. Admixed populations have particular statistical challenges, as they inherit genomic segments from multiple source populations-the primary reason they have historically been excluded from genetic studies. In recent years, however, an increasing number of statistical methods and software tools have been developed to account for and leverage admixture in the context of genomics analyses. Here, we provide a survey of such computational strategies for the informed consideration of admixture to allow for the well-calibrated inclusion of mixed ancestry populations in large-scale genomics studies, and we detail persisting gaps in existing tools.
Affiliation(s)
- Taotao Tan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Elizabeth G Atkinson
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
7
Groh J, Coop G. The temporal and genomic scale of selection following hybridization. bioRxiv 2023:2023.05.25.542345. [PMID: 37337589 PMCID: PMC10276902 DOI: 10.1101/2023.05.25.542345] [Indexed: 06/21/2023]
Abstract
Genomic evidence supports an important role for selection in shaping patterns of introgression along the genome, but frameworks for understanding the dynamics underlying these patterns within hybrid populations have been lacking. Here, we develop methods based on the Wavelet Transform to understand the spatial genomic scale of local ancestry variation and its association with recombination rates. We present theory and use simulations to show how wavelet-based decompositions of ancestry variance along the genome and the correlation between ancestry and recombination reflect the joint effects of recombination, genetic drift, and genome-wide selection against introgressed alleles. Due to the clock-like effect of recombination in hybrids breaking up parental haplotypes, drift and selection produce predictable patterns of local ancestry variation at varying spatial genomic scales through time. Using wavelet approaches to identify the genomic scale of variance in ancestry and its correlates, we show that these methods can detect temporally localized effects of drift and selection. We apply these methods to previously published datasets from hybrid populations of swordtail fish (Xiphophorus) and baboons (Papio), and to inferred Neanderthal introgression in modern humans. Across systems, we find that upwards of 20% of the variation in local ancestry at the broadest genomic scales can be attributed to systematic selection against introgressed alleles, consistent with strong selection acting on early-generation hybrids. We also see signals of selection at fine genomic scales and much longer time scales. However, we show that our ability to confidently infer selection at fine scales is likely limited by inherent biases in current methods for estimating local ancestry from genomic similarity. 
Wavelet approaches will become widely applicable as genomic data from systems with introgression become increasingly available, and can help shed light on generalities of the genomic consequences of interspecific hybridization.
Affiliation(s)
- Jeffrey Groh
  - Department of Evolution and Ecology, and Center for Population Biology, University of California, Davis, CA 95616
- Graham Coop
  - Department of Evolution and Ecology, and Center for Population Biology, University of California, Davis, CA 95616
8
Wang MS, Murray GGR, Mann D, Groves P, Vershinina AO, Supple MA, Kapp JD, Corbett-Detig R, Crump SE, Stirling I, Laidre KL, Kunz M, Dalén L, Green RE, Shapiro B. A polar bear paleogenome reveals extensive ancient gene flow from polar bears into brown bears. Nat Ecol Evol 2022; 6:936-944. [PMID: 35711062 DOI: 10.1038/s41559-022-01753-8] [Received: 08/27/2021] [Accepted: 03/30/2022] [Indexed: 11/09/2022]
Abstract
Polar bears (Ursus maritimus) and brown bears (Ursus arctos) are sister species possessing distinct physiological and behavioural adaptations that evolved over the last 500,000 years. However, comparative and population genomics analyses have revealed that several extant and extinct brown bear populations have relatively recent polar bear ancestry, probably as the result of geographically localized instances of gene flow from polar bears into brown bears. Here, we generate and analyse an approximate 20X paleogenome from an approximately 100,000-year-old polar bear that reveals a massive prehistoric admixture event, which is evident in the genomes of all living brown bears. This ancient admixture event was not visible from genomic data derived from living polar bears. Like more recent events, this massive admixture event mainly involved unidirectional gene flow from polar bears into brown bears and occurred as climate changes caused overlap in the ranges of the two species. These findings highlight the complex reticulate paths that evolution can take within a regime of radically shifting climate.
Affiliation(s)
- Ming-Shan Wang
  - Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA, USA
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
- Gemma G R Murray
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Daniel Mann
  - Department of Geosciences, University of Alaska, Fairbanks, AK, USA
  - Institute of Arctic Biology, University of Alaska, Fairbanks, AK, USA
- Pamela Groves
  - Institute of Arctic Biology, University of Alaska, Fairbanks, AK, USA
- Alisa O Vershinina
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
- Megan A Supple
  - Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA, USA
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
- Joshua D Kapp
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
- Russell Corbett-Detig
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Sarah E Crump
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
- Ian Stirling
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
  - Wildlife Research Division, Environment and Climate Change Canada Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Kristin L Laidre
  - Polar Science Center, Applied Physics Laboratory, University of Washington, Seattle, WA, USA
- Michael Kunz
  - University of Alaska Museum of the North, Fairbanks, AK, USA
- Love Dalén
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
  - Centre for Palaeogenetics, Stockholm, Sweden
- Richard E Green
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Beth Shapiro
  - Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA, USA
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
9
Xiong T, Li X, Yago M, Mallet J. Admixture of evolutionary rates across a butterfly hybrid zone. eLife 2022; 11:e78135. [PMID: 35703474 PMCID: PMC9246367 DOI: 10.7554/elife.78135] [Received: 02/23/2022] [Accepted: 06/14/2022] [Indexed: 12/26/2022]
Abstract
Hybridization is a major evolutionary force that can erode genetic differentiation between species, whereas reproductive isolation maintains such differentiation. In studying a hybrid zone between the swallowtail butterflies Papilio syfanius and Papilio maackii (Lepidoptera: Papilionidae), we made the unexpected discovery that genomic substitution rates are unequal between the parental species. This phenomenon creates a novel process in hybridization, where genomic regions most affected by gene flow evolve at similar rates between species, while genomic regions with strong reproductive isolation evolve at species-specific rates. Thus, hybridization mixes evolutionary rates in a way similar to its effect on genetic ancestry. Using coalescent theory, we show that the rate-mixing process provides distinct information about levels of gene flow across different parts of genomes, and the degree of rate-mixing can be predicted quantitatively from relative sequence divergence ([Formula: see text]) between the hybridizing species at equilibrium. Overall, we demonstrate that reproductive isolation maintains not only genomic differentiation, but also the rate at which differentiation accumulates. Thus, asymmetric rates of evolution provide an additional signature of loci involved in reproductive isolation.
Affiliation(s)
- Tianzhu Xiong
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Xueyan Li
  - Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Masaya Yago
  - The University Museum, The University of Tokyo, Tokyo, Japan
- James Mallet
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
10
Gopalan S, Smith SP, Korunes K, Hamid I, Ramachandran S, Goldberg A. Human genetic admixture through the lens of population genomics. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200410. [PMID: 35430881 PMCID: PMC9014191 DOI: 10.1098/rstb.2020.0410] [Indexed: 12/13/2022]
Abstract
Over the past 50 years, geneticists have made great strides in understanding how our species' evolutionary history gave rise to current patterns of human genetic diversity classically summarized by Lewontin in his 1972 paper, ‘The Apportionment of Human Diversity’. One evolutionary process that requires special attention in both population genetics and statistical genetics is admixture: gene flow between two or more previously separated source populations to form a new admixed population. The admixture process introduces ancestry-based structure into patterns of genetic variation within and between populations, which in turn influences the inference of demographic histories, identification of genetic targets of selection and prediction of complex traits. In this review, we outline some challenges for admixture population genetics, including limitations of applying methods designed for populations without recent admixture to the study of admixed populations. We highlight recent studies and methodological advances that aim to overcome such challenges, leveraging genomic signatures of admixture that occurred in the past tens of generations to gain insights into human history, natural selection and complex trait architecture. This article is part of the theme issue ‘Celebrating 50 years since Lewontin's apportionment of human diversity’.
Affiliation(s)
- Shyamalika Gopalan
  - Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Samuel Pattillo Smith
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
  - Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912, USA
- Katharine Korunes
  - Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Iman Hamid
  - Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Sohini Ramachandran
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
  - Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912, USA
  - Data Science Initiative, Brown University, Providence, RI 02912, USA
- Amy Goldberg
  - Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
11
Overlapping haplotype blocks indicate shared genomic regions between a composite beef cattle breed and its founder breeds. Livest Sci 2021. [DOI: 10.1016/j.livsci.2021.104747] [Indexed: 11/22/2022]
12
Iasi LNM, Ringbauer H, Peter BM. An Extended Admixture Pulse Model Reveals the Limitations to Human-Neandertal Introgression Dating. Mol Biol Evol 2021; 38:5156-5174. [PMID: 34254144 PMCID: PMC8557420 DOI: 10.1093/molbev/msab210] [Indexed: 11/12/2022]
Abstract
Neandertal DNA makes up 2-3% of the genomes of all non-African individuals. The patterns of Neandertal ancestry in modern humans have been used to estimate that this is the result of gene flow that occurred during the expansion of modern humans into Eurasia, but the precise dates of this event remain largely unknown. Here, we introduce an extended admixture pulse model that allows joint estimation of the timing and duration of gene flow. This model leads to simple expressions for both the admixture segment distribution and the decay curve of ancestry linkage disequilibrium, and we show that these two statistics are closely related. In simulations, we find that estimates of the mean time of admixture are largely robust to details in gene flow models, but that the duration of the gene flow can only be recovered if gene flow is very recent and the exact recombination map is known. These results imply that gene flow from Neandertals into modern humans could have happened over hundreds of generations. Ancient genomes from the time around the admixture event are thus likely required to resolve the question when, where, and for how long humans and Neandertals interacted.
Affiliation(s)
- Leonardo N M Iasi
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Harald Ringbauer
  - Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Benjamin M Peter
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
13
Isshiki M, Naka I, Kimura R, Nishida N, Furusawa T, Natsuhara K, Yamauchi T, Nakazawa M, Ishida T, Inaoka T, Matsumura Y, Ohtsuka R, Ohashi J. Admixture with indigenous people helps local adaptation: admixture-enabled selection in Polynesians. BMC Ecol Evol 2021; 21:179. [PMID: 34551727 PMCID: PMC8456657 DOI: 10.1186/s12862-021-01900-y] [Received: 01/12/2021] [Accepted: 08/25/2021] [Indexed: 01/08/2023]
Abstract
Background: Homo sapiens have experienced admixture many times in the last few thousand years. To examine how admixture affects local adaptation, we investigated genomes of modern Polynesians, who are shaped through admixture between Austronesian-speaking people from Southeast Asia (Asian-related ancestors) and indigenous people in Near Oceania (Papuan-related ancestors). Methods: In this study, local ancestry was estimated across the genome in Polynesians (23 Tongan subjects) to find the candidate regions of admixture-enabled selection contributed by Papuan-related ancestors. Results: The mean proportion of Papuan-related ancestry across the Polynesian genome was estimated as 24.6% (SD = 8.63%), and two genomic regions, the extended major histocompatibility complex (xMHC) region on chromosome 6 and the ATP-binding cassette transporter sub-family C member 11 (ABCC11) gene on chromosome 16, showed proportions of Papuan-related ancestry more than 5 SD greater than the mean (> 67.8%). The coalescent simulation under the assumption of selective neutrality suggested that such signals of Papuan-related ancestry enrichment were caused by positive selection after admixture (false discovery rate = 0.045). The ABCC11 harbors a nonsynonymous SNP, rs17822931, which affects apocrine secretory cell function. The approximate Bayesian computation indicated that, in Polynesian ancestors, a strong positive selection (s = 0.0217) acted on the ancestral allele of rs17822931 derived from Papuan-related ancestors. Conclusions: Our results suggest that admixture with Papuan-related ancestors contributed to the rapid local adaptation of Polynesian ancestors. Considering frequent admixture events in human evolution history, the acceleration of local adaptation through admixture should be a common event in humans.
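The outlier cutoff quoted in the Results can be checked with a short calculation. This is a minimal sketch using only the rounded mean and SD reported in the abstract; the helper `is_ancestry_outlier` and the example window values are hypothetical and are not the authors' pipeline.

```python
# Values reported in the abstract (percent Papuan-related ancestry).
mean_papuan = 24.6
sd_papuan = 8.63
n_sd = 5

# Cutoff for "more than 5 SD greater than the mean".
threshold = mean_papuan + n_sd * sd_papuan  # about 67.75 from the rounded inputs

def is_ancestry_outlier(local_ancestry_pct, cutoff=threshold):
    """Return True if a window's Papuan-related ancestry exceeds the cutoff."""
    return local_ancestry_pct > cutoff

# Hypothetical per-window ancestry percentages, for illustration only.
example_windows = [20.0, 70.1, 67.0]
flags = [is_ancestry_outlier(p) for p in example_windows]
```

From the rounded inputs the cutoff comes out at about 67.75%, which agrees with the reported value of roughly 67.8%; the small difference would be expected if the published figure was computed from unrounded estimates.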
Affiliation(s)
- Mariko Isshiki
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan
- Izumi Naka
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan
- Ryosuke Kimura
- Department of Human Biology and Anatomy, Graduate School of Medicine, University of the Ryukyus, Nishihara, 903-0125, Japan
- Nao Nishida
- Genome Medical Science Project, Research Center for Hepatitis and Immunology, National Center for Global Health and Medicine, Chiba, 272-8516, Japan
- Takuro Furusawa
- Graduate School of Asian and African Area Studies, Kyoto University, Kyoto, 606-8501, Japan
- Kazumi Natsuhara
- Department of International Health and Nursing, Faculty of Nursing, Toho University, Tokyo, 143-0015, Japan
- Taro Yamauchi
- Faculty of Health Sciences, Hokkaido University, Sapporo, 060-0812, Japan
- Minato Nakazawa
- Graduate School of Health Sciences, Kobe University, Kobe, 654-0142, Japan
- Takafumi Ishida
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan
- Tsukasa Inaoka
- Department of Human Ecology, Faculty of Agriculture, Saga University, Saga, 840-8502, Japan
- Yasuhiro Matsumura
- Faculty of Health and Nutrition, Bunkyo University, Chigasaki, 253-8550, Japan
- Jun Ohashi
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan.
14
Abstract
Humans reached the Mariana Islands in the western Pacific by ∼3,500 y ago, contemporaneous with or even earlier than the initial peopling of Polynesia. They crossed more than 2,000 km of open ocean to get there, whereas voyages of similar length did not occur anywhere else until more than 2,000 y later. Yet, the settlement of Polynesia has received far more attention than the settlement of the Marianas. There is uncertainty over both the origin of the first colonizers of the Marianas (with different lines of evidence suggesting variously the Philippines, Indonesia, New Guinea, or the Bismarck Archipelago) as well as what, if any, relationship they might have had with the first colonizers of Polynesia. To address these questions, we obtained ancient DNA data from two skeletons from the Ritidian Beach Cave Site in northern Guam, dating to ∼2,200 y ago. Analyses of complete mitochondrial DNA genome sequences and genome-wide SNP data strongly support ancestry from the Philippines, in agreement with some interpretations of the linguistic and archaeological evidence, but in contradiction to results based on computer simulations of sea voyaging. We also find a close link between the ancient Guam skeletons and early Lapita individuals from Vanuatu and Tonga, suggesting that the Marianas and Polynesia were colonized from the same source population, and raising the possibility that the Marianas played a role in the eventual settlement of Polynesia.
15
Cotton JA, Durrant C, Franssen SU, Gelanew T, Hailu A, Mateus D, Sanders MJ, Berriman M, Volf P, Miles MA, Yeo M. Genomic analysis of natural intra-specific hybrids among Ethiopian isolates of Leishmania donovani. PLoS Negl Trop Dis 2020; 14:e0007143. [PMID: 32310945 PMCID: PMC7237039 DOI: 10.1371/journal.pntd.0007143] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/07/2019] [Revised: 05/19/2020] [Accepted: 12/24/2019] [Indexed: 12/30/2022] Open
Abstract
Parasites of the genus Leishmania (Kinetoplastida: Trypanosomatidae) cause widespread and devastating human diseases. Visceral leishmaniasis due to Leishmania donovani is endemic in Ethiopia where it has also been responsible for major epidemics. The presence of hybrid genotypes has been widely reported in surveys of natural populations, genetic variation reported in a number of Leishmania species, and the extant capacity for genetic exchange demonstrated in laboratory experiments. However, patterns of recombination and the evolutionary history of admixture that produced these hybrid populations remain unclear. Here, we use whole-genome sequence data to investigate Ethiopian L. donovani isolates previously characterized as hybrids by microsatellite and multi-locus sequencing. To date there is only one previous study on a natural population of Leishmania hybrids based on whole-genome sequences. We propose that these hybrids originate from recombination between two different lineages of Ethiopian L. donovani occurring in the same region. Patterns of inheritance are more complex than previously reported with multiple, apparently independent, origins from similar parents that include backcrossing with parental types. Analysis indicates that hybrids are representative of at least three different histories. Furthermore, isolates were highly polysomic at the level of chromosomes with differences between parasites recovered from a recrudescent infection from a previously treated individual. The results demonstrate that recombination is a significant feature of natural populations and contributes to the growing body of data that shows how recombination, and gene flow, shape natural populations of Leishmania.
Affiliation(s)
- Tesfaye Gelanew
- Faculty of Medicine, Addis Ababa University, Addis Ababa, Ethiopia
- Asrat Hailu
- Faculty of Medicine, Addis Ababa University, Addis Ababa, Ethiopia
- David Mateus
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Petr Volf
- Department of Parasitology, Faculty of Science, Charles University, Prague, Czech Republic
- Michael A. Miles
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Matthew Yeo
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
16
Next generation sequencing of a set of ancestry-informative SNPs: ancestry assignment of three continental populations and estimating ancestry composition for Mongolians. Mol Genet Genomics 2020; 295:1027-1038. [PMID: 32206883 DOI: 10.1007/s00438-020-01660-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/04/2019] [Accepted: 02/27/2020] [Indexed: 12/31/2022]
Abstract
When traditional short tandem repeat profiling fails to provide information that helps apprehend the offender, forensic ancestry inference from biological samples left at the crime scene can offer investigative leads and facilitate the investigation. For this reason, there are continuing efforts to develop panels for ancestry inference in forensic science. In this study, a 30-plex next generation sequencing-based assay was developed by assembling well-differentiated single nucleotide polymorphisms for ancestry assignment of unknown individuals from three continental populations (African, European and East Asian). Relatively balanced population-specific differentiation values were maintained to avoid over-estimation or under-estimation of co-ancestry proportions in individuals with admixed ancestry. Principal component analysis and STRUCTURE analysis of reference populations, test populations and the studied Mongolian group indicated that the novel assay was efficient enough to determine the ancestry origin of an unknown individual from the three continental populations. In addition, ancestry membership proportion estimates for the Mongolian group revealed that most of the ancestry was contributed by the East Asian genetic component (approximately 83.9%), followed by the European (approximately 12.6%) and African (approximately 3.5%) genetic components. The next generation sequencing technology applied in this study also makes it possible to incorporate more single nucleotide polymorphisms for individual identification and phenotype prediction into the same assay, providing as many investigative clues as possible in the future.
17
Barbieri C, Barquera R, Arias L, Sandoval JR, Acosta O, Zurita C, Aguilar-Campos A, Tito-Álvarez AM, Serrano-Osuna R, Gray RD, Mafessoni F, Heggarty P, Shimizu KK, Fujita R, Stoneking M, Pugach I, Fehren-Schmitz L. The Current Genomic Landscape of Western South America: Andes, Amazonia, and Pacific Coast. Mol Biol Evol 2020; 36:2698-2713. [PMID: 31350885 PMCID: PMC6878948 DOI: 10.1093/molbev/msz174] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 12/12/2022] Open
Abstract
Studies of Native South American genetic diversity have helped to shed light on the peopling and differentiation of the continent, but available data are sparse for the major ecogeographic domains. These include the Pacific Coast, a potential early migration route; the Andes, home to the most expansive complex societies and to one of the most widely spoken indigenous language families of the continent (Quechua); and Amazonia, with its understudied population structure and rich cultural diversity. Here, we explore the genetic structure of 176 individuals from these three domains, genotyped with the Affymetrix Human Origins array. We infer multiple sources of ancestry within the Native American ancestry component; one with clear predominance on the Coast and in the Andes, and at least two distinct substrates in neighboring Amazonia, including a previously undetected ancestry characteristic of northern Ecuador and Colombia. Amazonian populations are also involved in recent gene-flow with each other and across ecogeographic domains, which does not accord with the traditional view of small, isolated groups. Long-distance genetic connections between speakers of the same language family suggest that indigenous languages here were spread not by cultural contact alone. Finally, Native American populations admixed with post-Columbian European and African sources at different times, with few cases of prolonged isolation. With our results we emphasize the importance of including understudied regions of the continent in high-resolution genetic studies, and we illustrate the potential of SNP chip arrays for informative regional-scale analysis.
Affiliation(s)
- Chiara Barbieri
- Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena, Germany; Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Rodrigo Barquera
- Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany
- Leonardo Arias
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- José R Sandoval
- Centro de Investigación de Genética y Biología Molecular (CIGBM), Universidad de San Martín de Porres, Lima, Peru
- Oscar Acosta
- Centro de Investigación de Genética y Biología Molecular (CIGBM), Universidad de San Martín de Porres, Lima, Peru
- Camilo Zurita
- Cátedra de Inmunología, Facultad de Medicina, Universidad Central del Ecuador, Quito, Ecuador; Zurita & Zurita Laboratorios, Unidad de Investigaciones en Biomedicina, Quito, Ecuador
- Abraham Aguilar-Campos
- Clinical Laboratory, Unidad Médica de Alta Especialidad (UMAE) # 2, Instituto Mexicano del Seguro Social (IMSS), Ciudad Obregón, Sonora, Mexico
- Ana M Tito-Álvarez
- Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, Quito, Ecuador
- Ricardo Serrano-Osuna
- Clinical Laboratory, Unidad Médica de Alta Especialidad (UMAE) # 2, Instituto Mexicano del Seguro Social (IMSS), Ciudad Obregón, Sonora, Mexico
- Russell D Gray
- Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena, Germany
- Fabrizio Mafessoni
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Paul Heggarty
- Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena, Germany
- Kentaro K Shimizu
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Ricardo Fujita
- Centro de Investigación de Genética y Biología Molecular (CIGBM), Universidad de San Martín de Porres, Lima, Peru
- Mark Stoneking
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Irina Pugach
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Lars Fehren-Schmitz
- UCSC Paleogenomics, Department of Anthropology, University of California, Santa Cruz, CA; Genomics Institute, University of California, Santa Cruz, CA
18
Saad MN, Mabrouk MS, Eldeib AM, Shaker OG. Studying the effects of haplotype partitioning methods on the RA-associated genomic results from the North American Rheumatoid Arthritis Consortium (NARAC) dataset. J Adv Res 2019; 18:113-126. [PMID: 30891314 PMCID: PMC6403413 DOI: 10.1016/j.jare.2019.01.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/05/2018] [Revised: 01/03/2019] [Accepted: 01/14/2019] [Indexed: 12/16/2022] Open
Abstract
Haplotype block methods play a complementary role to single-SNP approaches. CIT, FGT, SSLD, and single-SNP methods should all be applied to discover the markers. The choice of association method affects the biomarkers detected. The SSLD method detected more significant SNPs than the CIT, FGT, and single-SNP methods. The 383 SNPs discovered by all methods are significantly associated with RA.
The human genome, which includes thousands of genes, represents a big data challenge. Rheumatoid arthritis (RA) is a complex autoimmune disease with a genetic basis. Many single-nucleotide polymorphism (SNP) association methods partition a genome into haplotype blocks. The aim of this genome wide association study (GWAS) was to select the most appropriate haplotype block partitioning method for the North American Rheumatoid Arthritis Consortium (NARAC) dataset. The methods used for the NARAC dataset were the individual SNP approach and the following haplotype block methods: the four-gamete test (FGT), confidence interval test (CIT), and solid spine of linkage disequilibrium (SSLD). The measured parameters that reflect the strength of the association between the biomarker and RA were the P-value after Bonferroni correction and other parameters used to compare the output of each haplotype block method. This work presents a comparison among the individual SNP approach and the three haplotype block methods to select the method that can detect all the significant SNPs when applied alone. The GWAS results from the NARAC dataset obtained with the different methods are presented. The individual SNP, CIT, FGT, and SSLD methods detected 541, 1516, 1551, and 1831 RA-associated SNPs respectively, and the individual SNP, FGT, CIT, and SSLD methods detected 65, 156, 159, and 450 significant SNPs respectively, that were not detected by the other methods. Three hundred eighty-three SNPs were discovered by the haplotype block methods and the individual SNP approach, while 1021 SNPs were discovered by all three haplotype block methods. The 383 SNPs detected by all the methods are promising candidates for studying RA susceptibility. A hybrid technique involving all four methods should be applied to detect the significant SNPs associated with RA in the NARAC dataset, but the SSLD method may be preferred because of its advantages when only one method was used.
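The overlap counts reported above (SNPs found by every method versus SNPs found by one method only) amount to set intersections and differences over the per-method hit lists. The following is a minimal sketch with made-up SNP identifiers, not the NARAC results; the function name is hypothetical.

```python
def overlap_summary(hits: dict[str, set[str]]) -> tuple[set[str], dict[str, set[str]]]:
    """Return (SNPs detected by every method, SNPs detected by exactly one method, per method)."""
    shared = set.intersection(*hits.values())
    unique = {}
    for method, snps in hits.items():
        # Union of the hits reported by all the other methods.
        others = set().union(*(s for m, s in hits.items() if m != method))
        unique[method] = snps - others
    return shared, unique


# Illustrative hit lists only (invented identifiers).
hits = {
    "single_snp": {"rs1", "rs2", "rs3"},
    "CIT": {"rs1", "rs2", "rs4"},
    "FGT": {"rs1", "rs2", "rs5"},
    "SSLD": {"rs1", "rs2", "rs4", "rs6"},
}
shared, unique = overlap_summary(hits)
print(sorted(shared), {m: sorted(s) for m, s in unique.items()})
```

In this toy input `rs4` is reported by both CIT and SSLD, so it counts as neither shared by all methods nor unique to one, which mirrors why the reported categories do not sum to each method's total.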
Affiliation(s)
- Mohamed N Saad
- Biomedical Engineering Department, Faculty of Engineering, Minia University, Minia, Egypt
- Mai S Mabrouk
- Biomedical Engineering Department, Faculty of Engineering, Misr University for Science and Technology, 6th of October City, Egypt
- Ayman M Eldeib
- Systems and Biomedical Engineering Department, Faculty of Engineering, Cairo University, Giza, Egypt
- Olfat G Shaker
- Medical Biochemistry and Molecular Biology Department, Faculty of Medicine, Cairo University, Cairo, Egypt
19
Saad MN, Mabrouk MS, Eldeib AM, Shaker OG. Comparative study for haplotype block partitioning methods - Evidence from chromosome 6 of the North American Rheumatoid Arthritis Consortium (NARAC) dataset. PLoS One 2019; 13:e0209603. [PMID: 30596705 PMCID: PMC6312333 DOI: 10.1371/journal.pone.0209603] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/08/2018] [Accepted: 12/07/2018] [Indexed: 11/19/2022] Open
Abstract
Haplotype-based methods compete with “one-SNP-at-a-time” approaches on being preferred for association studies. Chromosome 6 contains most of the known genetic biomarkers for rheumatoid arthritis (RA) disease. Therefore, chromosome 6 serves as a benchmark for the haplotype methods testing. The aim of this study is to test the North American Rheumatoid Arthritis Consortium (NARAC) dataset to find out if haplotype block methods or single-locus approaches alone can sufficiently provide the significant single nucleotide polymorphisms (SNPs) associated with RA. In addition, could we be satisfied with only one method of the haplotype block methods for partitioning chromosome 6 of the NARAC dataset? In the NARAC dataset, chromosome 6 comprises 35,574 SNPs for 2,062 individuals (868 cases, 1,194 controls). Individual SNP approach and three haplotype block methods were applied to the NARAC dataset to identify the RA biomarkers. We employed three haplotype partitioning methods which are confidence interval test (CIT), four gamete test (FGT), and solid spine of linkage disequilibrium (SSLD). P-values after stringent Bonferroni correction for multiple testing were measured to assess the strength of association between the genetic variants and RA susceptibility. Moreover, the block size (in base pairs (bp) and number of SNPs included), number of blocks, percentage of uncovered SNPs by the block method, percentage of significant blocks from the total number of blocks, number of significant haplotypes and SNPs were used to compare among the three haplotype block methods. Individual SNP, CIT, FGT, and SSLD methods detected 432, 1,086, 1,099, and 1,322 associated SNPs, respectively. Each method identified significant SNPs that were not detected by any other method (Individual SNP: 12, FGT: 37, CIT: 55, and SSLD: 189 SNPs). 916 SNPs were discovered by all the three haplotype block methods. 367 SNPs were discovered by the haplotype block methods and the individual SNP approach. The P-values of these 367 SNPs were lower than those of the SNPs uniquely detected by only one method. The 367 SNPs detected by all the methods represent promising candidates for RA susceptibility. They should be further investigated for the European population. A hybrid technique including the four methods should be applied to detect the significant SNPs associated with RA for chromosome 6 of the NARAC dataset. Moreover, SSLD method may be preferred for its favored benefits in case of selecting only one method.
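Bonferroni correction divides the family-wise significance level by the number of tests, or equivalently multiplies each P-value by that number. The abstract does not state the significance level used, so α = 0.05 below is an assumption; 35,574 is the chromosome 6 SNP count given above, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def bonferroni_adjust(p_values: list[float]) -> list[float]:
    """Bonferroni-adjusted P-values: each raw P-value times the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Assumed alpha = 0.05; 35,574 single-SNP tests on chromosome 6.
per_snp_alpha = bonferroni_threshold(0.05, 35_574)  # roughly 1.4e-6
print(f"per-SNP threshold = {per_snp_alpha:.3e}")
```

If tests were run per haplotype block rather than per SNP, the number of tests (and so the threshold) would differ between partitioning methods; the abstract does not say which count was used for each method.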
Affiliation(s)
- Mohamed N. Saad
- Biomedical Engineering Department, Faculty of Engineering, Minia University, Minia, Egypt
- Mai S. Mabrouk
- Biomedical Engineering Department, Faculty of Engineering, Misr University for Science and Technology (MUST), 6th of October City, Egypt
- Ayman M. Eldeib
- Systems and Biomedical Engineering Department, Faculty of Engineering, Cairo University, Giza, Egypt
- Olfat G. Shaker
- Medical Biochemistry and Molecular Biology Department, Faculty of Medicine, Cairo University, Cairo, Egypt
20
Ni X, Yuan K, Liu C, Feng Q, Tian L, Ma Z, Xu S. MultiWaver 2.0: modeling discrete and continuous gene flow to reconstruct complex population admixtures. Eur J Hum Genet 2019; 27:133-139. [PMID: 30206356 PMCID: PMC6303267 DOI: 10.1038/s41431-018-0259-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/27/2018] [Revised: 07/12/2018] [Accepted: 08/09/2018] [Indexed: 11/08/2022] Open
Abstract
Our goal in developing the MultiWaver software series was to be able to infer population admixture history under various complex scenarios. The earlier version of MultiWaver considered only discrete admixture models. Here, we report a newly developed version, MultiWaver 2.0, that implements a more flexible framework and is capable of inferring multiple-wave admixture histories under both discrete and continuous admixture models. MultiWaver 2.0 can automatically select an optimal admixture model based on the length distribution of ancestral tracks of chromosomes, and the program can estimate the corresponding parameters under the selected model. Specifically, for discrete admixture models, we used a likelihood ratio test (LRT) to determine the optimal discrete model and an expectation-maximization algorithm to estimate the parameters. In addition, according to the principles of the Bayesian Information Criterion (BIC), we compared the optimal discrete model with several continuous admixture models. In MultiWaver 2.0, we also applied a bootstrapping technique to provide levels of support for the chosen model and the confidence interval (CI) of the estimations of admixture time. Simulation studies validated the reliability and effectiveness of our method. Finally, the program performed well when applied to real datasets of typical admixed populations, such as African Americans, Uyghurs, and Hazaras.
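The comparison between the best discrete model and the continuous models can be illustrated with the standard BIC formula, BIC = k ln(n) − 2 ln(L̂), where k is the number of free parameters, n the number of observations, and lower values are preferred. The model names, log-likelihoods, parameter counts, and track count below are invented for illustration and are not MultiWaver 2.0 output.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits: model name -> (maximized log-likelihood, number of free parameters).
candidates = {
    "discrete_two_wave": (-1520.4, 4),
    "continuous_model_A": (-1523.1, 3),
    "continuous_model_B": (-1531.8, 3),
}
n_tracks = 5000  # assumed number of observed ancestral tracks

scores = {name: bic(ll, k, n_tracks) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best, {name: round(score, 1) for name, score in scores.items()})
```

With these made-up numbers the simpler continuous model wins even though the discrete model has the higher likelihood, because its extra parameter costs more than the likelihood gain is worth.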
Affiliation(s)
- Xumin Ni
- Department of Mathematics, School of Science, Beijing Jiaotong University, Beijing, 100044, China
- Kai Yuan
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, CAS, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Chang Liu
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, CAS, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Qidi Feng
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, CAS, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Lei Tian
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, CAS, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhiming Ma
- Department of Mathematics, School of Science, Beijing Jiaotong University, Beijing, 100044, China.
- University of Chinese Academy of Sciences, Beijing, 100049, China.
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China.
- Shuhua Xu
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, CAS, Shanghai, 200031, China.
- University of Chinese Academy of Sciences, Beijing, 100049, China.
- School of Life Science and Technology, ShanghaiTech University, Shanghai, 201210, China.
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China.
- Collaborative Innovation Center of Genetics and Development, Shanghai, 200438, China.
21
Chimusa ER, Defo J, Thami PK, Awany D, Mulisa DD, Allali I, Ghazal H, Moussa A, Mazandu GK. Dating admixture events is unsolved problem in multi-way admixed populations. Brief Bioinform 2018; 21:144-155. [PMID: 30462157 DOI: 10.1093/bib/bby112] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/05/2018] [Revised: 10/12/2018] [Accepted: 10/15/2018] [Indexed: 12/12/2022] Open
Abstract
Advances in human sequencing technologies, coupled with statistical and computational tools, have fostered the development of methods for dating admixture events. These methods have merits and drawbacks in estimating admixture events in multi-way admixed populations. Here, we first provide a comprehensive review and comparison of current methods pertinent to dating admixture events. Second, we assess various admixture dating tools. We do so by performing various simulations. Third, we apply the top two assessed methods to real data of a uniquely admixed population from South Africa. Results reveal that current dating admixture models are not sufficiently equipped to estimate ancient admixtures events and to identify multi-faceted admixture events in complex multi-way admixed populations. We conclude with a discussion of research areas where further work on dating admixture-based methods is needed.
Affiliation(s)
- Emile R Chimusa
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Joel Defo
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Prisca K Thami
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa; Botswana Harvard AIDS Institute Partnership, Gaborone, Botswana; Department of Biological Sciences, University of Botswana, Gaborone, Botswana
- Denis Awany
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Delesa D Mulisa
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Imane Allali
- Division of Computational Biology, Department of Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Ahmed Moussa
- Abdelmalek Essaadi University ENSA, Tangier, Morocco
- Gaston K Mazandu
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa; Division of Computational Biology, Department of Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa; African Institute for Mathematical Sciences (AIMS), Muizenberg, Cape Town, South Africa
22
Ni X, Yuan K, Yang X, Feng Q, Guo W, Ma Z, Xu S. Inference of multiple-wave admixtures by length distribution of ancestral tracks. Heredity (Edinb) 2018; 121:52-63. [PMID: 29358727 PMCID: PMC5997750 DOI: 10.1038/s41437-017-0041-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/18/2017] [Revised: 11/23/2017] [Accepted: 11/24/2017] [Indexed: 12/31/2022] Open
Abstract
The ancestral tracks in admixed genomes are valuable for population history inference. While a few methods have been developed to infer admixture history based on ancestral tracks, these methods suffer the same flaw: only population admixture history under some specific models can be inferred. In addition, the inference of history might be biased or even unreliable if the specific model deviates from the real situation. To address this problem, we firstly proposed a general discrete admixture model to describe the admixture history with multiple ancestral populations and multiple-wave admixtures. We next deduced the length distribution of ancestral tracks under the general discrete admixture model. We further developed a new method, MultiWaver, to explore multiple-wave admixture histories. Our method could automatically determine an optimal admixture model based on the length distribution of ancestral tracks, and estimate the corresponding parameters under this optimal model. Specifically, we used a likelihood ratio test (LRT) to determine the number of admixture waves, and implemented an expectation-maximization (EM) algorithm to estimate parameters. We used simulation studies to validate the reliability and effectiveness of our method. Finally, good performance was observed when our method was applied to real data sets of African Americans and Mexicans, and new insights were gained into the admixture history of Uyghurs and Hazaras.
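Choosing the number of admixture waves with a likelihood ratio test can be sketched as a forward search: keep adding a wave while the gain in log-likelihood is significant. The sketch below is a generic illustration, not MultiWaver's implementation. It assumes that each extra wave adds two free parameters (so the statistic is compared with a chi-square distribution with 2 degrees of freedom) and uses invented log-likelihoods; the abstract does not describe how the test is calibrated.

```python
import math


def lrt_statistic(ll_null: float, ll_alt: float) -> float:
    """Likelihood ratio statistic 2 * (ll_alt - ll_null) for nested models."""
    return 2.0 * (ll_alt - ll_null)


def chi2_sf_df2(x: float) -> float:
    """Survival function of the chi-square distribution with 2 df, which is exp(-x / 2)."""
    return math.exp(-x / 2.0) if x > 0 else 1.0


def select_n_waves(log_likelihoods: list[float], alpha: float = 0.05) -> int:
    """log_likelihoods[i] is the maximized log-likelihood of the (i + 1)-wave model.

    Accept one more wave each time the test rejects the smaller model; stop otherwise.
    """
    n_waves = 1
    for i in range(1, len(log_likelihoods)):
        stat = lrt_statistic(log_likelihoods[i - 1], log_likelihoods[i])
        if chi2_sf_df2(stat) >= alpha:
            break  # the extra wave is not supported
        n_waves = i + 1
    return n_waves


# Hypothetical maximized log-likelihoods for 1-, 2- and 3-wave models.
chosen = select_n_waves([-2105.7, -2088.3, -2087.9])
print(f"selected number of waves: {chosen}")
```

For mixture-type models the chi-square reference distribution is only approximate, so the cutoff here should be read as illustrative rather than as a calibrated test.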
Affiliation(s)
- Xumin Ni
- Department of Mathematics, School of Science, Beijing Jiaotong University, Beijing, China
- Kai Yuan
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institutes for Biological Sciences, CAS, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Xiong Yang
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institutes for Biological Sciences, CAS, Shanghai, China
- Qidi Feng
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institutes for Biological Sciences, CAS, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Wei Guo
- Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Zhiming Ma
- Department of Mathematics, School of Science, Beijing Jiaotong University, Beijing, China.
- Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.
- Shuhua Xu
- Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institutes for Biological Sciences, CAS, Shanghai, China.
- University of Chinese Academy of Sciences, Beijing, China.
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
- Collaborative Innovation Center of Genetics and Development, Shanghai, China.
23
Pugach I, Duggan AT, Merriwether DA, Friedlaender FR, Friedlaender JS, Stoneking M. The Gateway from Near into Remote Oceania: New Insights from Genome-Wide Data. Mol Biol Evol 2018; 35:871-886. [PMID: 29301001 PMCID: PMC5889034 DOI: 10.1093/molbev/msx333] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 01/27/2023] Open
Abstract
A widely accepted two-wave scenario of human settlement of Oceania involves the first out-of-Africa migration circa 50,000 years ago (ya), and the more recent Austronesian expansion, which reached the Bismarck Archipelago by 3,450 ya. Whereas earlier genetic studies provided evidence for extensive sex-biased admixture between the incoming and the indigenous populations, some archaeological, linguistic, and genetic evidence indicates a more complicated picture of settlement. To study regional variation in Oceania in more detail, we have compiled a genome-wide data set of 823 individuals from 72 populations (including 50 populations from Oceania) and over 620,000 autosomal single nucleotide polymorphisms (SNPs). We show that the initial dispersal of people from the Bismarck Archipelago into Remote Oceania occurred in a "leapfrog" fashion, completely by-passing the main chain of the Solomon Islands, and that the colonization of the Solomon Islands proceeded in a bidirectional manner. Our results also support a divergence between western and eastern Solomons, in agreement with the sharp linguistic divide known as the Tryon-Hackman line. We also report substantial post-Austronesian gene flow across the Solomons. In particular, Santa Cruz (in Remote Oceania) exhibits extraordinarily high levels of Papuan ancestry that cannot be explained by a simple bottleneck/founder event scenario. Finally, we use simulations to show that discrepancies between different methods for dating admixture likely reflect different sensitivities of the methods to multiple admixture events from the same (or similar) sources. Overall, this study points to the importance of fine-scale sampling to understand the complexities of human population history.
Affiliation(s)
- Irina Pugach
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Ana T Duggan
  - Department of Anthropology, McMaster University, Hamilton, Canada
- Mark Stoneking
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
|
24
|
Investigating the origins of eastern Polynesians using genome-wide data from the Leeward Society Isles. Sci Rep 2018; 8:1823. [PMID: 29379068 PMCID: PMC5789021 DOI: 10.1038/s41598-018-20026-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/15/2017] [Accepted: 01/11/2018] [Indexed: 12/14/2022] Open
Abstract
The debate concerning the origin of the Polynesian speaking peoples has been recently reinvigorated by genetic evidence for secondary migrations to western Polynesia from the New Guinea region during the 2nd millennium BP. Using genome-wide autosomal data from the Leeward Society Islands, the ancient cultural hub of eastern Polynesia, we find that the inhabitants' genomes also demonstrate evidence of this episode of admixture, dating to 1,700-1,200 BP. This supports a late settlement chronology for eastern Polynesia, commencing ~1,000 BP, after the internal differentiation of Polynesian society. More than 70% of the autosomal ancestry of Leeward Society Islanders derives from Island Southeast Asia with the lowland populations of the Philippines as the single largest potential source. These long-distance migrants into Polynesia experienced additional admixture with northern Melanesians prior to the secondary migrations of the 2nd millennium BP. Moreover, the genetic diversity of mtDNA and Y chromosome lineages in the Leeward Society Islands is consistent with linguistic evidence for settlement of eastern Polynesia proceeding from the central northern Polynesian outliers in the Solomon Islands. These results stress the complex demographic history of the Leeward Society Islands and challenge phylogenetic models of cultural evolution predicated on eastern Polynesia being settled from Samoa.
|
25
|
|
26
|
Xue J, Lencz T, Darvasi A, Pe’er I, Carmi S. The time and place of European admixture in Ashkenazi Jewish history. PLoS Genet 2017; 13:e1006644. [PMID: 28376121 PMCID: PMC5380316 DOI: 10.1371/journal.pgen.1006644] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/25/2016] [Accepted: 02/18/2017] [Indexed: 12/21/2022] Open
Abstract
The Ashkenazi Jewish (AJ) population is important in genetics due to its high rate of Mendelian disorders. AJ appeared in Europe in the 10th century, and their ancestry is thought to comprise European (EU) and Middle-Eastern (ME) components. However, both the time and place of admixture are subject to debate. Here, we attempt to characterize the AJ admixture history using a careful application of new and existing methods on a large AJ sample. Our main approach was based on local ancestry inference, in which we first classified each AJ genomic segment as EU or ME, and then compared allele frequencies along the EU segments to those of different EU populations. The contribution of each EU source was also estimated using GLOBETROTTER and haplotype sharing. The time of admixture was inferred based on multiple statistics, including ME segment lengths, the total EU ancestry per chromosome, and the correlation of ancestries along the chromosome. The major source of EU ancestry in AJ was found to be Southern Europe (≈60–80% of EU ancestry), with the rest being likely Eastern European. The inferred admixture time was ≈30 generations ago, but multiple lines of evidence suggest that it represents an average over two or more events, pre- and post-dating the founder event experienced by AJ in late medieval times. The time of the pre-bottleneck admixture event, which was likely Southern European, was estimated at ≈25–50 generations ago.
The Ashkenazi Jewish population has resided in Europe for much of its 1000-year existence. However, its ethnic and geographic origins are controversial, due to the scarcity of reliable historical records. Previous genetic studies have found links to Middle-Eastern and European ancestries, but the admixture history has not been studied in detail yet, partly due to technical difficulties in disentangling signals from multiple admixture events. Here, we present an in-depth analysis of the sources of European gene flow and the time of admixture events by using multiple new and existing methods and extensive simulations. Our results suggest a model of at least two events of European admixture. One event slightly pre-dated a late medieval founder event and was likely from a Southern European source. Another event post-dated the founder event and likely occurred in Eastern Europe. These results, as well as the methods introduced, will be highly valuable for geneticists and other researchers interested in Ashkenazi Jewish origins.
Affiliation(s)
- James Xue
  - Department of Computer Science, Columbia University, New York, New York, United States of America
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Todd Lencz
  - Center for Psychiatric Neuroscience, The Feinstein Institute for Medical Research, North Shore-Long Island Jewish Health System, Manhasset, New York, United States of America
  - Department of Psychiatry, Division of Research, The Zucker Hillside Hospital Division of the North Shore–Long Island Jewish Health System, Glen Oaks, New York, United States of America
  - Departments of Psychiatry and Molecular Medicine, Hofstra Northwell School of Medicine, Hempstead, New York, United States of America
- Ariel Darvasi
  - Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Itsik Pe’er
  - Department of Computer Science, Columbia University, New York, New York, United States of America
  - Department of Systems Biology, Columbia University, New York, New York, United States of America
- Shai Carmi
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Ein Kerem, Jerusalem, Israel
|
27
|
Chen F, Dow M, Ding S, Lu Y, Jiang X, Tang H, Wang S. PREMIX: PRivacy-preserving EstiMation of Individual admiXture. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2017; 2016:1747-1755. [PMID: 28269933 PMCID: PMC5333197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
In this paper, we proposed PREMIX (PRivacy-preserving EstiMation of Individual admiXture), a framework built on Intel Software Guard Extensions (SGX). SGX is a suite of software and hardware architectures that enables efficient and secure computation over confidential data. PREMIX enables multiple sites to securely collaborate on estimating individual admixture within a secure enclave inside Intel SGX. We implemented a feature selection module to identify the most discriminative single nucleotide polymorphisms (SNPs) based on informativeness and an expectation-maximization (EM)-based maximum likelihood estimator to estimate individual admixture. Experimental results based on both simulated and 1000 Genomes data demonstrated the efficiency and accuracy of the proposed framework. PREMIX ensures a high level of security because all operations on sensitive genomic data are conducted within a secure enclave using SGX.
Affiliation(s)
- Feng Chen
  - Department of Biomedical Informatics, UC San Diego, La Jolla, CA
- Michelle Dow
  - Department of Biomedical Informatics, UC San Diego, La Jolla, CA
- Sijie Ding
  - Department of Electrical and Computer Engineering, UC San Diego, La Jolla, CA
- Yao Lu
  - Department of Electrical and Computer Engineering, UC San Diego, La Jolla, CA
- Xiaoqian Jiang
  - Department of Biomedical Informatics, UC San Diego, La Jolla, CA
- Hua Tang
  - Department of Genetics, Stanford University, Stanford, CA
- Shuang Wang
  - Department of Biomedical Informatics, UC San Diego, La Jolla, CA
|
28
|
Genomic insights into the peopling of the Southwest Pacific. Nature 2016; 538:510-513. [PMID: 27698418 PMCID: PMC5515717 DOI: 10.1038/nature19844] [Citation(s) in RCA: 135] [Impact Index Per Article: 16.9] [Received: 04/20/2016] [Accepted: 09/13/2016] [Indexed: 12/19/2022]
Abstract
The appearance of people associated with the Lapita culture in the South Pacific around 3,000 years ago marked the beginning of the last major human dispersal to unpopulated lands. However, the relationship of these pioneers to the long-established Papuan people of the New Guinea region is unclear. Here we present genome-wide ancient DNA data from three individuals from Vanuatu (about 3,100-2,700 years before present) and one from Tonga (about 2,700-2,300 years before present), and analyse them with data from 778 present-day East Asians and Oceanians. Today, indigenous people of the South Pacific harbour a mixture of ancestry from Papuans and a population of East Asian origin that no longer exists in unmixed form, but is a match to the ancient individuals. Most analyses have interpreted the minimum of twenty-five per cent Papuan ancestry in the region today as evidence that the first humans to reach Remote Oceania, including Polynesia, were derived from population mixtures near New Guinea, before their further expansion into Remote Oceania. However, our finding that the ancient individuals had little to no Papuan ancestry implies that later human population movements spread Papuan ancestry through the South Pacific after the first peopling of the islands.
|
29
|
New Software for the Fast Estimation of Population Recombination Rates (FastEPRR) in the Genomic Era. G3-GENES GENOMES GENETICS 2016; 6:1563-71. [PMID: 27172192 PMCID: PMC4889653 DOI: 10.1534/g3.116.028233] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Indexed: 11/21/2022]
Abstract
Genetic recombination is a very important evolutionary mechanism that mixes parental haplotypes and produces new raw material for organismal evolution. As a result, information on recombination rates is critical for biological research. In this paper, we introduce a new extremely fast open-source software package (FastEPRR) that uses machine learning to estimate recombination rate ρ (=4Ner) from intraspecific DNA polymorphism data. When ρ>10 and the number of sampled diploid individuals is large enough (≥50), the variance of ρFastEPRR remains slightly smaller than that of ρLDhat. The new estimate ρcomb (calculated by averaging ρFastEPRR and ρLDhat) has the smallest variance of all cases. When estimating ρFastEPRR, the finite-site model was employed to analyze cases with a high rate of recurrent mutations, and an additional method is proposed to consider the effect of variable recombination rates within windows. Simulations encompassing a wide range of parameters demonstrate that different evolutionary factors, such as demography and selection, may not increase the false positive rate of recombination hotspots. Overall, accuracy of FastEPRR is similar to the well-known method, LDhat, but requires far less computation time. Genetic maps for each human population (YRI, CEU, and CHB) extracted from the 1000 Genomes OMNI data set were obtained in less than 3 d using just a single CPU core. The Pearson Pairwise correlation coefficient between the ρFastEPRR and ρLDhat maps is very high, ranging between 0.929 and 0.987 at a 5-Mb scale. Considering that sample sizes for these kinds of data are increasing dramatically with advances in next-generation sequencing technologies, FastEPRR (freely available at http://www.picb.ac.cn/evolgen/) is expected to become a widely used tool for establishing genetic maps and studying recombination hotspots in the population genomic era.
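The two formulas stated in this abstract, ρ = 4Ner and ρcomb as the average of the FastEPRR and LDhat estimates, are small enough to write out. This is only arithmetic on hypothetical input values; it does not call FastEPRR or LDhat, and the function names are assumptions made for this sketch.

```python
def population_recombination_rate(effective_size, recombination_rate):
    """Return rho = 4 * N_e * r, with r the recombination rate per generation."""
    return 4.0 * effective_size * recombination_rate


def combined_estimate(rho_fasteprr, rho_ldhat):
    """Return rho_comb, the plain average of the two estimates."""
    return (rho_fasteprr + rho_ldhat) / 2.0


# Hypothetical inputs: N_e = 10,000 and r = 0.001 for the window of interest.
rho = population_recombination_rate(10_000, 0.001)
# Hypothetical estimates from the two methods for the same window.
rho_comb = combined_estimate(38.0, 42.0)
```

Averaging can reduce variance because Var((A + B)/2) = (Var A + Var B + 2 Cov(A, B))/4, which is below the smaller of the two variances whenever the estimates have similar variances and are not perfectly correlated.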
|
30
|
Pugach I, Matveev R, Spitsyn V, Makarov S, Novgorodov I, Osakovsky V, Stoneking M, Pakendorf B. The Complex Admixture History and Recent Southern Origins of Siberian Populations. Mol Biol Evol 2016; 33:1777-95. [PMID: 26993256 PMCID: PMC4915357 DOI: 10.1093/molbev/msw055] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Indexed: 12/22/2022] Open
Abstract
Although Siberia was inhabited by modern humans at an early stage, there is still debate over whether it remained habitable during the extreme cold of the Last Glacial Maximum or whether it was subsequently repopulated by peoples with recent shared ancestry. Previous studies of the genetic history of Siberian populations were hampered by the extensive admixture that appears to have taken place among these populations, because commonly used methods assume a tree-like population history and at most single admixture events. Here we analyze geogenetic maps and use other approaches to distinguish the effects of shared ancestry from prehistoric migrations and contact, and develop a new method based on the covariance of ancestry components, to investigate the potentially complex admixture history. We furthermore adapt a previously devised method of admixture dating for use with multiple events of gene flow, and apply these methods to whole-genome genotype data from over 500 individuals belonging to 20 different Siberian ethnolinguistic groups. The results of these analyses indicate that there have been multiple layers of admixture detectable in most of the Siberian populations, with considerable differences in the admixture histories of individual populations. Furthermore, most of the populations of Siberia included here, even those settled far to the north, appear to have a southern origin, with the northward expansions of different populations possibly being driven partly by the advent of pastoralism, especially reindeer domestication. These newly developed methods to analyze multiple admixture events should aid in the investigation of similarly complex population histories elsewhere.
Affiliation(s)
- Irina Pugach
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Rostislav Matveev
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
- Viktor Spitsyn
  - Research Centre for Medical Genetics, Federal State Budgetary Institution, Moscow, Russian Federation
- Sergey Makarov
  - Research Centre for Medical Genetics, Federal State Budgetary Institution, Moscow, Russian Federation
- Innokentiy Novgorodov
  - Institute of Foreign Philology and Regional Studies, North-Eastern Federal University, Yakutsk, Russian Federation
- Vladimir Osakovsky
  - Institute of Health, North-Eastern Federal University, Yakutsk, Russian Federation
- Mark Stoneking
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Brigitte Pakendorf
  - Laboratoire Dynamique du Langage, UMR5596, CNRS and Université Lyon Lumière 2, Lyon, France
|
31
|
Length Distribution of Ancestral Tracks under a General Admixture Model and Its Applications in Population History Inference. Sci Rep 2016; 6:20048. [PMID: 26818889 PMCID: PMC4730239 DOI: 10.1038/srep20048] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 10/06/2015] [Accepted: 12/23/2015] [Indexed: 11/08/2022] Open
Abstract
The length of ancestral tracks decays with the passing of generations, which can be used to infer population admixture histories. Previous studies have shown the power of length distributions of ancestral tracks in recovering the histories of admixed populations, even under simple models. We believe that the deduction of length distributions under a general model will greatly elevate the power. Here we first deduced the length distributions under a general model and proposed general principles in parameter estimation and model selection with the deduced length distributions. Next, we focused on studying the length distributions and their applications under three typical special cases. Extensive simulations showed that the length distributions of ancestral tracks were well predicted by our theoretical framework. We further developed a new method, AdmixInfer, based on the length distributions, and good performance was observed when it was applied to infer population histories under the three typical models. Notably, our method was insensitive to demographic history, sample size and threshold to discard short tracks. Finally, good performance was also observed when the method was applied to some real datasets of African Americans, Mexicans and South Asian populations from the HapMap project and the Human Genome Diversity Project.
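To make the link between track length and admixture time concrete, here is a minimal sketch under a much simpler setting than the general model of this paper. It assumes a single pulse of admixture and the common approximation that tracks from one ancestry are exponentially distributed with rate T × (1 − m) per Morgan, where T is the number of generations since admixture and m is that ancestry's proportion. This is not AdmixInfer; the function names and parameter values are hypothetical.

```python
import random


def simulate_track_lengths(generations, other_ancestry_fraction, n_tracks, seed=0):
    """Draw toy track lengths (in Morgans) from the assumed exponential model."""
    rng = random.Random(seed)
    rate = generations * other_ancestry_fraction  # assumed rate T * (1 - m)
    return [rng.expovariate(rate) for _ in range(n_tracks)]


def estimate_admixture_time(track_lengths, other_ancestry_fraction):
    """Estimate T from the mean track length under the same assumption.

    The maximum-likelihood rate of an exponential sample is 1 / mean,
    so T is recovered by dividing that rate by (1 - m).
    """
    if not track_lengths:
        raise ValueError("need at least one track length")
    if not 0.0 < other_ancestry_fraction <= 1.0:
        raise ValueError("other_ancestry_fraction must be in (0, 1]")
    mean_length = sum(track_lengths) / len(track_lengths)
    return (1.0 / mean_length) / other_ancestry_fraction
```

With tracks simulated for 10 generations and (1 − m) = 0.8, the estimate should land close to 10, and the shorter the mean track, the older the inferred admixture.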
|
32
|
Early Lapita skeletons from Vanuatu show Polynesian craniofacial shape: Implications for Remote Oceanic settlement and Lapita origins. Proc Natl Acad Sci U S A 2015; 113:292-7. [PMID: 26712019 DOI: 10.1073/pnas.1516186113] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022] Open
Abstract
With a cultural and linguistic origin in Island Southeast Asia, the Lapita expansion is thought to have led ultimately to the Polynesian settlement of the east Polynesian region after a time of mixing/integration in north Melanesia and a nearly 2,000-y pause in West Polynesia. One of the major achievements of recent Lapita research in Vanuatu has been the discovery of the oldest cemetery found so far in the Pacific at Teouma on the south coast of Efate Island, opening up new prospects for the biological definition of the early settlers of the archipelago and of Remote Oceania in general. Using craniometric evidence from the skeletons in conjunction with archaeological data, we discuss here four debated issues: the Lapita-Asian connection, the degree of admixture, the Lapita-Polynesian connection, and the question of secondary population movement into Remote Oceania.
|
33
|
Shriner D. Mixed Ancestry and Disease Risk Transferability. CURRENT GENETIC MEDICINE REPORTS 2015. [DOI: 10.1007/s40142-015-0080-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
34
|
Wollstein A, Lao O. Detecting individual ancestry in the human genome. INVESTIGATIVE GENETICS 2015; 6:7. [PMID: 25937887 PMCID: PMC4416275 DOI: 10.1186/s13323-015-0019-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 10/13/2014] [Accepted: 01/12/2015] [Indexed: 01/26/2023]
Abstract
Detecting and quantifying the population substructure present in a sample of individuals are of main interest in the fields of genetic epidemiology, population genetics, and forensics among others. To date, several algorithms have been proposed for estimating the amount of genetic ancestry within an individual. In the present review, we introduce the most widely used methods in population genetics for detecting individual genetic ancestry. We further show, by means of simulations, the performance of popular algorithms for detecting individual ancestry in various controlled demographic scenarios. Finally, we provide some hints on how to interpret the results from these algorithms.
Affiliation(s)
- Andreas Wollstein
  - Department of Forensic Molecular Biology, Erasmus MC University Medical Center Rotterdam, 3000 CA Rotterdam, The Netherlands
  - Section of Evolutionary Biology, Department of Biology II, University of Munich, 82152 Planegg-Martinsried, Germany
- Oscar Lao
  - Department of Forensic Molecular Biology, Erasmus MC University Medical Center Rotterdam, 3000 CA Rotterdam, The Netherlands
  - Current address: Centro Nacional de Análisis Genómico, Baldiri Reixac, 4, Barcelona Science Park - Tower I, 08028 Barcelona, Spain
|
35
|
Yunusbayev B, Metspalu M, Metspalu E, Valeev A, Litvinov S, Valiev R, Akhmetova V, Balanovska E, Balanovsky O, Turdikulova S, Dalimova D, Nymadawa P, Bahmanimehr A, Sahakyan H, Tambets K, Fedorova S, Barashkov N, Khidiyatova I, Mihailov E, Khusainova R, Damba L, Derenko M, Malyarchuk B, Osipova L, Voevoda M, Yepiskoposyan L, Kivisild T, Khusnutdinova E, Villems R. The genetic legacy of the expansion of Turkic-speaking nomads across Eurasia. PLoS Genet 2015; 11:e1005068. [PMID: 25898006 PMCID: PMC4405460 DOI: 10.1371/journal.pgen.1005068] [Citation(s) in RCA: 104] [Impact Index Per Article: 11.6] [Received: 11/28/2013] [Accepted: 02/11/2015] [Indexed: 12/28/2022] Open
Abstract
The Turkic peoples represent a diverse collection of ethnic groups defined by the Turkic languages. These groups have dispersed across a vast area, including Siberia, Northwest China, Central Asia, East Europe, the Caucasus, Anatolia, the Middle East, and Afghanistan. The origin and early dispersal history of the Turkic peoples is disputed, with candidates for their ancient homeland ranging from the Transcaspian steppe to Manchuria in Northeast Asia. Previous genetic studies have not identified a clear-cut unifying genetic signal for the Turkic peoples, which lends support for language replacement rather than demic diffusion as the model for the Turkic language’s expansion. We addressed the genetic origin of 373 individuals from 22 Turkic-speaking populations, representing their current geographic range, by analyzing genome-wide high-density genotype data. In agreement with the elite dominance model of language expansion most of the Turkic peoples studied genetically resemble their geographic neighbors. However, western Turkic peoples sampled across West Eurasia shared an excess of long chromosomal tracts that are identical by descent (IBD) with populations from present-day South Siberia and Mongolia (SSM), an area where historians center a series of early Turkic and non-Turkic steppe polities. While SSM matching IBD tracts (> 1cM) are also observed in non-Turkic populations, Turkic peoples demonstrate a higher percentage of such tracts (p-values ≤ 0.01) compared to their non-Turkic neighbors. Finally, we used the ALDER method and inferred admixture dates (~9th–17th centuries) that overlap with the Turkic migrations of the 5th–16th centuries. Thus, our results indicate historical admixture among Turkic peoples, and the recent shared ancestry with modern populations in SSM supports one of the hypothesized homelands for their nomadic Turkic and related Mongolic ancestors. 
Centuries of nomadic migrations have ultimately resulted in the distribution of Turkic languages over a large area ranging from Siberia, across Central Asia to Eastern Europe and the Middle East. Despite the profound cultural impact left by these nomadic peoples, little is known about their prehistoric origins. Moreover, because contemporary Turkic speakers tend to genetically resemble their geographic neighbors, it is not clear whether their nomadic ancestors left an identifiable genetic trace. In this study, we show that Turkic-speaking peoples sampled across the Middle East, Caucasus, East Europe, and Central Asia share varying proportions of Asian ancestry that originate in a single area, southern Siberia and Mongolia. Mongolic- and Turkic-speaking populations from this area bear an unusually high number of long chromosomal tracts that are identical by descent with Turkic peoples from across west Eurasia. Admixture induced linkage disequilibrium decay across chromosomes in these populations indicates that admixture occurred during the 9th–17th centuries, in agreement with the historically recorded Turkic nomadic migrations and later Mongol expansion. Thus, our findings reveal genetic traces of recent large-scale nomadic migrations and map their source to a previously hypothesized area of Mongolia and southern Siberia.
Collapse
Affiliation(s)
- Bayazit Yunusbayev
- Evolutionary Biology group, Estonian Biocentre, Tartu, Estonia
- Institute of Biochemistry and Genetics, Ufa Research Centre, RAS, Ufa, Bashkortostan, Russia
- * E-mail: ,
| | - Mait Metspalu
- Evolutionary Biology group, Estonian Biocentre, Tartu, Estonia
- Department of Evolutionary Biology, University of Tartu, Tartu, Estonia
- Department of Integrative Biology, University of California Berkeley, Berkeley, California, United States of America
| | - Ene Metspalu
- Department of Evolutionary Biology, University of Tartu, Tartu, Estonia
| | - Albert Valeev
- Institute of Biochemistry and Genetics, Ufa Research Centre, RAS, Ufa, Bashkortostan, Russia
| | - Sergei Litvinov
- Evolutionary Biology group, Estonian Biocentre, Tartu, Estonia
- Institute of Biochemistry and Genetics, Ufa Research Centre, RAS, Ufa, Bashkortostan, Russia
| | - Ruslan Valiev
- Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Bashkortostan, Russia
| | - Vita Akhmetova
- Institute of Biochemistry and Genetics, Ufa Research Centre, RAS, Ufa, Bashkortostan, Russia
| | | | - Oleg Balanovsky
- Research Centre for Medical Genetics, RAMS, Moscow, Russia
- Vavilov Institute for General Genetics, RAS, Moscow, Russia
| | - Shahlo Turdikulova
- Laboratory of Genomics, Institute of Bioorganic Chemistry, Academy of Sciences Republic of Uzbekistan, Tashkent, Uzbekistan
| | - Dilbar Dalimova
- Laboratory of Genomics, Institute of Bioorganic Chemistry, Academy of Sciences Republic of Uzbekistan, Tashkent, Uzbekistan
| | | | - Ardeshir Bahmanimehr
- Department of Medical Genetics, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Hovhannes Sahakyan
- Evolutionary Biology group, Estonian Biocentre, Tartu, Estonia
- Laboratory of Ethnogenomics, Institute of Molecular Biology, Academy of Sciences of Armenia, Yerevan, Armenia
- Sardana Fedorova
  - Laboratory of Molecular Genetics, Yakut Research Center of Complex Medical Problems, Yakutsk, Sakha Republic, Russia
  - Laboratory of Molecular Biology, North-Eastern Federal University, Yakutsk, Sakha Republic, Russia
- Nikolay Barashkov
  - Laboratory of Molecular Genetics, Yakut Research Center of Complex Medical Problems, Yakutsk, Sakha Republic, Russia
  - Laboratory of Molecular Biology, North-Eastern Federal University, Yakutsk, Sakha Republic, Russia
- Irina Khidiyatova
  - Institute of Biochemistry and Genetics, Ufa Research Centre, RAS, Ufa, Bashkortostan, Russia
  - Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Bashkortostan, Russia
- Evelin Mihailov
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
  - Gene Technology Workgroup, Estonian Biocentre, Tartu, Estonia
- Rita Khusainova
  - Institute of Biochemistry and Genetics, Ufa Research Centre, RAS, Ufa, Bashkortostan, Russia
  - Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Bashkortostan, Russia
- Larisa Damba
  - Institute of Internal Medicine, SB RAMS, Novosibirsk, Russia
- Ludmila Osipova
  - Institute of Cytology and Genetics, SB RAS, Novosibirsk, Russia
- Mikhail Voevoda
  - Institute of Internal Medicine, SB RAMS, Novosibirsk, Russia
  - Institute of Cytology and Genetics, SB RAS, Novosibirsk, Russia
- Levon Yepiskoposyan
  - Laboratory of Ethnogenomics, Institute of Molecular Biology, Academy of Sciences of Armenia, Yerevan, Armenia
- Toomas Kivisild
  - Division of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom
- Elza Khusnutdinova
  - Institute of Biochemistry and Genetics, Ufa Research Centre, RAS, Ufa, Bashkortostan, Russia
  - Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Bashkortostan, Russia
- Richard Villems
  - Evolutionary Biology group, Estonian Biocentre, Tartu, Estonia
  - Department of Evolutionary Biology, University of Tartu, Tartu, Estonia
  - Estonian Academy of Sciences, Tallinn, Estonia

36
Reconstructing Past Admixture Processes from Local Genomic Ancestry Using Wavelet Transformation. Genetics 2015; 200:469-81. [PMID: 25852078 PMCID: PMC4492373 DOI: 10.1534/genetics.115.176842] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 10/29/2014] [Accepted: 04/03/2015] [Indexed: 11/18/2022] Open
Abstract
Admixture between long-separated populations is a defining feature of the genomes of many species. The mosaic block structure of admixed genomes can provide information about past contact events, including the time and extent of admixture. Here, we describe an improved wavelet-based technique that better characterizes ancestry block structure from observed genomic patterns. Principal components analysis is first applied to genomic data to identify the primary population structure, followed by wavelet decomposition to develop a new characterization of local ancestry information along the chromosomes. For testing purposes, this method is applied to human genome-wide genotype data from Indonesia, as well as virtual genetic data generated using genome-scale sequential coalescent simulations under a wide range of admixture scenarios. Time of admixture is inferred using an approximate Bayesian computation framework, providing robust estimates of both admixture times and their associated levels of uncertainty. Crucially, we demonstrate that this revised wavelet approach, which we have released as the R package adwave, provides improved statistical power over existing wavelet-based techniques and can be used to address a broad range of admixture questions.

37
Pugach I, Stoneking M. Genome-wide insights into the genetic history of human populations. Investigative Genetics 2015; 6:6. [PMID: 25834724 PMCID: PMC4381409 DOI: 10.1186/s13323-015-0024-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 09/25/2014] [Accepted: 03/05/2015] [Indexed: 12/21/2022]
Abstract
Although mtDNA and the non-recombining Y chromosome (NRY) studies continue to provide valuable insights into the genetic history of human populations, recent technical, methodological and computational advances and the increasing availability of large-scale, genome-wide data from contemporary human populations around the world promise to reveal new aspects, resolve finer points, and provide a more detailed look at our past demographic history. Genome-wide data are particularly useful for inferring migrations, admixture, and fine structure, as well as for estimating population divergence and admixture times and fluctuations in effective population sizes. In this review, we highlight some of the stories that have emerged from the analyses of genome-wide SNP genotyping data concerning the human history of Southern Africa, India, Oceania, Island South East Asia, Europe and the Americas and comment on possible future study directions. We also discuss advantages and drawbacks of using SNP-arrays, with a particular focus on the ascertainment bias, and ways to circumvent it.
Affiliation(s)
- Irina Pugach
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D04103 Leipzig, Germany
- Mark Stoneking
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D04103 Leipzig, Germany

38
Fernandes V, Triska P, Pereira JB, Alshamali F, Rito T, Machado A, Fajkošová Z, Cavadas B, Černý V, Soares P, Richards MB, Pereira L. Genetic stratigraphy of key demographic events in Arabia. PLoS One 2015; 10:e0118625. [PMID: 25738654 PMCID: PMC4349752 DOI: 10.1371/journal.pone.0118625] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 08/01/2014] [Accepted: 01/21/2015] [Indexed: 01/01/2023] Open
Abstract
At the crossroads between Africa and Eurasia, Arabia is necessarily a melting pot, its peoples enriched by successive gene flow over the generations. Estimating the timing and impact of these multiple migrations is an important step in reconstructing the key demographic events in human history. However, current methods based on genome-wide information identify admixture events inefficiently, tending to estimate only the more recent ages, as here in the case of admixture events across the Red Sea (∼8–37 generations for African input into Arabia, and 30–90 generations for “back-to-Africa” migrations). An mtDNA-based founder analysis, corroborated by detailed analysis of the whole-mtDNA genome, affords an alternative means by which to identify, date and quantify multiple migration events at greater time depths, across the full range of modern human history, albeit for the maternal line of descent only. In Arabia, this approach enables us to infer several major pulses of dispersal between the Near East and Arabia, most likely via the Gulf corridor. Although some relict lineages survive in Arabia from the time of the out-of-Africa dispersal, 60 ka, the major episodes in the peopling of the Peninsula took place from north to south in the Late Glacial and, to a lesser extent, the immediate post-glacial/Neolithic. Exchanges across the Red Sea were mainly due to the Arab slave trade and maritime dominance (from ∼2.5 ka to very recent times), but had already begun by the early Holocene, fuelled by the establishment of maritime networks since ∼8 ka. The main “back-to-Africa” migrations, again undetected by genome-wide dating analyses, occurred in the Late Glacial period for introductions into eastern Africa, whilst the Neolithic was more significant for migrations towards North Africa.
Affiliation(s)
- Verónica Fernandes
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
  - School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
- Petr Triska
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
  - Instituto de Ciências Biomédicas da Universidade do Porto (ICBAS), Porto, Portugal
- Joana B. Pereira
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
  - School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
- Farida Alshamali
  - General Department of Forensic Sciences and Criminology, Dubai Police General Headquarters, Dubai, United Arab Emirates
- Teresa Rito
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
- Alison Machado
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
- Zuzana Fajkošová
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
  - Archaeogenetics Laboratory, Institute of Archaeology of the Academy of Sciences of the Czech Republic, Prague, Czech Republic
- Bruno Cavadas
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
- Viktor Černý
  - Archaeogenetics Laboratory, Institute of Archaeology of the Academy of Sciences of the Czech Republic, Prague, Czech Republic
- Pedro Soares
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
- Martin B. Richards
  - School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
  - Department of Biological Sciences, School of Applied Sciences, University of Huddersfield, Huddersfield, United Kingdom
- Luísa Pereira
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
  - Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), Porto, Portugal
  - Faculdade de Medicina da Universidade do Porto, Porto, Portugal

39
Saad MN, Mabrouk MS, Eldeib AM, Shaker OG. Identification of rheumatoid arthritis biomarkers based on single nucleotide polymorphisms and haplotype blocks: A systematic review and meta-analysis. J Adv Res 2015; 7:1-16. [PMID: 26843965 PMCID: PMC4703421 DOI: 10.1016/j.jare.2015.01.008] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 09/28/2014] [Revised: 01/13/2015] [Accepted: 01/20/2015] [Indexed: 12/30/2022] Open
Abstract
The genetics of autoimmune diseases is a growing domain in which biomarker results are progressing rapidly. The exact cause of rheumatoid arthritis (RA) is unknown, but it is thought to have both genetic and environmental bases. Genetic biomarkers could change the management of RA by allowing not only the detection of susceptible individuals, but also early diagnosis, evaluation of disease severity, selection of therapy, and monitoring of response to therapy. This review covers not only the genetic biomarkers of RA but also the methods of identifying them. Many of the known genetic biomarkers of RA were identified in populations of European and Asian ancestries, so the study of additional human populations may yield novel results. Most researchers in the field use single nucleotide polymorphism (SNP) approaches to express the significance of their results, although haplotype block methods are expected to play a complementary role in the future.
Affiliation(s)
- Mohamed N Saad
  - Biomedical Engineering Department, Faculty of Engineering, Misr University for Science and Technology, 6th of October City, Egypt
- Mai S Mabrouk
  - Biomedical Engineering Department, Faculty of Engineering, Misr University for Science and Technology, 6th of October City, Egypt
- Ayman M Eldeib
  - Systems and Biomedical Engineering Department, Faculty of Engineering, Cairo University, Giza, Egypt
- Olfat G Shaker
  - Medical Biochemistry and Molecular Biology Department, Faculty of Medicine, Cairo University, Cairo, Egypt

40
Qiu J, Zhu J, Fu F, Ye CY, Wang W, Mao L, Lin Z, Chen L, Zhang H, Guo L, Qiang S, Lu Y, Fan L. Genome re-sequencing suggested a weedy rice origin from domesticated indica-japonica hybridization: a case study from southern China. Planta 2014; 240:1353-1363. [PMID: 25187076 DOI: 10.1007/s00425-014-2159-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 06/06/2014] [Accepted: 08/16/2014] [Indexed: 06/03/2023]
Abstract
Whole-genome re-sequencing of weedy rice from southern China reveals that weedy rice can originate from hybridization of domesticated indica and japonica rice. Weedy rice (Oryza sativa f. spontanea Rosh.), which harbors phenotypes of both wild and domesticated rice, has become one of the most notorious weeds in rice fields worldwide. While its formation is poorly understood, massive amounts of rice genomic data may provide new insights into this issue. In this study, we determined genomes of three weedy rice samples from the lower Yangtze region, China, and investigated their phylogenetics, population structure and chromosomal admixture patterns. The phylogenetic tree and principal component analysis based on 46,005 SNPs with 126 other Oryza accessions suggested that the three weedy rice accessions were intermediate between japonica and indica rice. An ancestry inference study further demonstrated that weedy rice had two dominant genomic components (temperate japonica and indica). This strongly suggests that weedy rice originated from indica-japonica hybridization. Furthermore, 22,443 novel fixed single nucleotide polymorphisms were detected in the weedy genomes and could have been generated after indica-japonica hybridization for environmental adaptation.
Affiliation(s)
- Jie Qiu
  - Department of Agronomy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China

41
Duggan AT, Stoneking M. Recent developments in the genetic history of East Asia and Oceania. Curr Opin Genet Dev 2014; 29:9-14. [PMID: 25170982 DOI: 10.1016/j.gde.2014.06.010] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/19/2014] [Accepted: 06/30/2014] [Indexed: 01/11/2023]
Abstract
Recent developments in our understanding of the genetic history of Asia and Oceania have been driven by technological advances. Specifically, our understanding of the past has been augmented by: genome sequences from ancient hominins and ancient modern humans; more comprehensive studies of existing populations (e.g., complete mtDNA genome sequences and genome-wide data) and the development of new statistics and analytical methods to interpret the abundance of new data. We review some of the new discoveries since we entered the age of archaic and modern genomics and how they have changed our understanding of the settlement and subsequent population dynamics in Asia and the Pacific.
Affiliation(s)
- Ana T Duggan
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D04103 Leipzig, Germany
- Mark Stoneking
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D04103 Leipzig, Germany

42
Hellenthal G, Busby GB, Band G, Wilson JF, Capelli C, Falush D, Myers S. A genetic atlas of human admixture history. Science 2014; 343:747-751. [PMID: 24531965 PMCID: PMC4209567 DOI: 10.1126/science.1243518] [Citation(s) in RCA: 476] [Impact Index Per Article: 47.6] [Indexed: 12/20/2022]
Abstract
Modern genetic data combined with appropriate statistical methods have the potential to contribute substantially to our understanding of human history. We have developed an approach that exploits the genomic structure of admixed populations to date and characterize historical mixture events at fine scales. We used this to produce an atlas of worldwide human admixture history, constructed by using genetic data alone and encompassing over 100 events occurring over the past 4000 years. We identified events whose dates and participants suggest they describe genetic impacts of the Mongol empire, Arab slave trade, Bantu expansion, first millennium CE migrations in Eastern Europe, and European colonialism, as well as unrecorded events, revealing admixture to be an almost universal force shaping human populations.
Affiliation(s)
- Garrett Hellenthal
  - UCL Genetics Institute, University College London, Gower Street, London WC1E 6BT, UK
- George B.J. Busby
  - Department of Zoology, Oxford University, South Parks Road, Oxford OX1 3PS, UK
- Gavin Band
  - Wellcome Trust Centre for Human Genetics, Oxford University, Roosevelt Drive, Oxford OX3 7BN, UK
- James F. Wilson
  - Centre for Population Health Sciences, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, UK
- Cristian Capelli
  - Department of Zoology, Oxford University, South Parks Road, Oxford OX1 3PS, UK
- Daniel Falush
  - Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany
- Simon Myers
  - Wellcome Trust Centre for Human Genetics, Oxford University, Roosevelt Drive, Oxford OX3 7BN, UK
  - Department of Statistics, Oxford University, 1 South Parks Road, Oxford OX1 3TG, UK

43
Rogers MB, Downing T, Smith BA, Imamura H, Sanders M, Svobodova M, Volf P, Berriman M, Cotton JA, Smith DF. Genomic confirmation of hybridisation and recent inbreeding in a vector-isolated Leishmania population. PLoS Genet 2014; 10:e1004092. [PMID: 24453988 PMCID: PMC3894156 DOI: 10.1371/journal.pgen.1004092] [Citation(s) in RCA: 111] [Impact Index Per Article: 11.1] [Received: 05/23/2013] [Accepted: 11/20/2013] [Indexed: 12/02/2022] Open
Abstract
Although asexual reproduction via clonal propagation has been proposed as the principal reproductive mechanism across parasitic protozoa of the Leishmania genus, sexual recombination has long been suspected, based on hybrid marker profiles detected in field isolates from different geographical locations. The recent experimental demonstration of a sexual cycle in Leishmania within sand flies has confirmed the occurrence of hybridisation, but knowledge of the parasite life cycle in the wild still remains limited. Here, we use whole genome sequencing to investigate the frequency of sexual reproduction in Leishmania, by sequencing the genomes of 11 Leishmania infantum isolates from sand flies and 1 patient isolate in a focus of cutaneous leishmaniasis in the Çukurova province of southeast Turkey. This is the first genome-wide examination of a vector-isolated population of Leishmania parasites. A genome-wide pattern of patchy heterozygosity and SNP density was observed both within individual strains and across the whole group. Comparisons with other Leishmania donovani complex genome sequences suggest that these isolates are derived from a single cross of two diverse strains with subsequent recombination within the population. This interpretation is supported by a statistical model of the genomic variability for each strain compared to the L. infantum reference genome strain as well as genome-wide scans for recombination within the population. Further analysis of these heterozygous blocks indicates that the two parents were phylogenetically distinct. Patterns of linkage disequilibrium indicate that this population reproduced primarily clonally following the original hybridisation event, but that some recombination also occurred. This observation allowed us to estimate the relative rates of sexual and asexual reproduction within this population, to our knowledge the first quantitative estimate of these events during the Leishmania life cycle. 
Sexual reproduction is predicted to be a rare event in Leishmania parasites, as evidenced by detection of rare parasite hybrids in natural populations using molecular methods. Recently, a sexual cycle has been detected experimentally in parasites within the sand fly vector (that transmits this pathogenic microorganism to mammalian species including man, causing human leishmaniasis). In this study, we have used whole genome sequencing to investigate genetic variation at the highest level of resolution in Leishmania parasites isolated from sand flies in a defined focus of leishmaniasis in southeast Turkey. Using a range of analytical tools, we show that variation in these parasites arose following a single cross between two diverse strains and subsequent recombination between the progeny, despite mainly clonal reproduction in the parasite population. We have thus been able to derive quantitative estimates of the relative rates of sexual and asexual reproduction during the Leishmania life cycle for the first time, information that will be critical to our understanding of the epidemiology and evolution of this genus.
Affiliation(s)
- Matthew B. Rogers
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
  - Centre for Immunology and Infection, Department of Biology, University of York, York, United Kingdom
- Tim Downing
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Barbara A. Smith
  - Centre for Immunology and Infection, Department of Biology, University of York, York, United Kingdom
- Hideo Imamura
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
  - Unit of Molecular Parasitology, Department of Parasitology, Institute of Tropical Medicine, Antwerp, Belgium
- Mandy Sanders
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Milena Svobodova
  - Department of Parasitology, Fac. Sci., Charles University, Prague, Czech Republic
- Petr Volf
  - Department of Parasitology, Fac. Sci., Charles University, Prague, Czech Republic
- Matthew Berriman
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- James A. Cotton
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Deborah F. Smith
  - Centre for Immunology and Infection, Department of Biology, University of York, York, United Kingdom

44
Brisbin A, Bryc K, Byrnes J, Zakharia F, Omberg L, Degenhardt J, Reynolds A, Ostrer H, Mezey JG, Bustamante CD. PCAdmix: principal components-based assignment of ancestry along each chromosome in individuals with admixed ancestry from two or more populations. Hum Biol 2013; 84:343-64. [PMID: 23249312 DOI: 10.3378/027.084.0401] [Citation(s) in RCA: 121] [Impact Index Per Article: 11.0] [Indexed: 02/01/2023]
Abstract
Identifying ancestry along each chromosome in admixed individuals provides a wealth of information for understanding the population genetic history of admixture events and is valuable for admixture mapping and identifying recent targets of selection. We present PCAdmix (available at https://sites.google.com/site/pcadmix/home ), a Principal Components-based algorithm for determining ancestry along each chromosome from a high-density, genome-wide set of phased single-nucleotide polymorphism (SNP) genotypes of admixed individuals. We compare our method to HAPMIX on simulated data from two ancestral populations, and we find high concordance between the methods. Our method also has better accuracy than LAMP when applied to three-population admixture, a situation as yet unaddressed by HAPMIX. Finally, we apply our method to a data set of four Latino populations with European, African, and Native American ancestry. We find evidence of assortative mating in each of the four populations, and we identify regions of shared ancestry that may be recent targets of selection and could serve as candidate regions for admixture-based association mapping.
Affiliation(s)
- Abra Brisbin
  - Department of Biostatistics and Computational Biology, Cornell University, Ithaca, NY, USA

45
Jin W, Li R, Zhou Y, Xu S. Distribution of ancestral chromosomal segments in admixed genomes and its implications for inferring population history and admixture mapping. Eur J Hum Genet 2013; 22:930-7. [PMID: 24253859 DOI: 10.1038/ejhg.2013.265] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 07/24/2013] [Revised: 10/01/2013] [Accepted: 10/10/2013] [Indexed: 12/28/2022] Open
Abstract
The ancestral chromosomal segments in admixed genomes are of significant importance for both population history inference and admixture mapping, because they essentially provide the basic information for tracking genetic events. However, the distributions of the lengths of ancestral chromosomal segments (LACS) under some admixture models remain poorly understood. Here we introduced a theoretical framework for the distribution of LACS in two representative admixture models, that is, the hybrid isolation (HI) model and the gradual admixture (GA) model. Although the distribution of LACS in the GA model differs from that in the HI model, we demonstrated that the mean LACS in the HI model is approximately half of that in the GA model if both admixture proportion and admixture time in the two models are identical. We showed that the theoretical framework greatly facilitated the inference and understanding of population admixture history by analyzing African-American and Mexican empirical data. In addition, we found the peak of association signatures in the HI model was much narrower and sharper than that in the GA model, indicating that the identification of putative causal alleles in the HI model is more efficient than that in the GA model. Thus admixture mapping with case-only data would be a reasonable and economical choice in the HI model due to the weak background noise. However, according to our previous studies, many populations are likely to be gradually admixed and have fairly high background linkage disequilibrium. Therefore, we suggest using a case-control approach rather than a case-only approach to conduct admixture mapping to retain statistical power in recently admixed populations.
Affiliation(s)
- Wenfei Jin
  - Max Planck Independent Research Group on Population Genomics, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Ran Li
  - Max Planck Independent Research Group on Population Genomics, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Ying Zhou
  - Max Planck Independent Research Group on Population Genomics, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Shuhua Xu
  - Max Planck Independent Research Group on Population Genomics, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China

46
Nguyen N, Vo A, Won KJ. A wavelet-based method to exploit epigenomic language in the regulatory region. Bioinformatics 2013; 30:908-14. [PMID: 24096080 DOI: 10.1093/bioinformatics/btt467] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/19/2022]
Abstract
MOTIVATION: Epigenetic landscapes in the regulatory regions reflect binding condition of transcription factors and their co-factors. Identifying epigenetic condition and its variation is important in understanding condition-specific gene regulation. Computational approaches to explore complex multi-dimensional landscapes are needed. RESULTS: To study epigenomic condition for gene regulation, we developed a method, AWNFR, to classify epigenomic landscapes based on the detected epigenomic landscapes. Assuming mixture of Gaussians for a nucleosome, the proposed method captures the shape of histone modification and identifies potential regulatory regions in the wavelet domain. For accuracy estimation as well as enhanced computational speed, we developed a novel algorithm based on down-sampling operation and footprint in wavelet. We showed the algorithmic advantages of AWNFR using the simulated data. AWNFR identified regulatory regions more effectively and accurately than the previous approaches with the epigenome data in mouse embryonic stem cells and human lung fibroblast cells (IMR90). Based on the detected epigenomic landscapes, AWNFR classified epigenomic status and studied epigenomic codes. We studied co-occurring histone marks and showed that AWNFR captures the epigenomic variation across time. AVAILABILITY AND IMPLEMENTATION: The source code and supplemental document of AWNFR are available at http://wonk.med.upenn.edu/AWNFR.
Affiliation(s)
- Nha Nguyen
  - Department of Genetics, Institute for Diabetes, Obesity and Metabolism, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA and Center for Neurosciences, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA

47
Reply to Price and Bird: No inconsistency between the date of gene flow from India and the Australian archaeological record. Proc Natl Acad Sci U S A 2013; 110:E2949. [PMID: 24073425 DOI: 10.1073/pnas.1307961110] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022] Open

48
Sousa V, Hey J. Understanding the origin of species with genome-scale data: modelling gene flow. Nat Rev Genet 2013; 14:404-14. [PMID: 23657479 DOI: 10.1038/nrg3446] [Citation(s) in RCA: 181] [Impact Index Per Article: 16.5] [Indexed: 12/16/2022]
Abstract
As it becomes easier to sequence multiple genomes from closely related species, evolutionary biologists working on speciation are struggling to get the most out of very large population genomic data sets. Such data hold the potential to resolve long-standing questions in evolutionary biology about the role of gene exchange in species formation. In principle, the new population genomic data can be used to disentangle the conflicting roles of natural selection and gene flow during the divergence process. However, there are great challenges in taking full advantage of such data, especially with regard to including recombination in genetic models of the divergence process. Current data, models, methods and the potential pitfalls in using them will be considered here.
Affiliation(s)
- Vitor Sousa
  - Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, New Jersey 08854, USA

49
Abstract
Long-range migrations and the resulting admixtures between populations have been important forces shaping human genetic diversity. Most existing methods for detecting and reconstructing historical admixture events are based on allele frequency divergences or patterns of ancestry segments in chromosomes of admixed individuals. An emerging new approach harnesses the exponential decay of admixture-induced linkage disequilibrium (LD) as a function of genetic distance. Here, we comprehensively develop LD-based inference into a versatile tool for investigating admixture. We present a new weighted LD statistic that can be used to infer mixture proportions as well as dates with fewer constraints on reference populations than previous methods. We define an LD-based three-population test for admixture and identify scenarios in which it can detect admixture events that previous formal tests cannot. We further show that we can uncover phylogenetic relationships among populations by comparing weighted LD curves obtained using a suite of references. Finally, we describe several improvements to the computation and fitting of weighted LD curves that greatly increase the robustness and speed of the calculations. We implement all of these advances in a software package, ALDER, which we validate in simulations and apply to test for admixture among all populations from the Human Genome Diversity Project (HGDP), highlighting insights into the admixture history of Central African Pygmies, Sardinians, and Japanese.
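The exponential decay of admixture LD with genetic distance, mentioned in the abstract above, can be illustrated with a toy calculation. This is a minimal sketch and not the ALDER implementation: it assumes an idealized single-pulse admixture model in which weighted LD at genetic distance d (in Morgans) follows a(d) = A·exp(−n·d), where n is the number of generations since admixture, and it recovers n from noise-free synthetic values with a log-linear least-squares fit. The function names and all numbers are hypothetical.

```python
import math

def simulate_ld_curve(amplitude, generations, distances):
    """Noise-free weighted LD under the assumed model A * exp(-n * d)."""
    return [amplitude * math.exp(-generations * d) for d in distances]

def fit_admixture_time(distances, ld_values):
    """Estimate n and A by ordinary least squares on log(LD) versus distance.

    log a(d) = log A - n * d, so the slope of the fitted line is -n.
    Requires strictly positive LD values (true here because there is no noise).
    """
    logs = [math.log(v) for v in ld_values]
    k = len(distances)
    mean_d = sum(distances) / k
    mean_l = sum(logs) / k
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_d
    return -slope, math.exp(intercept)

# Hypothetical inputs: distances from 0.5 cM to 20 cM, expressed in Morgans.
distances = [0.005 * i for i in range(1, 41)]
curve = simulate_ld_curve(amplitude=0.01, generations=100.0, distances=distances)
n_hat, a_hat = fit_admixture_time(distances, curve)
print(f"estimated generations since admixture: {n_hat:.2f}")
```

With real data the curve is noisy and may carry an additive offset, so a nonlinear fit would likely be needed; the log-linear shortcut applies only to this idealized case.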
|
50
|
Pugach I, Delfin F, Gunnarsdóttir E, Kayser M, Stoneking M. Genome-wide data substantiate Holocene gene flow from India to Australia. Proc Natl Acad Sci U S A 2013; 110:1803-8. [PMID: 23319617 PMCID: PMC3562786 DOI: 10.1073/pnas.1211927110] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Indexed: 01/21/2023]
Abstract
The Australian continent holds some of the earliest archaeological evidence for the expansion of modern humans out of Africa, with initial occupation at least 40,000 y ago. It is commonly assumed that Australia remained largely isolated following initial colonization, but the genetic history of Australians has not been explored in detail to address this issue. Here, we analyze large-scale genotyping data from aboriginal Australians, New Guineans, island Southeast Asians and Indians. We find an ancient association between Australia, New Guinea, and the Mamanwa (a Negrito group from the Philippines), with divergence times for these groups estimated at 36,000 y ago, and supporting the view that these populations represent the descendants of an early "southern route" migration out of Africa, whereas other populations in the region arrived later by a separate dispersal. We also detect a signal indicative of substantial gene flow between the Indian populations and Australia well before European contact, contrary to the prevailing view that there was no contact between Australia and the rest of the world. We estimate this gene flow to have occurred during the Holocene, 4,230 y ago. This is also approximately when changes in tool technology, food processing, and the dingo appear in the Australian archaeological record, suggesting that these may be related to the migration from India.
Affiliation(s)
- Irina Pugach
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany.
|