1
Zhang Y, Zhang H, Wu Y. A general approach for inferring the ancestry of recent ancestors of an admixed individual. Proc Natl Acad Sci U S A 2024; 121:e2316242120. [PMID: 38165936 PMCID: PMC10786287 DOI: 10.1073/pnas.2316242120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 11/27/2023] [Indexed: 01/04/2024]
Abstract
The genome of an individual from an admixed population consists of segments originated from different ancestral populations. Most existing ancestry inference approaches focus on calling these segments for the extant individual. In this paper, we present a general ancestry inference approach for inferring recent ancestors from an extant genome. Given the genome of an individual from a recently admixed population, our method can estimate the proportions of the genomes of the recent ancestors of this individual that originated from some ancestral populations. The key step of our method is the inference of ancestors (called founders) right after the formation of an admixed population. The inferred founders can then be used to infer the ancestry of recent ancestors of an extant individual. Our method is implemented in a computer program called PedMix2. To the best of our knowledge, there is no existing method that can practically infer ancestors beyond grandparents from an extant individual's genome. Results on both simulated and real data show that PedMix2 performs well in ancestry inference.
Affiliation(s)
- Yiming Zhang
  - School of Computing, College of Engineering, University of Connecticut, Storrs, CT 06269
- Haotian Zhang
  - School of Computing, College of Engineering, University of Connecticut, Storrs, CT 06269
- Yufeng Wu
  - School of Computing, College of Engineering, University of Connecticut, Storrs, CT 06269
2
Garcia-Erill G, Hanghøj K, Heller R, Wiuf C, Albrechtsen A. Estimating admixture pedigrees of recent hybrids without a contiguous reference genome. Mol Ecol Resour 2023; 23:1604-1619. [PMID: 37400991 DOI: 10.1111/1755-0998.13830] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/16/2023] [Revised: 05/30/2023] [Accepted: 06/15/2023] [Indexed: 07/05/2023]
Abstract
The genomes of recently admixed individuals or hybrids have characteristic genetic patterns that can be used to learn about their recent admixture history. Among these are patterns of interancestry heterozygosity, which can be inferred from SNP data from either called genotypes or genotype likelihoods, without the need for information on genomic location. This makes them applicable to a wide range of data that are often used in evolutionary and conservation genomic studies, such as low-depth sequencing mapped to scaffolds and reduced representation sequencing. Here we implement maximum likelihood estimation of interancestry heterozygosity patterns using two complementary models. We furthermore develop apoh (Admixture Pedigrees of Hybrids), a software tool that uses estimates of paired ancestry proportions to detect recently admixed individuals or hybrids, and to suggest possible admixture pedigrees. It also calculates several hybrid indices that make it easier to identify and rank possible admixture pedigrees that could give rise to the estimated patterns. We implemented apoh both as a command line tool and as a graphical user interface that allows the user to automatically and interactively explore, rank and visualize compatible recent admixture pedigrees, and calculate the different summary indices. We validate the performance of the method using admixed family trios from the 1000 Genomes Project. In addition, we show its applicability to identifying recent hybrids from RAD-seq data of Grant's gazelle (Nanger granti and Nanger petersii) and whole genome low-depth data of waterbuck (Kobus ellipsiprymnus), which shows complex admixture of up to four populations.
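To make the notion of paired ancestry proportions concrete, here is a minimal Python sketch. It is not apoh's implementation: it assumes, for illustration only, that at each locus the ancestry of the maternal and paternal allele is drawn independently from that parent's admixture proportions. The function names `paired_ancestry` and `interancestry_heterozygosity` are hypothetical.

```python
from itertools import combinations_with_replacement


def paired_ancestry(q_mother, q_father):
    """Expected proportion of loci carrying each unordered ancestry pair (i, j).

    Assumption (illustrative): the two alleles at a locus take their ancestry
    independently, one from each parent's admixture proportions.
    """
    k = len(q_mother)
    pairs = {}
    for i, j in combinations_with_replacement(range(k), 2):
        if i == j:
            # Both alleles come from the same ancestral population.
            pairs[(i, j)] = q_mother[i] * q_father[i]
        else:
            # Either parent may contribute either of the two ancestries.
            pairs[(i, j)] = q_mother[i] * q_father[j] + q_mother[j] * q_father[i]
    return pairs


def interancestry_heterozygosity(pairs):
    """Total proportion of loci whose two alleles differ in ancestry."""
    return sum(p for (i, j), p in pairs.items() if i != j)


# First-generation hybrid: one unadmixed parent from each population.
f1 = paired_ancestry([1.0, 0.0], [0.0, 1.0])
# Backcross: an F1 parent crossed with an unadmixed parent from population 0.
backcross = paired_ancestry([0.5, 0.5], [1.0, 0.0])
```

Under this simplified model an F1 hybrid is heterozygous for ancestry at every locus, while a backcross is heterozygous at half of them, which is the kind of contrast that lets paired ancestry proportions distinguish candidate pedigrees.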
Affiliation(s)
- Kristian Hanghøj
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Rasmus Heller
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Carsten Wiuf
  - Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark
3
Avadhanam S, Williams AL. Simultaneous inference of parental admixture proportions and admixture times from unphased local ancestry calls. Am J Hum Genet 2022; 109:1405-1420. [PMID: 35908549 PMCID: PMC9388397 DOI: 10.1016/j.ajhg.2022.06.016] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/05/2022] [Accepted: 06/24/2022] [Indexed: 02/06/2023]
Abstract
Population genetic analyses of local ancestry tracts routinely assume that the ancestral admixture process is identical for both parents of an individual, an assumption that may be invalid when considering recent admixture. Here, we present Parental Admixture Proportion Inference (PAPI), a Bayesian tool for inferring the admixture proportions and admixture times for each parent of a single admixed individual. PAPI analyzes unphased local ancestry tracts and has two components: a binomial model that leverages genome-wide ancestry fractions to infer parental admixture proportions and a hidden Markov model (HMM) that infers admixture times from tract lengths. Crucially, the HMM accounts for unobserved within-ancestry recombination by approximating the pedigree crossover dynamics, enabling inference of parental admixture times. In simulations, we find that PAPI's admixture proportion estimates deviate from the truth by 0.047 on average, outperforming ANCESTOR and PedMix by 46.0% and 57.6%, respectively. Moreover, PAPI's admixture time estimates were strongly correlated with the truth (R = 0.76) but have an average downward bias of 1.01 generations that is partly attributable to inaccuracies in local ancestry inference. As an illustration of its utility, we ran PAPI on African American genotypes from the PAGE study (N = 5,786) and found strong evidence of assortative mating by ancestry proportion: couples' ancestry proportions are highly correlated (R = 0.87) and are closer to each other than expected under random mating (p < 10⁻⁶). We anticipate that PAPI will be useful in studying the population dynamics of admixture and will also be of interest to individuals seeking to learn about their personal genealogies.
Affiliation(s)
- Siddharth Avadhanam
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Amy L Williams
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
4
Joint inference of ancestry and genotypes of parents from children. iScience 2022; 25:104768. [PMID: 35942102 PMCID: PMC9356179 DOI: 10.1016/j.isci.2022.104768] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/11/2022] [Revised: 05/18/2022] [Accepted: 07/11/2022] [Indexed: 12/02/2022]
Abstract
In this paper, we address the following problem: can we perform ancestry inference for parents from the DNA samples of one or more children? That is, suppose the parents’ genomes consist of segments of different ancestry; our goal is to infer parental ancestry and, at the same time, call parental genotypes from the children’s genetic data. Such ancestry inference may provide insights into recent ancestors from children’s genomes, and potentially has applications in understanding genetic traits. At present, there exists no method for this inference problem. We present parMix, a method based on a hidden Markov model (HMM) that can jointly infer parental ancestry and call parental genotypes from data of a small number of children. Simulation results show that parMix performs well in practice. It can provide reasonably accurate parental inference given data from a small number (say three) of children. parMix becomes more accurate when data from more children are used.
- Presented a method for inferring ancestry and genotypes of parents from children
- Recombination events can be detected using parMix
- parMix can deal with genotypes with phasing errors
- parMix can be used to infer admixture proportion of parents
5
Gopalan S, Smith SP, Korunes K, Hamid I, Ramachandran S, Goldberg A. Human genetic admixture through the lens of population genomics. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200410. [PMID: 35430881 PMCID: PMC9014191 DOI: 10.1098/rstb.2020.0410] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/10/2021] [Accepted: 03/24/2022] [Indexed: 12/13/2022]
Abstract
Over the past 50 years, geneticists have made great strides in understanding how our species' evolutionary history gave rise to current patterns of human genetic diversity classically summarized by Lewontin in his 1972 paper, 'The Apportionment of Human Diversity'. One evolutionary process that requires special attention in both population genetics and statistical genetics is admixture: gene flow between two or more previously separated source populations to form a new admixed population. The admixture process introduces ancestry-based structure into patterns of genetic variation within and between populations, which in turn influences the inference of demographic histories, identification of genetic targets of selection and prediction of complex traits. In this review, we outline some challenges for admixture population genetics, including limitations of applying methods designed for populations without recent admixture to the study of admixed populations. We highlight recent studies and methodological advances that aim to overcome such challenges, leveraging genomic signatures of admixture that occurred in the past tens of generations to gain insights into human history, natural selection and complex trait architecture. This article is part of the theme issue 'Celebrating 50 years since Lewontin's apportionment of human diversity'.
Affiliation(s)
- Shyamalika Gopalan
  - Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Samuel Pattillo Smith
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
  - Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912, USA
- Katharine Korunes
  - Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Iman Hamid
  - Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Sohini Ramachandran
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
  - Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912, USA
  - Data Science Initiative, Brown University, Providence, RI 02912, USA
- Amy Goldberg
  - Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
6
Pfaffelhuber P, Sester-Huss E, Baumdicker F, Naue J, Lutz-Bonengel S, Staubach F. Inference of recent admixture using genotype data. Forensic Sci Int Genet 2021; 56:102593. [PMID: 34735936 DOI: 10.1016/j.fsigen.2021.102593] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/16/2020] [Revised: 07/30/2021] [Accepted: 09/07/2021] [Indexed: 12/23/2022]
Abstract
The inference of biogeographic ancestry (BGA) has become a focus of forensic genetics. Misinference of BGA can have profound unwanted consequences for investigations and society. We show that recent admixture can lead to misclassification and erroneous inference of ancestry proportions, using state-of-the-art analysis tools with (i) simulations, (ii) 1000 Genomes Project data, and (iii) two individuals analyzed using the ForenSeq DNA Signature Prep Kit. Subsequently, we extend existing tools for estimation of individual ancestry (IA) by allowing for different IA in both parents, leading to estimates of parental individual ancestry (PIA), and a statistical test for recent admixture. Estimation of PIA outperforms IA in most scenarios of recent admixture. Furthermore, additional information about parental ancestry can be acquired with PIA that may guide casework.
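The difference between a single individual ancestry (IA) and separate parental ancestries (PIA) can be illustrated with a small genotype likelihood. The Python sketch below is not the authors' code. It assumes biallelic markers with known per-population allele frequencies, independent markers, and one allele drawn from each parent's ancestry mixture; the names `genotype_probs` and `log_likelihood` and all numeric values are hypothetical.

```python
import math


def genotype_probs(p_m, p_f):
    """Probabilities of carrying 0, 1 or 2 copies of the reference allele,
    given the allele frequency seen by the maternal (p_m) and paternal (p_f)
    allele."""
    return [
        (1 - p_m) * (1 - p_f),
        p_m * (1 - p_f) + (1 - p_m) * p_f,
        p_m * p_f,
    ]


def log_likelihood(genotypes, freqs, q_mother, q_father):
    """Log-likelihood of genotypes under parental ancestries q_mother, q_father.

    freqs[s][k] is the allele frequency at marker s in population k.
    Setting q_mother == q_father recovers the single-ancestry (IA) case.
    """
    total = 0.0
    for g, f in zip(genotypes, freqs):
        # Each parent's allele frequency is a mixture over populations.
        p_m = sum(q * fk for q, fk in zip(q_mother, f))
        p_f = sum(q * fk for q, fk in zip(q_father, f))
        total += math.log(genotype_probs(p_m, p_f)[g])
    return total


# Toy data: 20 highly differentiated markers, all heterozygous, as expected
# for a child of two unadmixed parents from different populations.
freqs = [(0.95, 0.05)] * 20
genotypes = [1] * 20

ll_ia = log_likelihood(genotypes, freqs, [0.5, 0.5], [0.5, 0.5])
ll_pia = log_likelihood(genotypes, freqs, [1.0, 0.0], [0.0, 1.0])
```

In this toy case the parental model explains the excess of heterozygous genotypes much better than a single 50/50 ancestry, which is the intuition behind using parental ancestries to flag recent admixture.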
Affiliation(s)
- Peter Pfaffelhuber
  - Institute for Mathematics, University of Freiburg, Ernst-Zermelo-Str. 1, 79104 Freiburg, Germany
- Elisabeth Sester-Huss
  - Institute for Mathematics, University of Freiburg, Ernst-Zermelo-Str. 1, 79104 Freiburg, Germany
- Franz Baumdicker
  - Cluster of Excellence CMFI, Mathematical and Computational Population Genetics, University of Tübingen, Sand 14, 72076 Tübingen, Germany
- Jana Naue
  - Institute of Forensic Medicine, Medical Center, Faculty of Medicine, University of Freiburg, Albertstraße 9, 79104 Freiburg, Germany
- Sabine Lutz-Bonengel
  - Institute of Forensic Medicine, Medical Center, Faculty of Medicine, University of Freiburg, Albertstraße 9, 79104 Freiburg, Germany
- Fabian Staubach
  - Biology I, Evolution & Ecology, University of Freiburg, Hauptstraße 1, 79104 Freiburg, Germany