1
Caliebe A, Tekola‐Ayele F, Darst BF, Wang X, Song YE, Gui J, Sebro RA, Balding DJ, Saad M, Dubé M. Including diverse and admixed populations in genetic epidemiology research. Genet Epidemiol 2022; 46:347-371. [PMID: 35842778 PMCID: PMC9452464 DOI: 10.1002/gepi.22492] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 03/08/2022] [Revised: 05/31/2022] [Accepted: 06/06/2022] [Indexed: 11/25/2022]
Abstract
The inclusion of ancestrally diverse participants in genetic studies can lead to new discoveries and is important to ensure equitable health care benefit from research advances. Here, members of the Ethical, Legal, Social, Implications (ELSI) committee of the International Genetic Epidemiology Society (IGES) offer perspectives on methods and analysis tools for the conduct of inclusive genetic epidemiology research, with a focus on admixed and ancestrally diverse populations in support of reproducible research practices. We emphasize the importance of distinguishing socially defined population categorizations from genetic ancestry in the design, analysis, reporting, and interpretation of genetic epidemiology research findings. Finally, we discuss the current state of genomic resources used in genetic association studies, functional interpretation, and clinical and public health translation of genomic findings with respect to diverse populations.
Affiliation(s)
- Amke Caliebe
  - Institute of Medical Informatics and Statistics, Kiel University and University Hospital Schleswig‐Holstein, Kiel, Germany
- Fasil Tekola‐Ayele
  - Epidemiology Branch, Division of Population Health Research, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, USA
- Burcu F. Darst
  - Center for Genetic Epidemiology, University of Southern California, Los Angeles, California, USA
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Xuexia Wang
  - Department of Mathematics, University of North Texas, Denton, Texas, USA
- Yeunjoo E. Song
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio, USA
- Jiang Gui
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, New Hampshire, USA
- David J. Balding
  - Melbourne Integrative Genomics, Schools of BioSciences and of Mathematics & Statistics, University of Melbourne, Melbourne, Australia
- Mohamad Saad
  - Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
  - Neuroscience Research Center, Faculty of Medical Sciences, Lebanese University, Beirut, Lebanon
- Marie‐Pierre Dubé
  - Department of Medicine, and Social and Preventive Medicine, Université de Montréal, Montréal, Québec, Canada
  - Beaulieu‐Saucier Pharmacogenomics Centre, Montreal Heart Institute, Montreal, Canada
2
Utsunomiya YT, Fortunato AAAD, Milanesi M, Trigo BB, Alves NF, Sonstegard TS, Garcia JF. Bos taurus haplotypes segregating in Nellore (Bos indicus) cattle. Anim Genet 2021; 53:58-67. [PMID: 34921423 DOI: 10.1111/age.13164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/10/2021] [Indexed: 11/29/2022]
Abstract
Brazil is the largest exporter of beef in the world, and most of that beef derives from Nellore cattle. Although considered a zebu breed (Bos indicus), the history of Nellore cattle in Brazil is marked by the importation of bulls from India, the use of a Creole taurine (Bos taurus) maternal lineage to quickly expand the herds and backcrossing to Nellore bulls to recover zebu ancestry. As a consequence, the current Brazilian Nellore population carries an average taurine ancestry of approximately 1%. Although that percentage seems small, some taurine variants deviate substantially from that average, with the better-known cases being the PLAG1-Q haplotype involved with body size variation and the Guarani (PG) polled variant producing hornless animals. Here, we report taurine haplotypes in 9074 Nellore animals genotyped for 539 657 imputed SNP markers. Apart from PLAG1-Q and PG, our analysis further revealed common taurine haplotypes (>3%) spanning genes related to immunity, growth, reproduction and hair and skin phenotypes. Using data from 22 economically important traits, we showed that many of the major QTL previously reported in the breed are at least partially driven by taurine haplotypes. As B. taurus and B. indicus haplotypes are highly divergent, presenting widely different sets of functional variants, our results provide promising targets for future scrutiny in Nellore cattle.
Affiliation(s)
- Y T Utsunomiya
  - Department of Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
  - International Atomic Energy Agency Collaborating Centre on Animal Genomics and Bioinformatics, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
  - AgroPartners Consulting. R. Floriano Peixoto, 120 - Sala 43A - Centro, Araçatuba, SP, 16010-220, Brazil
- A A A D Fortunato
  - Department of Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
  - International Atomic Energy Agency Collaborating Centre on Animal Genomics and Bioinformatics, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
  - Personal-PEC. R. Sebastião Lima, 1336 - Centro, Campo Grande, MS, 79004-600, Brazil
- M Milanesi
  - AgroPartners Consulting. R. Floriano Peixoto, 120 - Sala 43A - Centro, Araçatuba, SP, 16010-220, Brazil
  - Department for Innovation in Biological, Agro-Food and Forest Systems, Università Della Tuscia, Via S. Camillo de Lellis snc, Viterbo, 01100, Italy
- B B Trigo
  - Department of Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
  - International Atomic Energy Agency Collaborating Centre on Animal Genomics and Bioinformatics, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
- N F Alves
  - Department of Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
  - International Atomic Energy Agency Collaborating Centre on Animal Genomics and Bioinformatics, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
- J F Garcia
  - Department of Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
  - International Atomic Energy Agency Collaborating Centre on Animal Genomics and Bioinformatics, 16050-680 R. Clovis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
  - AgroPartners Consulting. R. Floriano Peixoto, 120 - Sala 43A - Centro, Araçatuba, SP, 16010-220, Brazil
  - Department of Preventive Veterinary Medicine and Animal Reproduction, School of Agricultural and Veterinarian Sciences, São Paulo State University, 14884-900 Via de Acesso Prof. Paulo Donato Castellane s/n, Jaboticabal, SP, Brazil
3
Gebrehiwot NZ, Aliloo H, Strucken EM, Marshall K, Al Kalaldeh M, Missohou A, Gibson JP. Inference of Ancestries and Heterozygosity Proportion and Genotype Imputation in West African Cattle Populations. Front Genet 2021; 12:584355. [PMID: 33841491 PMCID: PMC8025404 DOI: 10.3389/fgene.2021.584355] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/17/2020] [Accepted: 02/22/2021] [Indexed: 11/24/2022]
Abstract
Several studies have evaluated computational methods that infer the haplotypes from population genotype data in European cattle populations. However, little is known about how well they perform in African indigenous and crossbred populations. This study investigates: (1) global and local ancestry inference; (2) heterozygosity proportion estimation; and (3) genotype imputation in West African indigenous and crossbred cattle populations. Principal component analysis (PCA), ADMIXTURE, and LAMP-LD were used to analyse a medium-density single nucleotide polymorphism (SNP) dataset from Senegalese crossbred cattle. Reference SNP data of East and West African indigenous and crossbred cattle populations were used to investigate the accuracy of imputation from low to medium-density and from medium to high-density SNP datasets using Minimac v3. The first two principal components differentiated Bos indicus from European Bos taurus and African Bos taurus from other breeds. Irrespective of assuming two or three ancestral breeds for the Senegalese crossbreds, breed proportion estimates from ADMIXTURE and LAMP-LD showed a high correlation (r ≥ 0.981). The observed ancestral origin heterozygosity proportion in putative F1 crosses was close to the expected value of 1.0, and clearly differentiated F1 from all other crosses. The imputation accuracies (estimated as correlation) between imputed and the real data in crossbred animals ranged from 0.142 to 0.717 when imputing from low to medium-density, and from 0.478 to 0.899 for imputation from medium to high-density. The imputation accuracy was generally higher when the reference data came from the same geographical region as the target population, and when crossbred reference data was used to impute crossbred genotypes. The lowest imputation accuracies were observed for indigenous breed genotypes. 
This study shows that ancestral origin heterozygosity can be estimated with high accuracy and will be far superior to the use of observed individual heterozygosity for estimating heterosis in African crossbred populations. It was not possible to achieve high imputation accuracy in West African crossbred or indigenous populations based on reference data sets from East Africa, and population-specific genotyping with high-density SNP assays is required to improve imputation.
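The two quantities this abstract relies on can be illustrated with a minimal sketch. It assumes, since the abstract does not say, that imputation accuracy is a Pearson correlation between true genotypes and imputed dosages, and that ancestral origin heterozygosity is the fraction of loci at which an animal's two haplotypes are assigned different ancestries. The function names and the toy ancestry labels are hypothetical and are not taken from the study or from any of the tools it used.

```python
from math import sqrt


def pearson_r(true_genotypes, imputed_dosages):
    """Pearson correlation between true genotypes and imputed dosages.

    Assumption: accuracy is computed over one vector of genotype calls
    (e.g. 0/1/2 allele counts); the study may aggregate differently.
    """
    if len(true_genotypes) != len(imputed_dosages) or len(true_genotypes) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        # Correlation is undefined when either vector is constant.
        return float("nan")
    return cov / sqrt(var_t * var_d)


def ancestral_origin_heterozygosity(hap1_ancestry, hap2_ancestry):
    """Fraction of loci whose two haplotypes carry different ancestry labels.

    Assumption: local ancestry has already been called per locus for each
    haplotype (for example by a local ancestry tool), one label per locus.
    """
    if len(hap1_ancestry) != len(hap2_ancestry) or not hap1_ancestry:
        raise ValueError("haplotypes must be non-empty and of equal length")
    differing = sum(a != b for a, b in zip(hap1_ancestry, hap2_ancestry))
    return differing / len(hap1_ancestry)
```

Under these assumptions, an F1 cross inherits one haplotype from each parental breed at every locus, so `ancestral_origin_heterozygosity` returns the expected value of 1.0 quoted in the abstract, while a backcross returns a smaller proportion.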
Affiliation(s)
- Netsanet Z Gebrehiwot
  - Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia
- Hassan Aliloo
  - Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia
- Eva M Strucken
  - Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia
- Karen Marshall
  - International Livestock Research Institute and Centre for Tropical Livestock Genetics and Health, Nairobi, Kenya
- Mohammad Al Kalaldeh
  - Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia
- Ayao Missohou
  - L'École Inter-États des Sciences et Médecine Vétérinaires de Dakar (EISMV), Dakar, Senegal
- John P Gibson
  - Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW, Australia
4
Utsunomiya YT, Milanesi M, Barbato M, Utsunomiya ATH, Sölkner J, Ajmone‐Marsan P, Garcia JF. Unsupervised detection of ancestry tracks with the GHap R package. Methods Ecol Evol 2020. [DOI: 10.1111/2041-210x.13467] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Affiliation(s)
- Yuri Tani Utsunomiya
  - Department of Support, Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University (Unesp), Araçatuba/SP, Brazil
  - International Atomic Energy Agency (IAEA) Collaborating Centre on Animal Genomics and Bioinformatics, Araçatuba/SP, Brazil
- Marco Milanesi
  - Department of Support, Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University (Unesp), Araçatuba/SP, Brazil
  - International Atomic Energy Agency (IAEA) Collaborating Centre on Animal Genomics and Bioinformatics, Araçatuba/SP, Brazil
- Mario Barbato
  - Department of Animal Science, Food and Nutrition (DIANA) and Nutrigenomics and Proteomics Research Center, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Adam Taiti Harth Utsunomiya
  - Department of Support, Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University (Unesp), Araçatuba/SP, Brazil
  - International Atomic Energy Agency (IAEA) Collaborating Centre on Animal Genomics and Bioinformatics, Araçatuba/SP, Brazil
- Johann Sölkner
  - Division of Livestock Sciences, Department of Sustainable Agriculture System, BOKU (University of Natural Resources and Life Sciences), Vienna, Austria
- Paolo Ajmone‐Marsan
  - Department of Animal Science, Food and Nutrition (DIANA) and Nutrigenomics and Proteomics Research Center, Università Cattolica del Sacro Cuore, Piacenza, Italy
- José Fernando Garcia
  - Department of Support, Production and Animal Health, School of Veterinary Medicine of Araçatuba, São Paulo State University (Unesp), Araçatuba/SP, Brazil
  - International Atomic Energy Agency (IAEA) Collaborating Centre on Animal Genomics and Bioinformatics, Araçatuba/SP, Brazil
  - Department of Preventive Veterinary Medicine and Animal Reproduction, School of Agricultural and Veterinarian Sciences, São Paulo State University (Unesp), Jaboticabal/SP, Brazil
5
Geza E, Mugo J, Mulder NJ, Wonkam A, Chimusa ER, Mazandu GK. A comprehensive survey of models for dissecting local ancestry deconvolution in human genome. Brief Bioinform 2020; 20:1709-1724. [PMID: 30010715 DOI: 10.1093/bib/bby044] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 03/11/2018] [Revised: 04/16/2018] [Indexed: 11/14/2022]
Abstract
Over the past decade, studies of admixed populations have increasingly gained interest in both medical and population genetics. These studies have so far shed light on the patterns of genetic variation throughout modern human evolution and have improved our understanding of the demographics and adaptive processes of human populations. To date, there exist about 20 methods or tools to deconvolve local ancestry. These methods have merits and drawbacks in estimating local ancestry in multiway admixed populations. In this article, we survey existing ancestry deconvolution methods, with special emphasis on multiway admixture, and compare these methods based on simulation results reported by different studies, computational approaches used, including mathematical and statistical models, and biological challenges related to each method. This should orient users on the choice of an appropriate method or tool for given population admixture characteristics and update researchers on current advances, challenges and opportunities behind existing ancestry deconvolution methods.
Affiliation(s)
- Ephifania Geza
  - African Institute for Mathematical Sciences, Muizenberg, Cape Town 7945, South Africa
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Faculty of Health Sciences, IDM, University of Cape Town, Cape Town 7925, South Africa
- Jacquiline Mugo
  - African Institute for Mathematical Sciences, Muizenberg, Cape Town 7945, South Africa
- Nicola J Mulder
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Faculty of Health Sciences, IDM, University of Cape Town, Cape Town 7925, South Africa
- Ambroise Wonkam
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town 7925, South Africa
- Emile R Chimusa
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town 7925, South Africa
- Gaston K Mazandu
  - African Institute for Mathematical Sciences, Muizenberg, Cape Town 7945, South Africa
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Faculty of Health Sciences, IDM, University of Cape Town, Cape Town 7925, South Africa
  - Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town 7925, South Africa
6
Salter-Townshend M, Myers S. Fine-Scale Inference of Ancestry Segments Without Prior Knowledge of Admixing Groups. Genetics 2019; 212:869-889. [PMID: 31123038 PMCID: PMC6614886 DOI: 10.1534/genetics.119.302139] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 03/21/2019] [Accepted: 05/18/2019] [Indexed: 12/31/2022]
Abstract
We present an algorithm for inferring ancestry segments and characterizing admixture events, which involve an arbitrary number of genetically differentiated groups coming together. This allows inference of the demographic history of the species, properties of admixing groups, identification of signatures of natural selection, and may aid disease gene mapping. The algorithm employs nested hidden Markov models to obtain local ancestry estimation along the genome for each admixed individual. In a range of simulations, the accuracy of these estimates equals or exceeds leading existing methods. Moreover, and unlike these approaches, we do not require any prior knowledge of the relationship between subgroups of donor reference haplotypes and the unseen mixing ancestral populations. Our approach infers these in terms of conditional "copying probabilities." In application to the Human Genome Diversity Project, we corroborate many previously inferred admixture events (e.g., an ancient admixture event in the Kalash). We further identify novel events such as complex four-way admixture in San-Khomani individuals, and show that Eastern European populations possess [Formula: see text] ancestry from a group resembling modern-day central Asians. We also identify evidence of recent natural selection favoring sub-Saharan ancestry at the human leukocyte antigen (HLA) region, across North African individuals. We make available an R and C++ software library, which we term MOSAIC (which stands for MOSAIC Organizes Segments of Ancestry In Chromosomes).
Affiliation(s)
- Simon Myers
  - Dept. of Statistics, University of Oxford and Wellcome Trust Centre for Human Genetics, Oxford, UK
7
Qin H, Zhao J, Zhu X. Identifying Rare Variant Associations in Admixed Populations. Sci Rep 2019; 9:5458. [PMID: 30931973 PMCID: PMC6443736 DOI: 10.1038/s41598-019-41845-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/15/2018] [Accepted: 03/12/2019] [Indexed: 12/27/2022]
Abstract
An admixed population and its ancestral populations bear different burdens of a complex disease. The ancestral populations may have different haplotypes of deleterious alleles and thus ancestry-gene interaction can influence disease risk in the admixed population. Among admixed individuals, deleterious haplotypes and their ancestries are dependent and can provide non-redundant association information. Herein we propose a local ancestry boosted sum test (LABST) for identifying chromosomal blocks that harbor rare variants but have no ancestry switches. For such a stable ancestral block, our LABST exploits ancestry-gene interaction and the number of rare alleles therein. Under the null of no genetic association, the test statistic asymptotically follows a chi-square distribution with one degree of freedom (1-df). Our LABST properly controlled type I error rates under extensive simulations, suggesting that the asymptotic approximation was accurate for the null distribution of the test statistic. In terms of power for identifying rare variant associations, our LABST uniformly outperformed several famed methods under four important modes of disease genetics over a large range of relative risks. In conclusion, exploiting ancestry-gene interaction can boost statistical power for rare variant association mapping in admixed populations.
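The abstract states that the test statistic asymptotically follows a chi-square distribution with one degree of freedom under the null. As a minimal sketch of what that implies in practice, the snippet below converts any such statistic into a p value using the identity P(X > x) = erfc(sqrt(x / 2)) for a 1-df chi-square variable. The function name is hypothetical and the code does not implement LABST itself, whose statistic is defined only in the paper.

```python
from math import erfc, sqrt


def chi2_1df_pvalue(statistic):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom.

    A 1-df chi-square variable is the square of a standard normal Z, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return erfc(sqrt(statistic / 2.0))


if __name__ == "__main__":
    # 3.841... is the familiar 5% critical value for 1 df.
    print(chi2_1df_pvalue(3.841458820694124))
```

A closed form like this avoids a dependency on a statistics library, at the cost of covering only the 1-df case named in the abstract.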
Affiliation(s)
- Huaizhen Qin
  - Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, 32611, USA
  - Department of Global Biostatistics and Data Science, Tulane University School of Public Health and Tropical Medicine, 1440 Canal Street, New Orleans, LA, 70112, USA
- Jinying Zhao
  - Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, 32611, USA
- Xiaofeng Zhu
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, 10900 Euclid Avenue, Cleveland, Ohio, 44106, USA
8
Janzen GM, Wang L, Hufford MB. The extent of adaptive wild introgression in crops. The New Phytologist 2019; 221:1279-1288. [PMID: 30368812 DOI: 10.1111/nph.15457] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 07/08/2018] [Accepted: 08/24/2018] [Indexed: 05/05/2023]
Abstract
The study of crop evolution has focused primarily on the process of initial domestication. Post-domestication adaptation during the expansion of crops from their centers of origin has received considerably less attention. Recent research has revealed that, in at least some instances, crops have received introgression from their wild relatives that has facilitated adaptation to novel conditions encountered during expansion. Such adaptive introgression could have an important impact on the basic study of domestication, affecting estimates of several evolutionary processes of interest (e.g. the strength of the domestication bottleneck, the timing of domestication, the targets of selection during domestication). Identification of haplotypes introgressed from the wild may also help in the identification of alleles that are beneficial under particular environmental conditions. Here we review mounting evidence for substantial adaptive wild introgression in several crops and consider the implications of such gene flow to our understanding of crop histories.
Affiliation(s)
- Garrett M Janzen
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
- Li Wang
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
- Matthew B Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
9
Chimusa ER, Defo J, Thami PK, Awany D, Mulisa DD, Allali I, Ghazal H, Moussa A, Mazandu GK. Dating admixture events is unsolved problem in multi-way admixed populations. Brief Bioinform 2018; 21:144-155. [PMID: 30462157 DOI: 10.1093/bib/bby112] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/05/2018] [Revised: 10/12/2018] [Accepted: 10/15/2018] [Indexed: 12/12/2022]
Abstract
Advances in human sequencing technologies, coupled with statistical and computational tools, have fostered the development of methods for dating admixture events. These methods have merits and drawbacks in estimating admixture events in multi-way admixed populations. Here, we first provide a comprehensive review and comparison of current methods pertinent to dating admixture events. Second, we assess various admixture dating tools. We do so by performing various simulations. Third, we apply the top two assessed methods to real data of a uniquely admixed population from South Africa. Results reveal that current dating admixture models are not sufficiently equipped to estimate ancient admixture events and to identify multi-faceted admixture events in complex multi-way admixed populations. We conclude with a discussion of research areas where further work on dating admixture-based methods is needed.
Affiliation(s)
- Emile R Chimusa
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Joel Defo
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Prisca K Thami
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
  - Botswana Harvard AIDS Institute Partnership, Gaborone, Botswana
  - Department of Biological Sciences, University of Botswana, Gaborone, Botswana
- Denis Awany
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Delesa D Mulisa
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Imane Allali
  - Division of Computational Biology, Department of Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
- Ahmed Moussa
  - Abdelmalek Essaadi University ENSA, Tangier, Morocco
- Gaston K Mazandu
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
  - Division of Computational Biology, Department of Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, Cape Town, South Africa
  - African Institute for Mathematical Sciences (AIMS), Muizenberg, Cape Town, South Africa
10
Chang X, Pellegrino R, Garifallou J, March M, Snyder J, Mentch F, Li J, Hou C, Liu Y, Sleiman PMA, Hakonarson H. Common variants at 5q33.1 predispose to migraine in African-American children. J Med Genet 2018; 55:831-836. [PMID: 30266756 DOI: 10.1136/jmedgenet-2018-105359] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/28/2018] [Revised: 08/16/2018] [Accepted: 08/29/2018] [Indexed: 02/06/2023]
Abstract
BACKGROUND Genome-wide association studies (GWASs) have identified multiple susceptibility loci for migraine in European adults. However, no large-scale genetic studies have been performed in children or African Americans with migraine. METHODS We conducted a GWAS of 380 African-American children and 2129 ancestry-matched controls to identify variants associated with migraine. We then attempted to replicate our primary analysis in an independent cohort of 233 African-American patients and 4038 non-migraine control subjects. RESULTS The results of this study indicate that common variants at 5q33.1 are associated with migraine risk in African-American children (rs72793414, p=1.94×10⁻⁹). The association was validated in an independent study (p=3.87×10⁻³) for an overall meta-analysis p value of 3.81×10⁻¹⁰. Expression quantitative trait loci (eQTL) analysis of the Genotype-Tissue Expression data also shows that the genotypes of rs72793414 were strongly correlated with the mRNA expression levels of NMUR2 at 5q33.1. NMUR2 encodes a G protein-coupled receptor of neuromedin-U (NMU). NMU, a highly conserved neuropeptide, participates in diverse physiological processes of the central nervous system. CONCLUSIONS This study provides new insights into the genetic basis of childhood migraine and allows for precision therapeutic development strategies targeting migraine patients of African-American ancestry.
Affiliation(s)
- Xiao Chang
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Renata Pellegrino
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- James Garifallou
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Michael March
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- James Snyder
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Frank Mentch
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Jin Li
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Affiliated Cancer Hospital & Institute, Guangzhou Medical University, Guangzhou, China
- Cuiping Hou
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Yichuan Liu
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Patrick M A Sleiman
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Hakon Hakonarson
  - The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
11
Wangkumhang P, Hellenthal G. Statistical methods for detecting admixture. Curr Opin Genet Dev 2018; 53:121-127. [PMID: 30245220 DOI: 10.1016/j.gde.2018.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/01/2018] [Revised: 08/03/2018] [Accepted: 08/09/2018] [Indexed: 10/28/2022]
Abstract
The increasing availability of large-scale autosomal genetic variation data sampled from world-wide geographic areas, coupled with advances in the statistical methodology to analyse these data, is showcasing the power of DNA as a major tool to gain insights into the demographic history of humans and other organisms. Here we review statistical techniques that shed light on a specific aspect of demography: the detection and description of admixture events where two or more genetically distinct groups intermixed at one or more times in the past. In particular we give an overview of some of the widely used methods to identify and describe admixture events using autosomal DNA from unrelated individuals, with a particular focus on analysing biallelic Single-Nucleotide-Polymorphism (SNP) markers.
Collapse
Affiliation(s)
- Pongsakorn Wangkumhang
- University College London Genetics Institute (UGI), Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
| | - Garrett Hellenthal
- University College London Genetics Institute (UGI), Department of Genetics, Evolution and Environment, University College London, London, United Kingdom.
| |
|
12
|
|
13
|
Ravinet M, Faria R, Butlin RK, Galindo J, Bierne N, Rafajlović M, Noor MAF, Mehlig B, Westram AM. Interpreting the genomic landscape of speciation: a road map for finding barriers to gene flow. J Evol Biol 2017; 30:1450-1477. [DOI: 10.1111/jeb.13047] [Citation(s) in RCA: 306] [Impact Index Per Article: 43.7] [Received: 10/07/2016] [Revised: 01/31/2017] [Accepted: 02/01/2017] [Indexed: 12/14/2022]
Affiliation(s)
- M. Ravinet
- Centre for Ecological and Evolutionary Synthesis; University of Oslo; Oslo Norway
- National Institute of Genetics; Mishima Shizuoka Japan
| | - R. Faria
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos; InBIO, Laboratório Associado; Universidade do Porto; Vairão Portugal
- Department of Experimental and Health Sciences; IBE, Institute of Evolutionary Biology (CSIC-UPF); Pompeu Fabra University; Barcelona Spain
- Department of Animal and Plant Sciences; University of Sheffield; Sheffield UK
| | - R. K. Butlin
- Department of Animal and Plant Sciences; University of Sheffield; Sheffield UK
- Department of Marine Sciences; Centre for Marine Evolutionary Biology; University of Gothenburg; Gothenburg Sweden
| | - J. Galindo
- Department of Biochemistry, Genetics and Immunology; University of Vigo; Vigo Spain
| | - N. Bierne
- CNRS; Université Montpellier; ISEM; Station Marine Sète France
| | - M. Rafajlović
- Department of Physics; University of Gothenburg; Gothenburg Sweden
| | | | - B. Mehlig
- Department of Physics; University of Gothenburg; Gothenburg Sweden
| | - A. M. Westram
- Department of Animal and Plant Sciences; University of Sheffield; Sheffield UK
| |
|
14
|
Xue J, Lencz T, Darvasi A, Pe’er I, Carmi S. The time and place of European admixture in Ashkenazi Jewish history. PLoS Genet 2017; 13:e1006644. [PMID: 28376121 PMCID: PMC5380316 DOI: 10.1371/journal.pgen.1006644] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/25/2016] [Accepted: 02/18/2017] [Indexed: 12/21/2022] Open
Abstract
The Ashkenazi Jewish (AJ) population is important in genetics due to its high rate of Mendelian disorders. AJ appeared in Europe in the 10th century, and their ancestry is thought to comprise European (EU) and Middle-Eastern (ME) components. However, both the time and place of admixture are subject to debate. Here, we attempt to characterize the AJ admixture history using a careful application of new and existing methods on a large AJ sample. Our main approach was based on local ancestry inference, in which we first classified each AJ genomic segment as EU or ME, and then compared allele frequencies along the EU segments to those of different EU populations. The contribution of each EU source was also estimated using GLOBETROTTER and haplotype sharing. The time of admixture was inferred based on multiple statistics, including ME segment lengths, the total EU ancestry per chromosome, and the correlation of ancestries along the chromosome. The major source of EU ancestry in AJ was found to be Southern Europe (≈60–80% of EU ancestry), with the rest being likely Eastern European. The inferred admixture time was ≈30 generations ago, but multiple lines of evidence suggest that it represents an average over two or more events, pre- and post-dating the founder event experienced by AJ in late medieval times. The time of the pre-bottleneck admixture event, which was likely Southern European, was estimated to ≈25–50 generations ago.
The Ashkenazi Jewish population has resided in Europe for much of its 1000-year existence. However, its ethnic and geographic origins are controversial, due to the scarcity of reliable historical records. Previous genetic studies have found links to Middle-Eastern and European ancestries, but the admixture history has not been studied in detail yet, partly due to technical difficulties in disentangling signals from multiple admixture events. Here, we present an in-depth analysis of the sources of European gene flow and the time of admixture events by using multiple new and existing methods and extensive simulations. Our results suggest a model of at least two events of European admixture. One event slightly pre-dated a late medieval founder event and was likely from a Southern European source. Another event post-dated the founder event and likely occurred in Eastern Europe. These results, as well as the methods introduced, will be highly valuable for geneticists and other researchers interested in Ashkenazi Jewish origins.
Affiliation(s)
- James Xue
- Department of Computer Science, Columbia University, New York, New York, United States of America
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
| | - Todd Lencz
- Center for Psychiatric Neuroscience, The Feinstein Institute for Medical Research, North Shore-Long Island Jewish Health System, Manhasset, New York, United States of America
- Department of Psychiatry, Division of Research, The Zucker Hillside Hospital Division of the North Shore–Long Island Jewish Health System, Glen Oaks, New York, United States of America
- Departments of Psychiatry and Molecular Medicine, Hofstra Northwell School of Medicine, Hempstead, New York, United States of America
| | - Ariel Darvasi
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Itsik Pe’er
- Department of Computer Science, Columbia University, New York, New York, United States of America
- Department of Systems Biology, Columbia University, New York, New York, United States of America
| | - Shai Carmi
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Ein Kerem, Jerusalem, Israel
| |
|
15
|
Massey SE. Strong Amerindian Mitonuclear Discordance in Puerto Rican Genomes Suggests Amerindian Mitochondrial Benefit. Ann Hum Genet 2017; 81:59-77. [DOI: 10.1111/ahg.12185] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/25/2016] [Accepted: 01/06/2017] [Indexed: 12/24/2022]
Affiliation(s)
- Steven E. Massey
- Biology Department; University of Puerto Rico - Rio Piedras; PO Box 23360 San Juan Puerto Rico 00931
| |
|
16
|
Vandenplas J, Calus MPL, Sevillano CA, Windig JJ, Bastiaansen JWM. Assigning breed origin to alleles in crossbred animals. Genet Sel Evol 2016; 48:61. [PMID: 27549177 PMCID: PMC4994281 DOI: 10.1186/s12711-016-0240-y] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 12/21/2015] [Accepted: 08/10/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND For some species, animal production systems are based on the use of crossbreeding to take advantage of the increased performance of crossbred compared to purebred animals. Effects of single nucleotide polymorphisms (SNPs) may differ between purebred and crossbred animals for several reasons: (1) differences in linkage disequilibrium between SNP alleles and a quantitative trait locus; (2) differences in genetic backgrounds (e.g., dominance and epistatic interactions); and (3) differences in environmental conditions, which result in genotype-by-environment interactions. Thus, SNP effects may be breed-specific, which has led to the development of genomic evaluations for crossbred performance that take such effects into account. However, to estimate breed-specific effects, it is necessary to know breed origin of alleles in crossbred animals. Therefore, our aim was to develop an approach for assigning breed origin to alleles of crossbred animals (termed BOA) without information on pedigree and to study its accuracy by considering various factors, including distance between breeds. RESULTS The BOA approach consists of: (1) phasing genotypes of purebred and crossbred animals; (2) assigning breed origin to phased haplotypes; and (3) assigning breed origin to alleles of crossbred animals based on a library of assigned haplotypes, the breed composition of crossbred animals, and their SNP genotypes. The accuracy of allele assignments was determined for simulated datasets that include crosses between closely-related, distantly-related and unrelated breeds. Across these scenarios, the percentage of alleles of a crossbred animal that were correctly assigned to their breed origin was greater than 90 %, and increased with increasing distance between breeds, while the percentage of incorrectly assigned alleles was always less than 2 %. For the remaining alleles, i.e. 0 to 10 % of all alleles of a crossbred animal, breed origin could not be assigned. 
CONCLUSIONS The BOA approach accurately assigns breed origin to alleles of crossbred animals, even if their pedigree is not recorded.
Affiliation(s)
- Jérémie Vandenplas
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, 6700 AH, Wageningen, The Netherlands.
| | - Mario P L Calus
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, 6700 AH, Wageningen, The Netherlands
| | - Claudia A Sevillano
- Topigs Norsvin Research Center B.V., 6640 AA, Beuningen, The Netherlands; Animal Breeding and Genomics Centre, Wageningen University, 6700 AH, Wageningen, The Netherlands
| | - Jack J Windig
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, 6700 AH, Wageningen, The Netherlands
| | - John W M Bastiaansen
- Animal Breeding and Genomics Centre, Wageningen University, 6700 AH, Wageningen, The Netherlands
| |
|
17
|
Abstract
The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
|
18
|
Morozova I, Flegontov P, Mikheyev AS, Bruskin S, Asgharian H, Ponomarenko P, Klyuchnikov V, ArunKumar G, Prokhortchouk E, Gankin Y, Rogaev E, Nikolsky Y, Baranova A, Elhaik E, Tatarinova TV. Toward high-resolution population genomics using archaeological samples. DNA Res 2016; 23:295-310. [PMID: 27436340 PMCID: PMC4991838 DOI: 10.1093/dnares/dsw029] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 11/22/2015] [Accepted: 05/22/2016] [Indexed: 12/30/2022] Open
Abstract
The term ‘ancient DNA’ (aDNA) is coming of age, with over 1,200 hits in the PubMed database, beginning in the early 1980s with the studies of ‘molecular paleontology’. Rooted in cloning and limited sequencing of DNA from ancient remains during the pre-PCR era, the field has made incredible progress since the introduction of PCR and next-generation sequencing. Over the last decade, aDNA analysis ushered in a new era in genomics and became the method of choice for reconstructing the history of organisms, their biogeography, and migration routes, with applications in evolutionary biology, population genetics, archaeogenetics, paleo-epidemiology, and many other areas. This change was brought by development of new strategies for coping with the challenges in studying aDNA due to damage and fragmentation, scarce samples, significant historical gaps, and limited applicability of population genetics methods. In this review, we describe the state-of-the-art achievements in aDNA studies, with particular focus on human evolution and demographic history. We present the current experimental and theoretical procedures for handling and analysing highly degraded aDNA. We also review the challenges in the rapidly growing field of ancient epigenomics. Advancement of aDNA tools and methods signifies a new era in population genetics and evolutionary medicine research.
Affiliation(s)
- Irina Morozova
- Institute of Evolutionary Medicine, University of Zurich, Zurich, Switzerland
| | - Pavel Flegontov
- Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic; Bioinformatics Center, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russian Federation
| | - Alexander S Mikheyev
- Ecology and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
| | - Sergey Bruskin
- Vavilov Institute of General Genetics RAS, Moscow, Russia
| | - Hosseinali Asgharian
- Department of Computational and Molecular Biology, University of Southern California, Los Angeles, CA, USA
| | - Petr Ponomarenko
- Center for Personalized Medicine, Children's Hospital Los Angeles, Los Angeles, CA, USA; Spatial Sciences Institute, University of Southern California, Los Angeles, CA, USA
| | | | | | - Egor Prokhortchouk
- Research Center of Biotechnology RAS, Moscow, Russia; Department of Biology, Lomonosov Moscow State University, Russia
| | | | - Evgeny Rogaev
- Vavilov Institute of General Genetics RAS, Moscow, Russia; University of Massachusetts Medical School, Worcester, MA, USA
| | - Yuri Nikolsky
- Vavilov Institute of General Genetics RAS, Moscow, Russia; F1 Genomics, San Diego, CA, USA; School of Systems Biology, George Mason University, VA, USA
| | - Ancha Baranova
- School of Systems Biology, George Mason University, VA, USA; Research Centre for Medical Genetics, Moscow, Russia; Atlas Biomed Group, Moscow, Russia
| | - Eran Elhaik
- Department of Animal & Plant Sciences, University of Sheffield, Sheffield, South Yorkshire, UK
| | - Tatiana V Tatarinova
- Bioinformatics Center, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russian Federation; Center for Personalized Medicine, Children's Hospital Los Angeles, Los Angeles, CA, USA; Spatial Sciences Institute, University of Southern California, Los Angeles, CA, USA
| |
|
19
|
Zhou Q, Zhao L, Guan Y. Strong Selection at MHC in Mexicans since Admixture. PLoS Genet 2016; 12:e1005847. [PMID: 26863142 PMCID: PMC4749250 DOI: 10.1371/journal.pgen.1005847] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 07/06/2015] [Accepted: 01/14/2016] [Indexed: 11/19/2022] Open
Abstract
Mexicans are a recent admixture of Amerindians, Europeans, and Africans. We performed local ancestry analysis of Mexican samples from two genome-wide association studies obtained from dbGaP, and discovered that at the MHC region Mexicans have excessive African ancestral alleles compared to the rest of the genome, which is the hallmark of recent selection for admixed samples. The estimated selection coefficients are 0.05 and 0.07 for two datasets, which put our finding among the strongest known selections observed in humans, namely, lactase selection in northern Europeans and sickle-cell trait in Africans. Using inaccurate Amerindian training samples was a major concern for the credibility of previously reported selection signals in Latinos. Taking advantage of the flexibility of our statistical model, we devised a model fitting technique that can learn Amerindian ancestral haplotype from the admixed samples, which allows us to infer local ancestries for Mexicans using only European and African training samples. The strong selection signal at the MHC remains without Amerindian training samples. Finally, we note that medical history studies suggest such a strong selection at MHC is plausible in Mexicans.
Affiliation(s)
- Quan Zhou
- USDA/ARS Children’s Nutrition Research Center, Houston, Texas, United States of America
- Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
- Program of Structure and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Liang Zhao
- USDA/ARS Children’s Nutrition Research Center, Houston, Texas, United States of America
- Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Yongtao Guan
- USDA/ARS Children’s Nutrition Research Center, Houston, Texas, United States of America
- Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
- Program of Structure and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| |
|
20
|
Zheng X, Weir BS. Eigenanalysis of SNP data with an identity by descent interpretation. Theor Popul Biol 2015; 107:65-76. [PMID: 26482676 DOI: 10.1016/j.tpb.2015.09.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 02/10/2015] [Revised: 09/17/2015] [Accepted: 09/23/2015] [Indexed: 01/11/2023]
Abstract
Principal component analysis (PCA) is widely used in genome-wide association studies (GWAS), and the principal component axes often represent perpendicular gradients in geographic space. The explanation of PCA results is of major interest for geneticists to understand fundamental demographic parameters. Here, we provide an interpretation of PCA based on relatedness measures, which are described by the probability that sets of genes are identical-by-descent (IBD). An approximately linear transformation between ancestral proportions (AP) of individuals with multiple ancestries and their projections onto the principal components is found. In addition, a new method of eigenanalysis "EIGMIX" is proposed to estimate individual ancestries. EIGMIX is a method of moments with computational efficiency suitable for millions of SNP data, and it is not subject to the assumption of linkage equilibrium. With the assumptions of multiple ancestries and their surrogate ancestral samples, EIGMIX is able to infer ancestral proportions (APs) of individuals. The methods were applied to the SNP data from the HapMap Phase 3 project and the Human Genome Diversity Panel. The APs of individuals inferred by EIGMIX are consistent with the findings of the program ADMIXTURE. In conclusion, EIGMIX can be used to detect population structure and estimate genome-wide ancestral proportions with a relatively high accuracy.
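The approximately linear relationship between ancestral proportions and principal-component projections described in this abstract can be illustrated with a minimal two-way sketch. This is not the EIGMIX implementation: it assumes a single PC axis, two surrogate ancestral samples, and plain linear interpolation between their centroids. The function names and the PC1 values are hypothetical toy inputs.

```python
def centroid(values):
    """Mean PC coordinate of a surrogate ancestral sample."""
    if not values:
        raise ValueError("need at least one surrogate individual")
    return sum(values) / len(values)


def ancestry_proportion_from_pc(pc_value, centroid_a, centroid_b):
    """Estimate the proportion of ancestry A for one individual.

    Assumes (hypothetically) that an admixed individual's projection lies
    on the line between the two surrogate centroids, so the proportion is
    the relative position along that line.
    """
    if centroid_a == centroid_b:
        raise ValueError("surrogate centroids must differ")
    proportion_a = (pc_value - centroid_b) / (centroid_a - centroid_b)
    # Sampling noise can push estimates slightly outside [0, 1]; clip them.
    return min(1.0, max(0.0, proportion_a))


# Toy PC1 coordinates (illustrative values, not taken from any dataset).
surrogate_a_pc1 = [0.9, 1.1, 1.0]
surrogate_b_pc1 = [-1.0, -0.9, -1.1]
admixed_pc1 = [0.5, -0.5, 0.0]

centroid_a = centroid(surrogate_a_pc1)
centroid_b = centroid(surrogate_b_pc1)
proportions = [
    ancestry_proportion_from_pc(x, centroid_a, centroid_b) for x in admixed_pc1
]
print(proportions)
```

With more than two ancestries, the same idea would need one PC axis per additional ancestry and a small linear system instead of a single ratio.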
Affiliation(s)
- Xiuwen Zheng
- Department of Biostatistics, University of Washington, Box 359461, Seattle, WA 98195-9461, USA.
| | - Bruce S Weir
- Department of Biostatistics, University of Washington, Box 359461, Seattle, WA 98195-9461, USA.
| |
|
21
|
Mersha TB. Mapping asthma-associated variants in admixed populations. Front Genet 2015; 6:292. [PMID: 26483834 PMCID: PMC4586512 DOI: 10.3389/fgene.2015.00292] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 03/02/2015] [Accepted: 09/03/2015] [Indexed: 12/19/2022] Open
Abstract
Admixed populations arise when two or more previously isolated populations interbreed. Mapping asthma susceptibility loci in an admixed population using admixture mapping (AM) involves screening the genomes of individuals of mixed ancestry for chromosomal regions that have a higher frequency of alleles from a parental population with higher asthma risk as compared with a parental population with lower asthma risk. AM takes advantage of the admixture created in populations of mixed ancestry to identify genomic regions where an association exists between genetic ancestry and asthma (in contrast to an association between the genotype of the marker and asthma). The theory behind AM is that chromosomal segments of affected individuals contain a significantly higher-than-average proportion of alleles from the high-risk parental population and thus are more likely to harbor disease-associated loci. Criteria to evaluate the applicability of AM as a gene mapping approach include: (1) differences in the prevalence of the disease between the ancestral populations from which the admixed population was formed; (2) a measurable difference in disease-causing alleles between the parental populations; (3) reduced linkage disequilibrium (LD) between unlinked loci across chromosomes and strong LD between neighboring loci; (4) a set of markers with noticeable allele-frequency differences between the parental populations that contribute to the admixed population (single nucleotide polymorphisms (SNPs) are the markers of choice because they are abundant, stable, relatively cheap to genotype, and informative with regard to the LD structure of chromosomal segments); and (5) an understanding of the extent of segmental chromosomal admixtures and their interactions with environmental factors.
Although genome-wide association studies have contributed greatly to our understanding of the genetic components of asthma, the large and increasing degree of admixture in populations across the world creates many challenges for further efforts to map disease-causing genes. This review summarizes the historical context of admixed populations and AM, and considers current opportunities to use AM to map asthma genes. In addition, we provide an overview of the potential limitations and future directions of AM in biomedical research, including joint admixture and association mapping for asthma and asthma-related disorders.
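The core idea of AM stated in this abstract, that affected individuals carry more high-risk-population ancestry at a disease locus than they do on average across the genome, can be sketched as a simple case-only statistic. This is an illustrative assumption, not a method taken from the review: the function name, the one-sample z-style statistic, and the toy ancestry values are all hypothetical.

```python
import math


def case_only_ancestry_z(local_ancestry, global_ancestry):
    """Case-only admixture-mapping statistic at one locus.

    local_ancestry: per-case fraction of alleles at the locus inherited
        from the high-risk parental population (0.0, 0.5 or 1.0).
    global_ancestry: per-case genome-wide fraction from that population.
    Returns the mean excess of local over global ancestry divided by its
    standard error (a one-sample z-style statistic).
    """
    if len(local_ancestry) != len(global_ancestry):
        raise ValueError("inputs must have one value per case")
    n = len(local_ancestry)
    if n < 2:
        raise ValueError("need at least two cases")
    diffs = [loc - glob for loc, glob in zip(local_ancestry, global_ancestry)]
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("no variation in ancestry differences")
    return mean_diff / math.sqrt(variance / n)


# Toy data for six cases (illustrative values only).
local = [1.0, 1.0, 0.5, 1.0, 0.5, 1.0]
genome_wide = [0.6, 0.5, 0.4, 0.7, 0.5, 0.6]
z_score = case_only_ancestry_z(local, genome_wide)
print(round(z_score, 2))
```

A large positive value would flag the locus for follow-up; in a real scan the statistic would be computed at every locus and corrected for the number of effectively independent ancestry blocks.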
Affiliation(s)
- Tesfaye B Mersha
- Division of Asthma Research, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| |
|
22
|
Shriner D. Mixed Ancestry and Disease Risk Transferability. Curr Genet Med Rep 2015. [DOI: 10.1007/s40142-015-0080-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
23
|
Kozlov K, Chebotarev D, Hassan M, Triska M, Triska P, Flegontov P, Tatarinova TV. Differential Evolution approach to detect recent admixture. BMC Genomics 2015; 16 Suppl 8:S9. [PMID: 26111206 PMCID: PMC4480842 DOI: 10.1186/1471-2164-16-s8-s9] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 12/17/2022] Open
Abstract
The genetic structure of human populations is extraordinarily complex and of fundamental importance to studies of anthropology, evolution, and medicine. As increasingly many individuals are of mixed origin, there is an unmet need for tools that can infer multiple origins. Misclassification of such individuals can lead to incorrect and costly misinterpretations of genomic data, primarily in disease studies and drug trials. We present an advanced tool to infer ancestry that can identify the biogeographic origins of highly mixed individuals. reAdmix can incorporate individual's knowledge of ancestors (e.g. having some ancestors from Turkey or a Scottish grandmother). reAdmix is an online tool available at http://chcb.saban-chla.usc.edu/reAdmix/.
|
24
|
Leveraging ancestry to improve causal variant identification in exome sequencing for monogenic disorders. Eur J Hum Genet 2015; 24:113-9. [PMID: 25898925 DOI: 10.1038/ejhg.2015.68] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/31/2014] [Revised: 03/01/2015] [Accepted: 03/10/2015] [Indexed: 01/18/2023] Open
Abstract
Recent breakthroughs in exome-sequencing technology have made possible the identification of many causal variants of monogenic disorders. Although extremely powerful when closely related individuals (eg, child and parents) are simultaneously sequenced, sequencing of a single case is often unsuccessful due to the large number of variants that need to be followed up for functional validation. Many approaches filter out common variants above a given frequency threshold (eg, 1%), and then prioritize the remaining variants according to their functional, structural and conservation properties. Here we present methods that leverage the genetic structure across different populations to improve filtering performance while accounting for the finite sample size of the reference panels. We show that leveraging genetic structure reduces the number of variants that need to be followed up by 16% in simulations and by up to 38% in empirical data of 20 exomes from individuals with monogenic disorders for which the causal variants are known.
|
25
|
Genetic structure characterization of Chileans reflects historical immigration patterns. Nat Commun 2015; 6:6472. [PMID: 25778948 PMCID: PMC4382693 DOI: 10.1038/ncomms7472] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 07/25/2014] [Accepted: 01/30/2015] [Indexed: 12/25/2022] Open
Abstract
Identifying the ancestral components of genomes of admixed individuals helps uncover the genetic basis of diseases and understand the demographic history of populations. We estimate local ancestry on 313 Chileans and assess the contribution from three continental populations. The distribution of ancestry block-length suggests an average admixing time around 10 generations ago. Sex-chromosome analyses confirm imbalanced contribution of European men and Native-American women. Previously known genes under selection contain SNPs showing large differences in allele frequencies. Furthermore, we show that assessing ancestry is harder at SNPs with higher recombination rates and easier at SNPs with large differences in allele frequencies between the ancestral populations. Two observations, that African ancestry proportions systematically decrease from North to South, and that European ancestry proportions are highest in central regions, show that the genetic structure of Chileans is under the influence of a diffusion process leading to an ancestry gradient related to geography. Chileans are genetically admixed. Here, the authors find that the average admixing time is around 10 generations ago and show the contribution of European men and Native-American women to the Chilean population.
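How an ancestry block-length distribution translates into an admixture time can be sketched under a simplifying assumption that is not taken from this study: a single admixture pulse T generations ago, in which tracts from a source contributing proportion m have exponentially distributed lengths with rate of roughly (1 - m) * T per Morgan. The function name and the tract lengths below are hypothetical toy values.

```python
def estimate_admixture_time(tract_lengths_cm, ancestry_proportion):
    """Rough generations-since-admixture estimate from ancestry tracts.

    Assumes a single-pulse model where tracts from a source with
    proportion m have mean length 1 / ((1 - m) * T) Morgans, so
    T is approximately 1 / ((1 - m) * mean_length_in_morgans).
    """
    if not tract_lengths_cm:
        raise ValueError("need at least one tract length")
    if not 0.0 <= ancestry_proportion < 1.0:
        raise ValueError("ancestry proportion must be in [0, 1)")
    mean_cm = sum(tract_lengths_cm) / len(tract_lengths_cm)
    if mean_cm <= 0:
        raise ValueError("tract lengths must be positive")
    mean_morgans = mean_cm / 100.0  # 100 centimorgans per Morgan
    return 1.0 / ((1.0 - ancestry_proportion) * mean_morgans)


# Toy tract lengths in centimorgans (illustrative, not from the study).
toy_tracts_cm = [10.0, 20.0, 30.0]
generations = estimate_admixture_time(toy_tracts_cm, ancestry_proportion=0.5)
print(generations)
```

Continuous or repeated gene flow violates the single-pulse assumption, in which case an estimate of this kind is better read as an average over events.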
|
26
|
Johnson RC, Nelson GW, Zagury JF, Winkler CA. ALDsuite: Dense marker MALD using principal components of ancestral linkage disequilibrium. BMC Genet 2015; 16:23. [PMID: 25886794 PMCID: PMC4408589 DOI: 10.1186/s12863-015-0179-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/26/2014] [Accepted: 02/06/2015] [Indexed: 01/04/2023] Open
Abstract
Background: Mapping by admixture linkage disequilibrium (MALD) is a whole genome gene mapping method that uses LD from extended blocks of ancestry inherited from parental populations among admixed individuals to map associations for diseases that vary in prevalence among human populations. The extended LD queried for marker association with ancestry results in a greatly reduced number of comparisons compared to standard genome wide association studies. As ancestral population LD tends to confound the analysis of admixture LD, the earliest algorithms for MALD required marker sets sufficiently sparse to lack significant ancestral LD between markers. However, current genotyping technologies routinely provide dense SNP data, which convey more information than sparse sets, if this information can be efficiently used. There are currently no software solutions that offer both local ancestry inference using dense marker data and disease association statistics. Results: We present here an R package, ALDsuite, which accounts for local LD using principal components of haplotypes from surrogate ancestral population data, and includes tools for quality control of data, MALD, downstream analysis of results and visualization graphics. Conclusions: ALDsuite offers a fast, accurate estimation of global and local ancestry and comes bundled with the tools needed for MALD, from data quality control through mapping and visualization of disease genes.
Affiliation(s)
- Randall C Johnson
- BSP CCR Genetics Core, Leidos Biomedical Research, Inc, Frederick National Laboratory, Frederick, MD, 21702, USA; Chaire de Bioinformatique, Conservatoire National des Arts et Métiers, Paris, 75003, France.
| | - George W Nelson
- BSP CCR Genetics Core, Leidos Biomedical Research, Inc, Frederick National Laboratory, Frederick, MD, 21702, USA.
| | - Jean-Francois Zagury
- Chaire de Bioinformatique, Conservatoire National des Arts et Métiers, Paris, 75003, France.
| | - Cheryl A Winkler
- Basic Research Laboratory, Leidos Biomedical Research, Inc, Frederick National Laboratory, Frederick, MD, 21702, USA.
| |
|
27
|
Zhang X, Mu W, Liu C, Zhang W. Ancestry-informative markers for African Americans based on the Affymetrix Pan-African genotyping array. PeerJ 2014; 2:e660. [PMID: 25392759 PMCID: PMC4226639 DOI: 10.7717/peerj.660] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/23/2014] [Accepted: 10/18/2014] [Indexed: 12/20/2022] Open
Abstract
Genetic admixture has been utilized as a tool for identifying loci associated with complex traits and diseases in recently admixed populations such as African Americans. In particular, admixture mapping is an efficient approach to identifying genetic basis for those complex diseases with substantial racial or ethnic disparities. Though current advances in admixture mapping algorithms may utilize the entire panel of SNPs, providing ancestry-informative markers (AIMs) that can differentiate parental populations and estimate ancestry proportions in an admixed population may particularly benefit admixture mapping in studies of limited samples, help identify unsuitable individuals (e.g., through genotyping the most informative ancestry markers) before starting large genome-wide association studies (GWAS), or guide larger scale targeted deep re-sequencing for determining specific disease-causing variants. Defining panels of AIMs based on commercial, high-throughput genotyping platforms will facilitate the utilization of these platforms for simultaneous admixture mapping of complex traits and diseases, in addition to conventional GWAS. Here, we describe AIMs detected based on the Shannon Information Content (SIC) or Fst for African Americans with genome-wide coverage that were selected from ∼2.3 million single nucleotide polymorphisms (SNPs) covered by the Affymetrix Axiom Pan-African array, a newly developed genotyping platform optimized for individuals of African ancestry.
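Ranking candidate AIMs by Fst, one of the two criteria named in this abstract, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes two parental populations with equal weight, uses the heterozygosity-based definition Fst = (H_T - H_S) / H_T, and the SNP identifiers and allele frequencies are made up.

```python
def fst_two_populations(freq_pop1, freq_pop2):
    """Fst at one biallelic SNP from two population allele frequencies.

    Uses Fst = (H_T - H_S) / H_T with equal population weights, where
    H_T is the expected heterozygosity of the pooled population and
    H_S is the mean within-population expected heterozygosity.
    """
    pooled = (freq_pop1 + freq_pop2) / 2.0
    h_total = 2.0 * pooled * (1.0 - pooled)
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic in both populations
    h_within = (
        2.0 * freq_pop1 * (1.0 - freq_pop1)
        + 2.0 * freq_pop2 * (1.0 - freq_pop2)
    ) / 2.0
    return (h_total - h_within) / h_total


def rank_aims(allele_freqs, top_n):
    """Return the top_n (snp_id, fst) pairs, most informative first."""
    scored = [
        (snp_id, fst_two_populations(p1, p2))
        for snp_id, (p1, p2) in allele_freqs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical SNP ids with (population 1, population 2) allele frequencies.
toy_freqs = {
    "snp_a": (1.0, 0.0),
    "snp_b": (0.3, 0.3),
    "snp_c": (0.8, 0.2),
}
top_markers = rank_aims(toy_freqs, top_n=2)
print(top_markers)
```

In practice, selected markers would also be thinned by physical or genetic distance so that the panel covers the genome evenly rather than clustering in a few highly differentiated regions.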
Affiliation(s)
- Xu Zhang
- The Affiliated Hospital of Medical School, Ningbo University , Ningbo, Zhejiang Province , China ; Section of Hematology/Oncology, Department of Medicine, University of Illinois , Chicago, IL , USA
| | - Wenbo Mu
- Department of Pediatrics, University of Illinois , Chicago, IL , USA
| | - Cong Liu
- Department of Pediatrics, University of Illinois , Chicago, IL , USA
| | - Wei Zhang
- The Affiliated Hospital of Medical School, Ningbo University , Ningbo, Zhejiang Province , China ; Department of Pediatrics, University of Illinois , Chicago, IL , USA
|
28
|
Thornton TA, Bermejo JL. Local and global ancestry inference and applications to genetic association analysis for admixed populations. Genet Epidemiol 2014; 38 Suppl 1:S5-S12. [PMID: 25112189 DOI: 10.1002/gepi.21819] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 01/08/2023]
Abstract
Genetic association studies in recently admixed populations offer exciting opportunities to identify novel variants underlying phenotypic diversity. At the same time, genetic heterogeneity resulting from population admixture has to be accounted for to ensure validity of association tests. The whole-genome sequence data and the genome-wide single-nucleotide polymorphism chip data for Mexican American individuals provided by Genetic Analysis Workshop 18 (GAW18) present a unique opportunity to evaluate and compare methods for the statistical analysis of admixed genetic data. We summarize here the five contributions from the GAW18 working group on admixture mapping and adjusting for admixture. Although group members considered a variety of research topics, the general theme was inference and consideration of ancestry admixture in genetic analyses. The topics considered can be grouped into three categories: (1) global and local ancestry inference and estimation, (2) association and admixture mapping, and (3) genotype imputation in admixed samples. We describe the approaches that were used and the most relevant findings from each contribution. We also provide insight into the strengths and limitations of the state-of-the-art methods considered for genetic analyses in admixed populations.
Affiliation(s)
- Timothy A Thornton
- Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
|
29
|
Bhatia G, Tandon A, Patterson N, Aldrich MC, Ambrosone CB, Amos C, Bandera EV, Berndt SI, Bernstein L, Blot WJ, Bock CH, Caporaso N, Casey G, Deming SL, Diver WR, Gapstur SM, Gillanders EM, Harris CC, Henderson BE, Ingles SA, Isaacs W, De Jager PL, John EM, Kittles RA, Larkin E, McNeill LH, Millikan RC, Murphy A, Neslund-Dudas C, Nyante S, Press MF, Rodriguez-Gil JL, Rybicki BA, Schwartz AG, Signorello LB, Spitz M, Strom SS, Tucker MA, Wiencke JK, Witte JS, Wu X, Yamamura Y, Zanetti KA, Zheng W, Ziegler RG, Chanock SJ, Haiman CA, Reich D, Price AL. Genome-wide scan of 29,141 African Americans finds no evidence of directional selection since admixture. Am J Hum Genet 2014; 95:437-44. [PMID: 25242497 PMCID: PMC4185117 DOI: 10.1016/j.ajhg.2014.08.011] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 05/02/2014] [Accepted: 08/22/2014] [Indexed: 10/24/2022] Open
Abstract
The extent of recent selection in admixed populations is currently an unresolved question. We scanned the genomes of 29,141 African Americans and failed to find any genome-wide-significant deviations in local ancestry, indicating no evidence of selection influencing ancestry after admixture. A recent analysis of data from 1,890 African Americans reported that there was evidence of selection in African Americans after their ancestors left Africa, both before and after admixture. Selection after admixture was reported on the basis of deviations in local ancestry, and selection before admixture was reported on the basis of allele-frequency differences between African Americans and African populations. The local-ancestry deviations reported by the previous study did not replicate in our very large sample, and we show that such deviations were expected purely by chance, given the number of hypotheses tested. We further show that the previous study's conclusion of selection in African Americans before admixture is also subject to doubt. This is because the FST statistics they used were inflated and because true signals of unusual allele-frequency differences between African Americans and African populations would be best explained by selection that occurred in Africa prior to migration to the Americas.
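The idea of scanning for genome-wide-significant deviations in local ancestry can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it assumes the per-locus average local ancestry is approximately normal around the genome-wide mean, estimates the spread empirically across loci, and applies a Bonferroni correction; the function name and parameters are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def ancestry_deviation_scan(avg_local_ancestry, n_tests, alpha=0.05):
    """Flag loci whose average local ancestry deviates from the genome-wide mean.

    avg_local_ancestry: per-locus mean ancestry proportion across samples.
    n_tests: number of effectively independent tests (an assumption the
             analyst must supply) used for the Bonferroni correction.
    Returns (z_cutoff, [(locus_index, z_score), ...]).
    """
    mu = mean(avg_local_ancestry)
    sd = pstdev(avg_local_ancestry)
    # Two-sided cutoff after Bonferroni correction.
    z_cutoff = NormalDist().inv_cdf(1.0 - alpha / (2.0 * n_tests))
    hits = []
    for i, value in enumerate(avg_local_ancestry):
        z = (value - mu) / sd if sd > 0 else 0.0
        if abs(z) > z_cutoff:
            hits.append((i, z))
    return z_cutoff, hits


# Toy data: 99 loci at 0.80 average ancestry and one locus at 0.90.
cutoff, flagged = ancestry_deviation_scan([0.8] * 99 + [0.9], n_tests=100)
print(round(cutoff, 2), [index for index, _ in flagged])
```

The more hypotheses are tested, the higher the cutoff becomes, which is why deviations that look striking at a single locus can be expected by chance in a genome-wide scan.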
Affiliation(s)
- Gaurav Bhatia
- Division of Health, Science, and Technology, the Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA.
| | - Arti Tandon
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA; Harvard Medical School, New Research Building, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Nick Patterson
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
| | - Melinda C Aldrich
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Nashville, TN 37203, USA; Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37203, USA; Department of Thoracic Surgery, Vanderbilt University School of Medicine, Nashville, TN 37203, USA
| | - Christine B Ambrosone
- Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
| | - Christopher Amos
- Section of Biostatistics and Epidemiology, Community and Family Medicine, Geisel School of Medicine, Dartmouth College, Hanover, NH 03766, USA
| | - Elisa V Bandera
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08903, USA
| | - Sonja I Berndt
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Leslie Bernstein
- Division of Cancer Etiology, Department of Population Sciences, Beckman Research Institute, City of Hope, CA 91010, USA
| | - William J Blot
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Nashville, TN 37203, USA; Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37203, USA; International Epidemiology Institute, Rockville, MD 20850, USA
| | - Cathryn H Bock
- Karmanos Cancer Institute and Department of Oncology, Wayne State University of Medicine, Detroit, MI 48201, USA
| | - Neil Caporaso
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Graham Casey
- Departments of Preventive Medicine and Pathology, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA 90033, USA
| | - Sandra L Deming
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Nashville, TN 37203, USA; Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37203, USA
| | - W Ryan Diver
- Epidemiology Research Program, American Cancer Society, Atlanta, GA 30303, USA
| | - Susan M Gapstur
- Epidemiology Research Program, American Cancer Society, Atlanta, GA 30303, USA
| | - Elizabeth M Gillanders
- Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD 20892, USA
| | - Curtis C Harris
- Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA
| | - Brian E Henderson
- Departments of Preventive Medicine and Pathology, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA 90033, USA
| | - Sue A Ingles
- Departments of Preventive Medicine and Pathology, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA 90033, USA
| | - William Isaacs
- James Buchanan Brady Urological Institute, Johns Hopkins Hospital and Medical Institutions, Baltimore, MD 21287, USA
| | - Phillip L De Jager
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA; Harvard Medical School, New Research Building, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Program in Translational NeuroPsychiatric Genomics, Institute for the Neurosciences, Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
| | - Esther M John
- Cancer Prevention Institute of California, Fremont, CA 94538, USA; Stanford Cancer Center, Stanford Medicine, Stanford, CA 94305, USA
| | - Rick A Kittles
- Department of Medicine, University of Illinois at Chicago, Chicago, IL 60607, USA
| | - Emma Larkin
- Division of Allergy, Pulmonary, and Critical Care, Department of Medicine, Vanderbilt University Medical Center, 6100 Medical Center East, Nashville, TN 37232-8300, USA
| | - Lorna H McNeill
- Department of Health Disparities Research, Cancer Prevention and Population Sciences, the University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA; Center for Community Implementation and Dissemination Research, Duncan Family Institute, the University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA
| | - Robert C Millikan
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC 27599, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Adam Murphy
- Department of Urology, Northwestern University, Chicago, IL 60611, USA
| | - Sarah Nyante
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC 27599, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Michael F Press
- Departments of Preventive Medicine and Pathology, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA 90033, USA
| | - Jorge L Rodriguez-Gil
- Sylvester Comprehensive Cancer Center and Department of Epidemiology and Public Health, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Benjamin A Rybicki
- Department of Public Health Sciences, Henry Ford Hospital, Detroit, MI 48202, USA
| | - Ann G Schwartz
- Karmanos Cancer Institute and Department of Oncology, Wayne State University of Medicine, Detroit, MI 48201, USA
| | - Lisa B Signorello
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Nashville, TN 37203, USA; Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37203, USA; International Epidemiology Institute, Rockville, MD 20850, USA
| | - Margaret Spitz
- Section of Biostatistics and Epidemiology, Community and Family Medicine, Geisel School of Medicine, Dartmouth College, Hanover, NH 03766, USA
| | - Sara S Strom
- Department of Epidemiology, the University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA
| | - Margaret A Tucker
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - John K Wiencke
- University of California, San Francisco, San Francisco, CA 94158, USA
| | - John S Witte
- Departments of Epidemiology and Biostatistics and Urology, Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Xifeng Wu
- Section of Biostatistics and Epidemiology, Community and Family Medicine, Geisel School of Medicine, Dartmouth College, Hanover, NH 03766, USA
| | - Yuko Yamamura
- Department of Epidemiology, the University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA
| | - Krista A Zanetti
- Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD 20892, USA; Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Nashville, TN 37203, USA; Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37203, USA
| | - Regina G Ziegler
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Christopher A Haiman
- Departments of Preventive Medicine and Pathology, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA 90033, USA
| | - David Reich
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA; Harvard Medical School, New Research Building, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Alkes L Price
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA; Departments of Epidemiology and Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
|
30
|
Martin SH, Davey JW, Jiggins CD. Evaluating the use of ABBA-BABA statistics to locate introgressed loci. Mol Biol Evol 2014; 32:244-57. [PMID: 25246699 PMCID: PMC4271521 DOI: 10.1093/molbev/msu269] [Citation(s) in RCA: 391] [Impact Index Per Article: 39.1] [Indexed: 12/21/2022] Open
Abstract
Several methods have been proposed to test for introgression across genomes. One method tests for a genome-wide excess of shared derived alleles between taxa using Patterson's D statistic, but does not establish which loci show such an excess or whether the excess is due to introgression or ancestral population structure. Several recent studies have extended the use of D by applying the statistic to small genomic regions, rather than genome-wide. Here, we use simulations and whole-genome data from Heliconius butterflies to investigate the behavior of D in small genomic regions. We find that D is unreliable in this situation as it gives inflated values when effective population size is low, causing D outliers to cluster in genomic regions of reduced diversity. As an alternative, we propose a related statistic f_d, a modified version of a statistic originally developed to estimate the genome-wide fraction of admixture. f_d is not subject to the same biases as D, and is better at identifying introgressed loci. Finally, we show that both D and f_d outliers tend to cluster in regions of low absolute divergence (d_XY), which can confound a recently proposed test for differentiating introgression from shared ancestral variation at individual loci.
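For readers unfamiliar with these statistics, the sketch below computes Patterson's D and the f_d statistic from per-site derived-allele frequencies in three populations (P1, P2, P3). It assumes the outgroup is fixed for the ancestral allele at every site, and the function names are illustrative rather than taken from any published software. The f_d denominator replaces P2 and P3 with whichever of the two has the higher derived-allele frequency at each site.

```python
def patterson_d(sites):
    """Patterson's D from (p1, p2, p3) derived-allele frequencies per site.

    Assumes the outgroup carries only the ancestral allele.
    """
    numerator = denominator = 0.0
    for p1, p2, p3 in sites:
        abba = (1.0 - p1) * p2 * p3
        baba = p1 * (1.0 - p2) * p3
        numerator += abba - baba
        denominator += abba + baba
    return numerator / denominator if denominator else float("nan")


def f_d(sites):
    """f_d: observed ABBA-BABA excess relative to the excess expected under
    complete introgression, using the donor frequency max(p2, p3) per site.
    Interpretable mainly when D is positive.
    """
    numerator = denominator = 0.0
    for p1, p2, p3 in sites:
        numerator += (1.0 - p1) * p2 * p3 - p1 * (1.0 - p2) * p3
        pd = max(p2, p3)
        denominator += (1.0 - p1) * pd * pd - p1 * (1.0 - pd) * pd
    return numerator / denominator if denominator else float("nan")


window = [(0.0, 0.5, 1.0)]
print(patterson_d(window), f_d(window))
```

In the single-site window above, D reaches its maximum of 1 even though only half of the P2 alleles are shared with P3, whereas f_d returns 0.5; this is one way to see why D can be misleading in small regions.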
Affiliation(s)
- Simon H Martin
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom
| | - John W Davey
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom
| | - Chris D Jiggins
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom
|
31
|
Chen M, Yang C, Li C, Hou L, Chen X, Zhao H. Admixture mapping analysis in the context of GWAS with GAW18 data. BMC Proc 2014; 8:S3. [PMID: 25519317 PMCID: PMC4143627 DOI: 10.1186/1753-6561-8-s1-s3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022] Open
Abstract
Admixture mapping is a disease-mapping strategy to identify disease susceptibility variants in an admixed population that is a result of mating between 2 historically separated populations differing in allele frequencies and disease prevalence. With the increasing availability of high-density genotyping data generated in genome-wide association studies, it is of interest to investigate how to apply admixture mapping in the context of the genome-wide association studies and how to adjust for admixture in association tests. In this study, we first evaluated 3 different local ancestry inference methods, LAMP, LAMP-LD, and MULTIMIX. Then we applied admixture mapping analysis based on estimated local ancestry. Finally, we performed association tests with adjustment for local ancestry.
Affiliation(s)
- Mengjie Chen
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
| | - Can Yang
- Department of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT 06520, USA
| | - Cong Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
| | - Lin Hou
- Department of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT 06520, USA
| | - Xiaowei Chen
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
| | - Hongyu Zhao
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA ; Department of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT 06520, USA
|
32
|
Li J, Lao X, Zhang C, Tian L, Lu D, Xu S. Increased genetic diversity of ADME genes in African Americans compared with their putative ancestral source populations and implications for pharmacogenomics. BMC Genet 2014; 15:52. [PMID: 24884825 PMCID: PMC4021503 DOI: 10.1186/1471-2156-15-52] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 10/24/2013] [Accepted: 04/24/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND African Americans have been treated as a representative population for African ancestry for many purposes, including pharmacogenomic studies. However, the contribution of European ancestry is expected to result in considerable differences in the genetic architecture of African American individuals compared with an African genome. In particular, the genetic admixture influences the genomic diversity of drug metabolism-related genes, and may cause high heterogeneity of drug responses in admixed populations such as African Americans. RESULTS The genomic ancestry information of African-American (ASW) samples was obtained from data of the 1000 Genomes Project, and local ancestral components were also extracted for 32 core genes and 252 extended genes, which are associated with drug absorption, distribution, metabolism, and excretion (ADME). As expected, the global genetic diversity pattern in ASW was determined by the contributions of its putative ancestral source populations, and the whole profiles of ADME genes in ASW are much closer to those in YRI than in CEU. However, we observed much higher diversity in some functionally important ADME genes in ASW than in either CEU or YRI, which could be a result of either genetic drift or natural selection, and we identified some signatures of the latter. We analyzed the clinically relevant polymorphic alleles and haplotypes, and found that 28 functional mutations (including 3 missense, 3 splice, and 22 regulator sites) exhibited significantly higher differentiation between the three populations. CONCLUSIONS Analysis of the genetic diversity of ADME genes showed differentiation between the admixed population and its ancestral source populations. In particular, the difference in genetic diversity between ASW and YRI indicated that ethnic differences relevant to pharmacogenomic studies exist broadly, even though African ancestry is dominant in African Americans. 
This study should advance our understanding of the genetic basis of the drug response heterogeneity between populations, especially in the case of population admixture, and have significant implications for evaluating potential inter-population heterogeneity in drug treatment effects.
Affiliation(s)
- Shuhua Xu
- Max Planck Independent Research Group on Population Genomics, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.
|
33
|
Brown R, Pasaniuc B. Enhanced methods for local ancestry assignment in sequenced admixed individuals. PLoS Comput Biol 2014; 10:e1003555. [PMID: 24743331 PMCID: PMC3990492 DOI: 10.1371/journal.pcbi.1003555] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 07/23/2013] [Accepted: 02/10/2014] [Indexed: 01/22/2023] Open
Abstract
Inferring the ancestry at each locus in the genome of recently admixed individuals (e.g., Latino Americans) plays a major role in medical and population genetic inferences, ranging from finding disease-risk loci, to inferring recombination rates, to mapping missing contigs in the human genome. Although many methods for local ancestry inference have been proposed, most are designed for use with genotyping arrays and fail to make use of the full spectrum of data available from sequencing. In addition, current haplotype-based approaches are very computationally demanding, requiring long run times even for moderately large sample sizes. Here we present new methods for local ancestry inference that leverage continent-specific variants (CSVs) to attain increased performance over existing approaches in sequenced admixed genomes. A key feature of our approach is that it incorporates the admixed genomes themselves jointly with public datasets, such as 1000 Genomes, to improve the accuracy of CSV calling. We use simulations to show that our approach attains accuracy similar to widely used computationally intensive haplotype-based approaches with large decreases in runtime. Most importantly, we show that our method recovers local ancestries comparable to the 1000 Genomes consensus local ancestry calls in the real admixed individuals from the 1000 Genomes Project. We extend our approach to account for low-coverage sequencing and show that accurate local ancestry inference can be attained at low sequencing coverage. Finally, we generalize CSVs to sub-continental population-specific variants (sCSVs) and show that in some cases it is possible to determine the sub-continental ancestry for short chromosomal segments on the basis of sCSVs. Advances in sequencing technologies are dramatically changing the volume and type of data collected in genetic studies. 
Although most genetic studies so far have focused on individuals of European ancestry, recent studies are increasingly being performed in individuals of admixed ancestry (i.e., with recent ancestors from multiple continents, e.g., Latino Americans). A key component in such studies is the accurate inference of continental ancestry at each segment in the genome of these individuals. In this work we present accurate and robust methods that use continent-specific variants (i.e., genetic variants observed only in individuals of a given continent), now readily accessible through sequencing technology, to perform extremely fast and accurate inference of the ancestral origin of each genomic segment in recently admixed individuals.
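To make the notion of continent-specific variants concrete, the toy sketch below assigns each fixed-size genomic window the continent whose CSVs are observed most often on a haplotype. This majority vote is a deliberate simplification for illustration, not the inference method presented in the paper; the window size, the continent labels, and the function name are assumptions.

```python
from collections import Counter, defaultdict


def window_ancestry(csv_hits, window_size):
    """Majority-vote ancestry call per window.

    csv_hits: iterable of (position, continent) pairs, one per
              continent-specific variant carried by the haplotype.
    Returns {window_index: continent, or None when the top counts tie}.
    Windows without any CSV are simply absent from the result.
    """
    votes = defaultdict(Counter)
    for position, continent in csv_hits:
        votes[position // window_size][continent] += 1

    calls = {}
    for window, counter in votes.items():
        (top, top_count), *rest = counter.most_common(2)
        tied = bool(rest) and rest[0][1] == top_count
        calls[window] = None if tied else top
    return calls


hits = [(100, "AFR"), (250, "AFR"), (900, "EUR"),
        (1200, "EUR"), (1500, "EUR"), (1700, "AFR"),
        (2100, "AFR"), (2200, "EUR")]
print(window_ancestry(hits, window_size=1000))
```

A practical method would also have to handle genotyping error, CSVs that are not perfectly continent-specific, and windows that span an ancestry switch, none of which this sketch addresses.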
Affiliation(s)
- Robert Brown
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Pathology and Laboratory Medicine, Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
| | - Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Pathology and Laboratory Medicine, Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Jonsson Comprehensive Cancer Center, University of California Los Angeles, Los Angeles, California, United States of America
|
34
|
Abstract
We present a two-layer hidden Markov model to detect the structure of haplotypes for unrelated individuals. This allows us to model two scales of linkage disequilibrium (one within a group of haplotypes and one between groups), thereby taking advantage of rich haplotype information to infer local ancestry of admixed individuals. Our method outperforms competing state-of-the-art methods, particularly for regions of small ancestral track lengths. Applying our method to Mexican samples in HapMap3, we found two regions on chromosomes 6 and 8 that show significant departure of local ancestry from the genome-wide average. A software package implementing the methods described in this article is freely available at http://bcm.edu/cnrc/mcmcmc.
|
35
|
Origin of the PSEN1 E280A mutation causing early-onset Alzheimer's disease. Alzheimers Dement 2013; 10:S277-S283.e10. [PMID: 24239249 DOI: 10.1016/j.jalz.2013.09.005] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 07/29/2013] [Revised: 09/12/2013] [Accepted: 09/17/2013] [Indexed: 12/31/2022]
Abstract
BACKGROUND A mutation in presenilin 1 (E280A) causes early-onset Alzheimer's disease. Understanding the origin of this mutation will inform medical genetics. METHODS We sequenced the genomes of 102 individuals from Antioquia, Colombia. We applied identity-by-descent analysis to identify regions of common ancestry. We estimated the age of the E280A mutation and the local ancestry of the haplotype harboring this mutation. RESULTS All affected individuals share a minimal haplotype of 1.8 Mb containing E280A. We estimate a time to most recent common ancestor of E280A of 10 (95% credible interval, 7.2-12.6) generations. We date the de novo mutation event to 15 (95% credible interval, 11-25) generations ago. We infer a western European geographic origin of the shared haplotype. CONCLUSIONS The age and geographic origin of E280A are consistent with a single founder dating from the time of the Spanish Conquistadors who began colonizing Colombia during the early 16th century.
|
36
|
Chimusa ER, Zaitlen N, Daya M, Möller M, van Helden PD, Mulder NJ, Price AL, Hoal EG. Genome-wide association study of ancestry-specific TB risk in the South African Coloured population. Hum Mol Genet 2013; 23:796-809. [PMID: 24057671 DOI: 10.1093/hmg/ddt462] [Citation(s) in RCA: 118] [Impact Index Per Article: 10.7] [Indexed: 11/13/2022] Open
Abstract
The worldwide burden of tuberculosis (TB) remains an enormous problem, and is particularly severe in the admixed South African Coloured (SAC) population residing in the Western Cape. Despite evidence from twin studies suggesting a strong genetic component to TB resistance, only a few loci have been identified to date. In this work, we conduct a genome-wide association study (GWAS), meta-analysis and trans-ethnic fine mapping to attempt the replication of previously identified TB susceptibility loci. Our GWAS results confirm the WT1 chr11 susceptibility locus (rs2057178: odds ratio = 0.62, P = 2.71e-06) previously identified by Thye et al., but fail to replicate previously identified polymorphisms in the TLR8 gene and locus 18q11.2. Our study demonstrates that the genetic contribution to TB risk varies between continental populations, and illustrates the value of including admixed populations in studies of TB risk and other complex phenotypes. Our evaluation of local ancestry based on the real and simulated data demonstrates that case-only admixture mapping is currently impractical in multi-way admixed populations, such as the SAC, due to spurious deviations in average local ancestry generated by current local ancestry inference methods. This study provides insights into identifying disease genes and ancestry-specific disease risk in multi-way admixed populations.
Affiliation(s)
- Emile R Chimusa
- Department of Clinical Laboratory Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa
|
37
|
Chimusa ER, Daya M, Möller M, Ramesar R, Henn BM, van Helden PD, Mulder NJ, Hoal EG. Determining ancestry proportions in complex admixture scenarios in South Africa using a novel proxy ancestry selection method. PLoS One 2013; 8:e73971. [PMID: 24066090 PMCID: PMC3774743 DOI: 10.1371/journal.pone.0073971] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 03/20/2013] [Accepted: 07/25/2013] [Indexed: 02/03/2023] Open
Abstract
Admixed populations can make an important contribution to the discovery of disease susceptibility genes if the parental populations exhibit substantial variation in susceptibility. Admixture mapping has been used successfully, but is not designed to cope with populations that have more than two or three ancestral populations. The inference of admixture proportions and local ancestry and the imputation of missing genotypes in admixed populations are crucial in both understanding variation in disease and identifying novel disease loci. These inferences make use of reference populations, and accuracy depends on the choice of ancestral populations. Using an insufficient or inaccurate ancestral panel can result in erroneously inferred ancestry and affect the detection power of GWAS and meta-analysis when using imputation. Current algorithms are inadequate for multi-way admixed populations. To address these challenges we developed PROXYANC, an approach to select the best proxy ancestral populations. From the simulation of a multi-way admixed population we demonstrate the capability and accuracy of PROXYANC and illustrate the importance of the choice of ancestry in both estimating admixture proportions and imputing missing genotypes. We applied this approach to a complex, uniquely admixed South African population. Using genome-wide SNP data from over 764 individuals, we accurately estimate the genetic contributions from the best ancestral populations: isiXhosa [Formula: see text], ‡Khomani SAN [Formula: see text], European [Formula: see text], Indian [Formula: see text], and Chinese [Formula: see text]. We also demonstrate that the ancestral allele frequency differences correlate with increased linkage disequilibrium in the South African population, which originates from admixture events rather than population bottlenecks. 
NOMENCLATURE The collective term for people of mixed ancestry in southern Africa is "Coloured," and this is officially recognized in South Africa as a census term, and for self-classification. Whilst we acknowledge that some cultures may use this term in a derogatory manner, these connotations are not present in South Africa, and are certainly not intended here.
Affiliation(s)
- Emile R. Chimusa
  - Computational Biology Group, Department of Clinical Laboratory Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Medical School, Cape Town, South Africa
- Michelle Daya
  - MRC Centre for Molecular and Cellular Biology, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Health Sciences, Stellenbosch University, Tygerberg, South Africa
- Marlo Möller
  - MRC Centre for Molecular and Cellular Biology, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Health Sciences, Stellenbosch University, Tygerberg, South Africa
- Raj Ramesar
  - MRC Human Genetics Research Unit, Division of Human Genetics, Department of Clinical Laboratory Sciences, Institute for Infectious Diseases and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Brenna M. Henn
  - Department of Genetics, Stanford University, Stanford, California, United States of America
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York, United States of America
- Paul D. van Helden
  - MRC Centre for Molecular and Cellular Biology, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Health Sciences, Stellenbosch University, Tygerberg, South Africa
- Nicola J. Mulder
  - Computational Biology Group, Department of Clinical Laboratory Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Medical School, Cape Town, South Africa
- Eileen G. Hoal
  - MRC Centre for Molecular and Cellular Biology, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Health Sciences, Stellenbosch University, Tygerberg, South Africa
|
38
|
Genovese G, Handsaker R, Li H, Kenny E, McCarroll S. Mapping the human reference genome's missing sequence by three-way admixture in Latino genomes. Am J Hum Genet 2013; 93:411-21. [PMID: 23932108 DOI: 10.1016/j.ajhg.2013.07.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 04/16/2013] [Revised: 06/25/2013] [Accepted: 07/01/2013] [Indexed: 01/22/2023]
Abstract
A principal obstacle to completing maps and analyses of the human genome involves the genome's "inaccessible" regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37), a substantial fraction of the human genome's remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions.
|
39
|
Baran Y, Quintela I, Carracedo Á, Pasaniuc B, Halperin E. Enhanced localization of genetic samples through linkage-disequilibrium correction. Am J Hum Genet 2013; 92:882-94. [PMID: 23726367 DOI: 10.1016/j.ajhg.2013.04.023] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 01/29/2013] [Revised: 02/21/2013] [Accepted: 04/25/2013] [Indexed: 12/21/2022]
Abstract
Characterizing the spatial patterns of genetic diversity in human populations has a wide range of applications, from detecting genetic mutations associated with disease to inferring human history. Current approaches, including the widely used principal-component analysis, are not suited for the analysis of linked markers, and local and long-range linkage disequilibrium (LD) can dramatically reduce the accuracy of spatial localization when unaccounted for. To overcome this, we have introduced an approach that performs spatial localization of individuals on the basis of their genetic data and explicitly models LD among markers by using a multivariate normal distribution. By leveraging external reference panels, we derive closed-form solutions to the optimization procedure to achieve a computationally efficient method that can handle large data sets. We validate the method on empirical data from a large sample of European individuals from the POPRES data set, as well as on a large sample of individuals of Spanish ancestry. First, we show that by modeling LD, we achieve accuracy superior to that of existing methods. Importantly, whereas other methods show decreased performance when dense marker panels are used in the inference, our approach improves in accuracy as more markers become available. Second, we show that accurate localization of genetic data can be achieved with only a part of the genome, and this could potentially enable the spatial localization of admixed samples that have a fraction of their genome originating from a given continent. Finally, we demonstrate that our approach is resistant to distortions resulting from long-range LD regions; such distortions can dramatically bias the results when unaccounted for.
|