1
Nievergelt CM, Maihofer AX, Shekhtman T, Libiger O, Wang X, Kidd KK, Kidd JR. Inference of human continental origin and admixture proportions using a highly discriminative ancestry informative 41-SNP panel. Investigative Genetics 2013; 4:13. [PMID: 23815888 PMCID: PMC3699392 DOI: 10.1186/2041-2223-4-13] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 11/19/2012] [Accepted: 05/14/2013] [Indexed: 02/07/2023]
Abstract
Background: Accurate determination of genetic ancestry is of high interest for many areas such as biomedical research, personal genomics and forensics. It remains an important topic in genetic association studies, as it has been shown that population stratification, if not appropriately considered, can lead to false-positive and false-negative results. While large association studies typically extract ancestry information from available genome-wide SNP genotypes, many important clinical data sets on rare phenotypes and historical collections assembled before the GWAS era are in need of a feasible method (i.e., ease of genotyping, small number of markers) to infer the geographic origin and potential admixture of the study subjects. Here we report on the development, application and limitations of a small, multiplexable ancestry informative marker (AIM) panel of SNPs (or AISNP) developed specifically for this purpose. Results: Based on worldwide populations from the HGDP, a 41-AIM AISNP panel for multiplex application with the ABI SNPlex and a subset with 31 AIMs for the Sequenom iPLEX system were selected and found to be highly informative for inferring ancestry among the seven continental regions Africa, the Middle East, Europe, Central/South Asia, East Asia, the Americas and Oceania. The panel was found to be least informative for Eurasian populations, and additional AIMs for a higher resolution are suggested. A large reference set including over 4,000 subjects collected from 120 global populations was assembled to facilitate accurate ancestry determination. We show practical applications of this AIM panel, discuss its limitations for admixed individuals and suggest ways to incorporate ancestry information into genetic association studies. Conclusion: We demonstrated the utility of a small AISNP panel specifically developed to discern global ancestry. We believe that it will find wide application because of its feasibility and potential for a wide range of applications.
Affiliation(s)
- Caroline M Nievergelt
- Department of Psychiatry, School of Medicine, University of California San Diego, La Jolla, CA, 92093, USA.
2
Manuck TA, Lai Y, Meis PJ, Sibai B, Spong CY, Rouse DJ, Iams JD, Caritis SN, O'Sullivan MJ, Wapner RJ, Mercer B, Ramin SM, Peaceman AM. Admixture mapping to identify spontaneous preterm birth susceptibility loci in African Americans. Obstet Gynecol 2011; 117:1078-1084. [PMID: 21508746 PMCID: PMC3094723 DOI: 10.1097/aog.0b013e318214e67f] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/26/2022]
Abstract
OBJECTIVE Preterm birth is 1.5 times more common in African American (17.8%) than European American women (11.5%), even after controlling for confounding variables. We hypothesize that genetic factors may account for this disparity and can be identified by admixture mapping. METHODS This is a secondary analysis of women with at least one prior spontaneous preterm birth enrolled in a multicenter prospective study. DNA was extracted and whole-genome amplified from stored saliva samples. Self-identified African American patients were genotyped with a 1,509 single nucleotide polymorphism (SNP) commercially available admixture panel. A logarithm of odds locus-genome score of 1.5 or higher was considered suggestive and 2 or higher was considered significant for a disease locus. RESULTS One hundred seventy-seven African American women with one or more prior spontaneous preterm births were studied. One thousand four hundred fifty SNPs were in Hardy-Weinberg equilibrium and passed quality filters. Individuals had a mean of 78.3% to 87.9% African American ancestry for each SNP. A locus on chromosome 7q21-22 was suggestive of an association with spontaneous preterm birth before 37 weeks of gestation (three SNPs with logarithm of odds scores 1.50-1.99). This signal strengthened when women with at least one preterm birth before 35.0 (eight SNPs with logarithm of odds scores greater than 1.50) and before 32.0 weeks of gestation were considered (15 SNPs with logarithm of odds scores greater than 1.50). No other areas of the genome had logarithm of odds scores higher than 1.5. CONCLUSION Spontaneous preterm birth in African American women may be genetically mediated by a susceptibility locus on chromosome 7. This region contains multiple potential candidate genes, including collagen type 1-α-2 gene and genes involved with calcium regulation.
Affiliation(s)
- Tracy A Manuck
- From the Department of Obstetrics and Gynecology at University of Utah, Salt Lake City, Utah; Wake Forest University Health Sciences, Winston-Salem, North Carolina; University of Tennessee, Memphis, Tennessee; University of Alabama at Birmingham, Birmingham, Alabama; The Ohio State University, Columbus, Ohio; University of Pittsburgh, Pittsburgh, Pennsylvania; University of Miami, Miami, Florida; Drexel University, Philadelphia, Pennsylvania; Case Western Reserve University-MetroHealth Medical Center, Cleveland, Ohio; University of Texas Health Science Center at Houston, Houston, Texas; Northwestern University, Chicago, Illinois; The George Washington University Biostatistics Center, Washington, DC; and the Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, Maryland
3
Shih PAB, Wang L, Chiron S, Wen G, Nievergelt C, Mahata M, Khandrika S, Rao F, Fung MM, Mahata SK, Hamilton BA, O'Connor DT. Peptide YY (PYY) gene polymorphisms in the 3'-untranslated and proximal promoter regions regulate cellular gene expression and PYY secretion and metabolic syndrome traits in vivo. J Clin Endocrinol Metab 2009; 94:4557-66. [PMID: 19820027 PMCID: PMC2775651 DOI: 10.1210/jc.2009-0465] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/19/2022]
Abstract
RATIONALE Obesity is a heritable trait that contributes to hypertension and subsequent cardiorenal disease risk; thus, the investigation of genetic variation that predisposes individuals to obesity is an important goal. Circulating peptide YY (PYY) is known for its appetite and energy expenditure-regulating properties; linkage and association studies have suggested that PYY genetic variation contributes to susceptibility for obesity, rendering PYY an attractive candidate for study of disease risk. DESIGN To explore whether common genetic variation at the human PYY locus influences plasma PYY or metabolic traits, we systematically resequenced the gene for polymorphism discovery and then genotyped common single-nucleotide polymorphisms across the locus in an extensively phenotyped twin sample to determine associations. Finally, we experimentally validated the marker-on-trait associations using PYY 3'-untranslated region (UTR)/reporter and promoter/reporter analyses in neuroendocrine cells. RESULTS Four common genetic variants were discovered across the locus, and three were typed in phenotyped twins. Plasma PYY was highly heritable (P < 0.0001), and genetic pleiotropy was noted between plasma PYY and body mass index (BMI) (P = 0.03). A PYY haplotype extending from the proximal promoter (A-23G, rs2070592) to the 3'-UTR (C+1134A, rs162431) predicted not only plasma PYY (P = 0.009) but also other metabolic syndrome traits. Functional studies with transfected luciferase reporters confirmed regulatory roles in altering gene expression for both 3'-UTR C+1134A (P < 0.001) and promoter A-23G (P = 0.0016). CONCLUSIONS Functional genetic variation at the PYY locus influences multiple heritable metabolic syndrome traits, likely conferring susceptibility to obesity and subsequent cardiorenal disease.
Affiliation(s)
- Pei-An Betty Shih
- Department of Medicine and Pharmacology, Institute for Genomic Medicine, University of California, San Diego 92093-0838, USA
4
Cintado A, Companioni O, Nazabal M, Camacho H, Ferrer A, De Cossio MEF, Marrero A, Ale M, Villarreal A, Leal L, Casalvilla R, Benitez J, Novoa L, Diaz-Horta O, Dueñas M. Admixture estimates for the population of Havana City. Ann Hum Biol 2009; 36:350-60. [DOI: 10.1080/03014460902817984] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 10/20/2022]
Affiliation(s)
- A. Cintado
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- O. Companioni
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- M. Nazabal
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- H. Camacho
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- A. Ferrer
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- A. Marrero
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- M. Ale
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- A. Villarreal
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- L. Leal
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- R. Casalvilla
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- J. Benitez
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- L. Novoa
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
- O. Diaz-Horta
- National Institute of Endocrinology, Zapata y D, Havana, 10400, Cuba
- M. Dueñas
- Center of Genetic Engineering and Biotechnology, Havana, 10600, Cuba
5
Rao M, Balakrishnan VS. The genetic basis of kidney disease risk in African Americans: MYH9 as a new candidate gene. Am J Kidney Dis 2009; 53:579-83. [PMID: 19324247 DOI: 10.1053/j.ajkd.2009.02.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 02/09/2009] [Accepted: 02/09/2009] [Indexed: 11/11/2022]
Affiliation(s)
- Madhumathi Rao
- Division of Nephrology, Tufts Medical Center, Boston, Massachusetts 02111, USA.
6
Admixture mapping provides evidence of association of the VNN1 gene with hypertension. PLoS One 2007; 2:e1244. [PMID: 18043751 PMCID: PMC2080759 DOI: 10.1371/journal.pone.0001244] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Received: 11/01/2007] [Accepted: 11/06/2007] [Indexed: 01/17/2023]
Abstract
Migration patterns in modern societies have created the opportunity to use population admixture as a strategy to identify susceptibility genes. To implement this strategy, we genotyped a highly informative ancestry marker panel of 2270 single nucleotide polymorphisms in a random population sample of African Americans (N = 1743), European Americans (N = 1000) and Mexican Americans (N = 581). We then examined the evidence for over-transmission of specific loci to cases from one of the two ancestral populations. Hypertension cases and controls were defined based on standard clinical criteria. Both case-only and case-control analyses were performed among African Americans. With the genome-wide markers we replicated the findings identified in our previous admixture mapping study on chromosomes 6 and 21 [1]. For case-control analysis we then genotyped 51 missense SNPs in 36 genes spaced across an 18.3 Mb region. Further analyses demonstrated that the missense SNP rs2272996 (or N131S) in the VNN1 gene was significantly associated with hypertension in African Americans and the association was replicated in Mexican Americans; a non-significant opposite association was observed in European Americans. This SNP also accounted for most of the evidence observed in the admixture analysis on chromosome 6. Despite these encouraging results, susceptibility loci for hypertension have been exceptionally difficult to localize and confirmation by independent studies will be necessary to establish these findings.
7
Mao X, Bigham AW, Mei R, Gutierrez G, Weiss KM, Brutsaert TD, Leon-Velarde F, Moore LG, Vargas E, McKeigue PM, Shriver MD, Parra EJ. A genomewide admixture mapping panel for Hispanic/Latino populations. Am J Hum Genet 2007; 80:1171-8. [PMID: 17503334 PMCID: PMC1867104 DOI: 10.1086/518564] [Citation(s) in RCA: 172] [Impact Index Per Article: 10.1] [Received: 12/04/2006] [Accepted: 03/13/2007] [Indexed: 12/27/2022]
Abstract
Admixture mapping (AM) is a promising method for the identification of genetic risk factors for complex traits and diseases showing prevalence differences among populations. Efficient application of this method requires the use of a genomewide panel of ancestry-informative markers (AIMs) to infer the population of origin of chromosomal regions in admixed individuals. Genomewide AM panels with markers showing high frequency differences between West African and European populations are already available for disease-gene discovery in African Americans. However, no such map is yet available for Hispanic/Latino populations, which are the result of two-way admixture between Native American and European populations or of three-way admixture of Native American, European, and West African populations. Here, we report a genomewide AM panel with 2,120 AIMs showing high frequency differences between Native American and European populations. The average intermarker genetic distance is ~1.7 cM. The panel was identified by genotyping, with the Affymetrix GeneChip Human Mapping 500K array, a population sample with European ancestry, a Mesoamerican sample comprising Maya and Nahua from Mexico, and a South American sample comprising Aymara/Quechua from Bolivia and Quechua from Peru. The main criteria for marker selection were both high information content for Native American/European ancestry (measured as the standardized variance of the allele frequencies, also known as "f value") and small frequency differences between the Mesoamerican and South American samples. This genomewide AM panel will make it possible to apply AM approaches in many admixed populations throughout the Americas.
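As a rough illustration of the two marker-selection criteria named in this abstract, the sketch below computes a two-population f value and then filters markers on high Native American/European information content and small Mesoamerican/South American frequency difference. The formula f = δ² / (4·p̄·(1 − p̄)) is one common two-population form of the standardized variance and is an assumption here; the function names, thresholds and allele frequencies are hypothetical and are not taken from the paper.

```python
def f_value(p_a: float, p_b: float) -> float:
    """Standardized variance of allele frequencies between two populations.

    Assumed form: f = (p_a - p_b)^2 / (4 * p_bar * (1 - p_bar)),
    where p_bar is the unweighted mean of the two frequencies.
    """
    p_bar = (p_a + p_b) / 2.0
    if p_bar <= 0.0 or p_bar >= 1.0:
        return 0.0  # allele fixed or absent in both populations: no information
    return (p_a - p_b) ** 2 / (4.0 * p_bar * (1.0 - p_bar))


def is_candidate_aim(p_eur, p_meso, p_south, min_f=0.3, max_within_diff=0.1):
    """Hypothetical filter mirroring the two criteria in the abstract."""
    p_native = (p_meso + p_south) / 2.0  # pooled Native American frequency
    informative = f_value(p_native, p_eur) >= min_f
    homogeneous = abs(p_meso - p_south) <= max_within_diff
    return informative and homogeneous


# Made-up allele frequencies: (European, Mesoamerican, South American)
markers = {
    "snp_1": (0.10, 0.85, 0.90),  # informative and homogeneous
    "snp_2": (0.40, 0.55, 0.50),  # homogeneous but barely informative
    "snp_3": (0.05, 0.95, 0.60),  # informative but heterogeneous within the Americas
}
selected = [name for name, freqs in markers.items() if is_candidate_aim(*freqs)]
print(selected)  # prints ['snp_1']
```

Only the first marker passes both filters; the thresholds would need to be tuned to real data and are shown purely to make the logic concrete.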
Affiliation(s)
- Xianyun Mao
- Department of Anthropology, The Pennsylvania State University, University Park, PA 16801, USA
8
Schork NJ, Greenwood TA, Braff DL. Statistical genetics concepts and approaches in schizophrenia and related neuropsychiatric research. Schizophr Bull 2007; 33:95-104. [PMID: 17035359 PMCID: PMC2632283 DOI: 10.1093/schbul/sbl045] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 02/06/2023]
Abstract
Statistical genetics is a research field that focuses on mathematical models and statistical inference methodologies that relate genetic variations (ie, naturally occurring human DNA sequence variations or "polymorphisms") to particular traits or diseases (phenotypes) usually from data collected on large samples of families or individuals. The ultimate goal of such analysis is the identification of genes and genetic variations that influence disease susceptibility. Although of extreme interest and importance, the fact that many genes and environmental factors contribute to neuropsychiatric diseases of public health importance (eg, schizophrenia, bipolar disorder, and depression) complicates relevant studies and suggests that very sophisticated mathematical and statistical modeling may be required. In addition, large-scale contemporary human DNA sequencing and related projects, such as the Human Genome Project and the International HapMap Project, as well as the development of high-throughput DNA sequencing and genotyping technologies have provided statistical geneticists with a great deal of very relevant and appropriate information and resources. Unfortunately, the use of these resources and their interpretation are not straightforward when applied to complex, multifactorial diseases such as schizophrenia. In this brief and largely nonmathematical review of the field of statistical genetics, we describe many of the main concepts, definitions, and issues that motivate contemporary research. We also provide a discussion of the most pressing contemporary problems that demand further research if progress is to be made in the identification of genes and genetic variations that predispose to complex neuropsychiatric diseases.
Affiliation(s)
- Nicholas J Schork
- Department of Psychiatry, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0603, USA.
9
Martinez-Marignac VL, Valladares A, Cameron E, Chan A, Perera A, Globus-Goldberg R, Wacher N, Kumate J, McKeigue P, O'Donnell D, Shriver MD, Cruz M, Parra EJ. Admixture in Mexico City: implications for admixture mapping of type 2 diabetes genetic risk factors. Hum Genet 2006; 120:807-19. [PMID: 17066296 DOI: 10.1007/s00439-006-0273-3] [Citation(s) in RCA: 107] [Impact Index Per Article: 5.9] [Received: 06/25/2006] [Accepted: 10/02/2006] [Indexed: 11/30/2022]
Abstract
Admixture mapping is a recently developed method for identifying genetic risk factors involved in complex traits or diseases showing prevalence differences between major continental groups. Type 2 diabetes (T2D) is at least twice as prevalent in Native American populations as in populations of European ancestry, so admixture mapping is well suited to study the genetic basis of this complex disease. We have characterized the admixture proportions in a sample of 286 unrelated T2D patients and 275 controls from Mexico City and we discuss the implications of the results for admixture mapping studies. Admixture proportions were estimated using 69 autosomal ancestry-informative markers (AIMs). Maternal and paternal contributions were estimated from geographically informative mtDNA and Y-specific polymorphisms. The average proportions of Native American, European, and West African admixture were estimated as 65, 30, and 5%, respectively. The contributions of Native American ancestors to maternal and paternal lineages were estimated as 90 and 40%, respectively. In a logistic model with higher educational status as dependent variable, the odds ratio for higher educational status associated with an increase from 0 to 1 in European admixture proportions was 9.4 (95% credible interval 3.8-22.6). This association of socioeconomic status with individual admixture proportion shows that genetic stratification in this population is paralleled, and possibly maintained, by socioeconomic stratification. The effective number of generations back to unadmixed ancestors was 6.7 (95% CI 5.7-8.0), from which we can estimate that genome-wide admixture mapping will require typing about 1,400 evenly distributed AIMs to localize genes underlying disease risk between populations of European and Native American ancestry. Sample sizes of about 2,000 cases will be required to detect any locus that contributes an ancestry risk ratio of at least 1.5.
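The odds ratio in this abstract is quoted for a change in European admixture proportion from 0 to 1, a contrast that few pairs of individuals in an admixed sample would actually span. Assuming the logistic model is linear in admixture on the log-odds scale, the reported value can be rescaled to a smaller difference. The helper name below is hypothetical, and the 0.1 step is chosen only for illustration.

```python
import math


def rescale_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Odds ratio implied for a change of `delta` in the predictor, given the
    odds ratio for a one-unit change (assumes linearity on the log-odds scale)."""
    beta = math.log(or_per_unit)  # logistic regression coefficient per unit
    return math.exp(beta * delta)


# Reported values: OR 9.4 (95% credible interval 3.8-22.6) for admixture 0 -> 1
for label, value in [("estimate", 9.4), ("lower", 3.8), ("upper", 22.6)]:
    print(label, round(rescale_odds_ratio(value, 0.1), 2))
```

Under that assumption, a difference of 0.1 in European admixture proportion corresponds to an odds ratio of about 1.25 for higher educational status.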
Affiliation(s)
- Veronica L Martinez-Marignac
- Department of Anthropology, University of Toronto at Mississauga, 3359 Mississauga Rd. Room 4026, South Bldg, L5L 1C6, Mississauga, ON, Canada.
10
Redden DT, Divers J, Vaughan LK, Tiwari HK, Beasley TM, Fernández JR, Kimberly RP, Feng R, Padilla MA, Liu N, Miller MB, Allison DB. Regional admixture mapping and structured association testing: conceptual unification and an extensible general linear model. PLoS Genet 2006; 2:e137. [PMID: 16934005 PMCID: PMC1557785 DOI: 10.1371/journal.pgen.0020137] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 11/24/2005] [Accepted: 07/18/2006] [Indexed: 12/21/2022]
Abstract
Individual genetic admixture estimates, determined both across the genome and at specific genomic regions, have been proposed for use in identifying specific genomic regions harboring loci influencing phenotypes in regional admixture mapping (RAM). Estimates of individual ancestry can be used in structured association tests (SAT) to reduce confounding induced by various forms of population substructure. Although presented as two distinct approaches, we provide a conceptual framework in which both RAM and SAT are special cases of a more general linear model. We clarify which variables are sufficient to condition upon in order to prevent spurious associations and also provide a simple closed form “semiparametric” method of evaluating the reliability of individual admixture estimates. An estimate of the reliability of individual admixture estimates is required to make an inherent errors-in-variables problem tractable. Casting RAM and SAT methods as a general linear model offers enormous flexibility enabling application to a rich set of phenotypes, populations, covariates, and situations, including interaction terms and multilocus models. This approach should allow far wider use of RAM and SAT, often using standard software, in addressing admixture as either a confounder of association studies or a tool for finding loci influencing complex phenotypes in species as diverse as plants, humans, and nonhuman animals.

In recent years, scientific efforts to find genes influencing disease and health-related traits have sought to capitalize on the unique genetic characteristics of admixed populations. Admixture can refer to the event of two or more genetically diverse populations intermating and producing an admixed population. Admixture creates the potential for efficient identification of trait-influencing genes. However, genetic association studies using admixed populations are also prone to incorrectly concluding that a gene is linked and associated with a trait even when it is not. Several researchers have produced promising statistical methodologies for genetic association studies within admixed populations. In this paper, the authors show how these statistical methods can be unified in a broadly applicable regression framework and discuss which variables should be included in the regression models for valid testing. Because the variables required in this regression framework can only be measured with error, the authors show the consequences of these measurement errors and present measurement error correction methods applicable to this problem. By recasting the statistical methods for genetic association studies within admixed populations as regression models, a broader range of modeling and hypothesis testing becomes available.
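The confounding problem described in this abstract can be made concrete with a small simulation. The sketch below is a simplified illustration, not the authors' general linear model: it generates a marker whose allele frequency depends on individual ancestry but which has no effect on the trait, then compares a naive regression of trait on genotype with one that also conditions on ancestry. All variable names and parameter values are hypothetical, ancestry is treated as known without error (the paper stresses that in practice it is not), and the adjusted coefficient is obtained with the standard Frisch-Waugh-Lovell residual approach to keep the code dependency-free.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on x (model includes an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x (model includes an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000
ancestry = [rng.random() for _ in range(n)]  # individual admixture proportion
# Allele frequency rises with ancestry; the marker itself does not affect the trait.
genotype = [sum(rng.random() < 0.1 + 0.8 * a for _ in range(2)) for a in ancestry]
trait = [2.0 * a + rng.gauss(0.0, 1.0) for a in ancestry]

naive = slope(genotype, trait)  # trait ~ genotype
# trait ~ genotype + ancestry: genotype coefficient via residual-on-residual regression
adjusted = slope(residuals(ancestry, genotype), residuals(ancestry, trait))
print(f"naive={naive:.3f} adjusted={adjusted:.3f}")
```

In this setup the naive slope is clearly positive even though the marker is causally irrelevant, while the ancestry-adjusted slope is close to zero, which is the spurious-association behavior the regression framework is meant to control.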
Affiliation(s)
- David T Redden
- Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America.