351
Shriver MD, Kennedy GC, Parra EJ, Lawson HA, Sonpar V, Huang J, Akey JM, Jones KW. The genomic distribution of population substructure in four populations using 8,525 autosomal SNPs. Hum Genomics 2005; 1:274-86. [PMID: 15588487 PMCID: PMC3525267 DOI: 10.1186/1479-7364-1-4-274] [Citation(s) in RCA: 175] [Impact Index Per Article: 9.2] [Indexed: 11/10/2022]
Abstract
Understanding the nature of evolutionary relationships among persons and populations
is important for the efficient application of genome science to biomedical research.
We have analysed 8,525 autosomal single nucleotide polymorphisms (SNPs) in 84
individuals from four populations: African-American, European-American, Chinese and
Japanese. Individual relationships were reconstructed using the allele sharing
distance and the neighbour-joining tree making method. Trees show clear clustering
according to population, with the root branching from the African-American clade. The
African-American cluster is much less star-like than European-American and East Asian
clusters, primarily because of admixture. Furthermore, on the East Asian branch, all
ten Chinese individuals cluster together and all ten Japanese individuals cluster
together. Using positional information, we demonstrate strong correlations between
inter-marker distance and both locus-specific FST (the proportion of total
variation due to differentiation) levels and branch lengths. Chromosomal maps of the
distribution of locus-specific branch lengths were constructed by combining these
data with other published SNP markers (total of 33,704 SNPs). These maps clearly
illustrate a non-uniform distribution of human genetic substructure, an instructional
and useful paradigm for education and research.
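As a rough illustration of the allele sharing distance named in the abstract, the sketch below scores each pair of individuals by the average fraction of alleles they do not share across loci. It assumes genotypes coded as 0, 1 or 2 copies of one allele and uses one common identity-by-state definition; the abstract does not give the authors' exact formula, and the function names and toy genotypes are hypothetical.

```python
from itertools import combinations


def allele_sharing_distance(g1, g2):
    """Mean per-locus allele sharing distance between two individuals.

    Genotypes are assumed to be coded 0/1/2 (copies of one allele);
    None marks a missing call, and that locus is skipped.
    """
    total, typed = 0.0, 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        # |a - b| / 2 is the fraction of alleles NOT shared at this locus
        total += abs(a - b) / 2.0
        typed += 1
    if typed == 0:
        raise ValueError("no locus typed in both individuals")
    return total / typed


def distance_matrix(genotypes):
    """Pairwise distances for a dict of {individual: genotype list}."""
    return {
        (i, j): allele_sharing_distance(genotypes[i], genotypes[j])
        for i, j in combinations(sorted(genotypes), 2)
    }


# Toy data (made up): three individuals typed at four SNPs
toy = {"ind1": [0, 1, 2, 2], "ind2": [0, 1, 2, 1], "ind3": [2, 1, 0, 0]}
distances = distance_matrix(toy)
```

A matrix of this kind is the usual input to a neighbour-joining routine; the tree-building step itself is not shown here.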
Affiliation(s)
- Mark D Shriver
- Penn State University, University Park, Pennsylvania, USA.
352
Wang WYS, Barratt BJ, Clayton DG, Todd JA. Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet 2005; 6:109-18. [PMID: 15716907 DOI: 10.1038/nrg1522] [Citation(s) in RCA: 747] [Impact Index Per Article: 39.3] [Indexed: 12/17/2022]
Abstract
To fully understand the allelic variation that underlies common diseases, complete genome sequencing for many individuals with and without disease is required. This is still not technically feasible. However, recently it has become possible to carry out partial surveys of the genome by genotyping large numbers of common SNPs in genome-wide association studies. Here, we outline the main factors - including models of the allelic architecture of common diseases, sample size, map density and sample-collection biases - that need to be taken into account in order to optimize the cost efficiency of identifying genuine disease-susceptibility loci.
Affiliation(s)
- William Y S Wang
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2XY, UK
353
Reiner AP, Ziv E, Lind DL, Nievergelt CM, Schork NJ, Cummings SR, Phong A, Burchard EG, Harris TB, Psaty BM, Kwok PY. Population structure, admixture, and aging-related phenotypes in African American adults: the Cardiovascular Health Study. Am J Hum Genet 2005; 76:463-77. [PMID: 15660291 PMCID: PMC1196398 DOI: 10.1086/428654] [Citation(s) in RCA: 118] [Impact Index Per Article: 6.2] [Received: 10/07/2004] [Accepted: 01/06/2005] [Indexed: 11/03/2022]
Abstract
U.S. populations are genetically admixed, but surprisingly little empirical data exists documenting the impact of such heterogeneity on type I and type II error in genetic-association studies of unrelated individuals. By applying several complementary analytical techniques, we characterize genetic background heterogeneity among 810 self-identified African American subjects sampled as part of a multisite cohort study of cardiovascular disease in older adults. On the basis of the typing of 24 ancestry-informative biallelic single-nucleotide-polymorphism markers, there was evidence of substantial population substructure and admixture. We used an allele-sharing-based clustering algorithm to infer evidence for four genetically distinct subpopulations. Using multivariable regression models, we demonstrate the complex interplay of genetic and socioeconomic factors on quantitative phenotypes related to cardiovascular disease and aging. Blood glucose level correlated with individual African ancestry, whereas body mass index was associated more strongly with genetic similarity. Blood pressure, HDL cholesterol level, C-reactive protein level, and carotid wall thickness were not associated with genetic background. Blood pressure and HDL cholesterol level varied by geographic site, whereas C-reactive protein level differed by occupation. Both ancestry and genetic similarity predicted the number and quality of years lived during follow-up, but socioeconomic factors largely accounted for these associations. When the 24 genetic markers were tested individually, there was an excess number of marker-trait associations, most of which were attenuated by adjustment for genetic ancestry.
We conclude that the genetic demography underlying older individuals who self-identify as African American is complex, and that controlling for both genetic admixture and socioeconomic characteristics will be required in assessing genetic associations with chronic-disease-related traits in African Americans. Complementary methods that identify discrete subgroups on the basis of genetic similarity may help to further characterize the complex biodemographic structure of human populations.
Affiliation(s)
- Alexander P Reiner
- Department of Epidemiology, University of Washington, Seattle, WA 98101-1448, USA.
354
Bonilla C, Boxill LA, Donald SAM, Williams T, Sylvester N, Parra EJ, Dios S, Norton HL, Shriver MD, Kittles RA. The 8818G allele of the agouti signaling protein (ASIP) gene is ancestral and is associated with darker skin color in African Americans. Hum Genet 2005; 116:402-6. [PMID: 15726415 DOI: 10.1007/s00439-004-1251-2] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.1] [Received: 09/01/2004] [Accepted: 12/14/2004] [Indexed: 12/20/2022]
Abstract
Skin color, a predictor of social interactions and risk factor for several types of cancer, is due to two contrasting forms of melanin, the darker eumelanin and lighter phaeomelanin. The lighter pigment phaeomelanin is the product of the antagonistic function of the agouti signaling protein (ASIP) on the alpha-melanocyte stimulating hormone receptor (MC1R). Studies have shown that a single-nucleotide polymorphism (SNP) in the 3'UTR of the ASIP gene is associated with dark hair and eyes; however, little is known about its role in inter-individual variation in skin color. Here we examine the relationship between the ASIP g.8818A>G SNP and skin color (M index) as assessed by reflectometry in 234 African Americans. Analyses of variance (ANOVA) were performed to evaluate the effects of ASIP genotypes, age, individual ancestry, and sex on skin color variation. Significant effects on M index variation were observed for ASIP genotypes (F(2,236)=4.37, P=0.01), ancestry (F(1,243)=37.2, P<0.001), and sex (F(1,244)=4.08, P=0.05). Subsequent analyses revealed a strong effect on M index from ASIP genotypes in African American females (P<0.001). Our study suggests that the ASIP G>A polymorphism exhibits a dominant effect leading to lighter skin color and that variation in the ASIP gene may have been one of several factors contributing to reductions in pigmentation in some populations. Further study is needed to reveal how interactions between ASIP and several other genes, such as MC1R and P, predict human pigmentation.
Affiliation(s)
- Carolina Bonilla
- Human Cancer Genetics Program, Comprehensive Cancer Center, The Ohio State University, 494 Tzagournis Medical Research Facility, 420 W. 12th Avenue, Columbus, OH 43210, USA
355
Nakamura T, Shoji A, Fujisawa H, Kamatani N. Cluster analysis and association study of structured multilocus genotype data. J Hum Genet 2005; 50:53-61. [PMID: 15696377 DOI: 10.1007/s10038-004-0220-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 09/13/2004] [Accepted: 11/05/2004] [Indexed: 11/28/2022]
Abstract
We propose an algorithm for testing association using structured multilocus genotype data. The algorithm implements the clustering of the data by a hierarchical clustering technique and a k-means algorithm. After clustering, the program analyzes all the clusters together using the Mantel-Haenszel (MH) test, by which common associations in the clusters are examined. To use the MH test, the number of subpopulations has to be determined. A method of cross-validation (CV) and the k-means algorithm are applied for estimating the number of subpopulations. The algorithm described was implemented in the computer program POPSTRUCT. In the simulation study, we found that when the two groups with different marker allele frequencies were combined, an inflation of the type I errors was observed. The inflation was more marked when the differences in the marker allele frequencies were larger, the difference in the minor allele frequencies at the disease locus was larger, and the genotype relative risk associated with the disease locus was higher. Our simulation study indicated that the MH test was efficient for decreasing type I errors and increasing the power compared with any test performed on each cluster. Then, we compared the results of STRUCTURE, a model-based method, and POPSTRUCT, a distance-based method. When two subgroups with different allele frequencies were mixed together at a high fixed ratio, POPSTRUCT was superior to STRUCTURE in classifying the combined population into the accurate clusters, each of which reflects one of the original groups.
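The Mantel-Haenszel step described above can be sketched as follows: one 2x2 case/control-by-allele table per inferred cluster, combined into a single chi-square statistic with one degree of freedom. This is a generic textbook form without continuity correction, not the POPSTRUCT implementation; the function name and the counts are hypothetical.

```python
import math


def mantel_haenszel(strata):
    """Cochran-Mantel-Haenszel test over 2x2 tables, one per cluster.

    Each stratum is (a, b, c, d):
        a = cases with allele,    b = cases without allele
        c = controls with allele, d = controls without allele
    Returns (chi-square with 1 df, p-value, pooled odds ratio).
    No continuity correction is applied.
    """
    observed = expected = variance = 0.0
    or_num = or_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    if variance == 0:
        raise ValueError("no informative strata")
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = or_num / or_den if or_den > 0 else float("inf")
    return chi2, p_value, odds_ratio


# Made-up counts for two clusters
clusters = [(20, 10, 10, 20), (15, 5, 10, 10)]
chi2, p_value, odds_ratio = mantel_haenszel(clusters)
```

Because expected counts are computed within each cluster, allele-frequency differences between clusters do not inflate the statistic the way they would in a pooled table.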
Affiliation(s)
- Takahiro Nakamura
- Division of Statistical Genetics, Institute of Rheumatology, Tokyo Women's Medical University, 10-22 Kawada-cho, Shinjuku-ku, Tokyo, 162-0054, Japan.
- Akira Shoji
- Division of Statistical Genetics, Institute of Rheumatology, Tokyo Women's Medical University, 10-22 Kawada-cho, Shinjuku-ku, Tokyo, 162-0054, Japan
- Naoyuki Kamatani
- Division of Statistical Genetics, Institute of Rheumatology, Tokyo Women's Medical University, 10-22 Kawada-cho, Shinjuku-ku, Tokyo, 162-0054, Japan
356
Abstract
In addition to viral and environmental/behavioural factors, host genetic diversity is believed to contribute to the spectrum of clinical outcomes in hepatitis C virus (HCV) infection. This paper reviews the literature with respect to studies of host genetic determinants of HCV outcome and attempts to highlight trends and synthesise findings. With respect to the susceptibility to HCV infection, several studies have replicated associations of the HLA class II alleles DQB1*0301 and DRB1*11 with self-limiting infection predominantly in Caucasian populations. Meta-analyses yielded summary estimates of 3.0 (95% CI: 1.8-4.8) and 2.5 (95% CI: 1.7-3.7) for the effects of DQB1*0301 and DRB1*11 on self-limiting HCV, respectively. Studies of genetics and the response to interferon-based therapies have largely concerned single-nucleotide polymorphisms and have been inconsistent. Regarding studies of genetics and the progression of HCV-related disease, there is a trend with DRB1*11 alleles and less severe disease. Studies of extrahepatic manifestations of chronic HCV have shown an association between DQB1*11 and DR3 with the formation of cryoglobulins. Some important initial observations have been made with respect to genetic determinants of HCV outcome. Replication studies are needed for many of these associations, as well as biological data on the function of many of these polymorphisms.
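To make the idea of a summary estimate with a 95% CI concrete, the sketch below pools per-study odds ratios with a fixed-effect, inverse-variance method on the log scale. The abstract does not say which pooling method the review used, so this is only one standard option; the function name and the study values are invented for illustration.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def pool_odds_ratios(studies):
    """Fixed-effect pooling of (odds_ratio, ci_low, ci_high) tuples.

    The standard error of each log odds ratio is recovered from the
    width of its reported 95% confidence interval.
    Returns (pooled_or, pooled_ci_low, pooled_ci_high).
    """
    weight_sum = weighted_log_or = 0.0
    for odds_ratio, low, high in studies:
        se = (math.log(high) - math.log(low)) / (2 * Z95)
        weight = 1.0 / se ** 2  # inverse-variance weight
        weight_sum += weight
        weighted_log_or += weight * math.log(odds_ratio)
    pooled = weighted_log_or / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    return (
        math.exp(pooled),
        math.exp(pooled - Z95 * pooled_se),
        math.exp(pooled + Z95 * pooled_se),
    )


# Hypothetical studies: (odds ratio, lower 95% bound, upper 95% bound)
example_studies = [(2.0, 1.0, 4.0), (2.0, 1.0, 4.0)]
pooled_or, pooled_low, pooled_high = pool_odds_ratios(example_studies)
```

Pooling two identical studies leaves the point estimate unchanged and narrows the interval, which is the behaviour expected of a fixed-effect model.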
Affiliation(s)
- L J Yee
- Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA.
357
Abstract
Admixture is an important evolutionary force that can and should be used in efforts to apply genomic data and technology to the study of complex disease genetics. Admixture linkage disequilibrium (ALD) is created by the process of admixture and, in recently admixed populations, extends for substantial distances (of the order of 10 to 20 cM). The amount of ALD generated depends on the level of admixture, ancestry information content of markers and the admixture dynamics of the population, and thus influences admixture mapping (AM). The authors discuss different models of admixture and how these can have an impact on the success of AM studies. Selection of markers is important, since markers informative for parental population ancestry are required and these are uncommon. Rarely does the process of admixture result in a population that is uniform for individual admixture levels, but instead there is substantial population stratification. This stratification can be understood as variation in individual admixtures and can be both a source of statistical power for ancestry-phenotype correlation studies as well as a confounder in causing false-positives in gene association studies. Methods to detect and control for stratification in case/control and AM studies are reviewed, along with recent studies showing individual ancestry-phenotype correlations. Using skin pigmentation as a model phenotype, implications of AM in complex disease gene mapping studies are discussed. Finally, the article discusses some limitations of this approach that should be considered when designing an effective AM study.
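The dependence of ALD on admixture level, marker informativeness and time can be sketched under the simplest single-pulse (hybrid-isolation) model, in which disequilibrium between two markers starts at m(1 - m) times the product of their parental allele-frequency differences and decays by a factor (1 - theta) per generation. This is an assumption made for illustration; the article discusses several admixture models, and continuous gene flow behaves differently. Function and parameter names are hypothetical.

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Convert a map distance in centimorgans to a recombination fraction."""
    morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def admixture_ld(m, delta_a, delta_b, distance_cm, generations):
    """Expected admixture LD under a single-pulse admixture model.

    m            admixture proportion from one parental population
    delta_a/b    allele-frequency difference between the parental
                 populations at marker A and marker B
    distance_cm  map distance between the two markers
    generations  generations elapsed since the admixture event
    """
    theta = haldane_recombination_fraction(distance_cm)
    initial = m * (1.0 - m) * delta_a * delta_b
    return initial * (1.0 - theta) ** generations


# Illustrative values only: 20% admixture, highly informative markers
near = admixture_ld(0.2, 0.8, 0.8, distance_cm=1.0, generations=10)
far = admixture_ld(0.2, 0.8, 0.8, distance_cm=20.0, generations=10)
```

With these inputs the disequilibrium at 20 cM is smaller than at 1 cM but still above zero after ten generations, consistent with ALD extending over tens of centimorgans in recently admixed populations.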
Affiliation(s)
- Indrani Halder
- Department of Anthropology, Pennsylvania State University, University Park, PA 16801, USA
- Mark D Shriver
- Department of Anthropology, Pennsylvania State University, University Park, PA 16801, USA
358
Yang BZ, Zhao H, Kranzler HR, Gelernter J. Practical population group assignment with selected informative markers: Characteristics and properties of Bayesian clustering via STRUCTURE. Genet Epidemiol 2005; 28:302-12. [PMID: 15782414 DOI: 10.1002/gepi.20070] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.3] [Indexed: 11/06/2022]
Abstract
Population stratification, which is caused by population genetic substructure (PGS), is a critical issue for the design and interpretation of genetic association studies. Methods to address this problem have been devised, but little is known at this point about practical genotyping requirements for resolving PGS based on different marker characteristics. In this report, we seek to (1) identify a small, practical marker set to differentiate African Americans (AAs) from European Americans (EAs), and (2) assess the impact of marker efficiency and sample size on clustering individuals into subgroups by the methods of STRUCTURE (Pritchard et al., [2000a] Genetics 155:945-959). A panel of 37 markers was genotyped for 865 individuals (640 EAs and 225 AAs) from the Northeastern United States. Among EAs, the assignment accuracy reached >99% using only the 4 most efficient markers. Among AAs, the assignment accuracy exceeded 95% when using the 6 most informative markers. Smaller sample size increased the variance in population differentiation, rather than degrading the results consistently. We conclude that the use of marker-efficiency measures for marker selection yielded a relatively small set of STR markers that were effective at differentiating EA and AA populations. The number of markers required is much lower than has been suggested in previous studies.
Affiliation(s)
- Bao-Zhu Yang
- Department of Psychiatry, Yale University School of Medicine, New Haven, Connecticut 06516, USA
359
Shields AE, Fortun M, Hammonds EM, King PA, Lerman C, Rapp R, Sullivan PF. The use of race variables in genetic studies of complex traits and the goal of reducing health disparities: A transdisciplinary perspective. Am Psychol 2005; 60:77-103. [PMID: 15641924 DOI: 10.1037/0003-066x.60.1.77] [Citation(s) in RCA: 120] [Impact Index Per Article: 6.3] [Indexed: 11/08/2022]
Abstract
The use of racial variables in genetic studies has become a matter of intense public debate, with implications for research design and translation into practice. Using research on smoking as a springboard, the authors examine the history of racial categories, current research practices, and arguments for and against using race variables in genetic analyses. The authors argue that the sociopolitical constructs appropriate for monitoring health disparities are not appropriate for use in genetic studies investigating the etiology of complex diseases. More powerful methods for addressing population structure exist, and race variables are unacceptable as gross proxies for numerous social/environmental factors that disproportionately affect minority populations. The authors conclude with recommendations for genetic researchers and policymakers, aimed at facilitating better science and producing new knowledge useful for reducing health disparities.
Affiliation(s)
- Alexandra E Shields
- Health Policy Institute, Georgetown Public Policy Institute, Georgetown University, Washington, DC 20002, USA.
360
Sillanpää MJ, Bhattacharjee M. Bayesian association-based fine mapping in small chromosomal segments. Genetics 2005; 169:427-39. [PMID: 15371355 PMCID: PMC1448870 DOI: 10.1534/genetics.104.032680] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Received: 06/22/2004] [Accepted: 09/16/2004] [Indexed: 11/18/2022]
Abstract
A Bayesian method for fine mapping is presented, which deals with multiallelic markers (with two or more alleles), unknown phase, missing data, multiple causal variants, and both continuous and binary phenotypes. We consider small chromosomal segments spanned by a dense set of closely linked markers and putative genes only at marker points. In the phenotypic model, locus-specific indicator variables are used to control inclusion in or exclusion from marker contributions. To account for covariance between consecutive loci and to control fluctuations in association signals along a candidate region we introduce a joint prior for the indicators that depends on genetic or physical map distances. The potential of the method, including posterior estimation of trait-associated loci, their effects, linkage disequilibrium pattern due to close linkage of loci, and the age of a causal variant (time to most recent common ancestor), is illustrated with the well-known cystic fibrosis and Friedreich ataxia data sets by assuming that haplotypes were not available. In addition, simulation analysis with large genetic distances is shown. Estimation of model parameters is based on Markov chain Monte Carlo (MCMC) sampling and is implemented using WinBUGS. The model specification code is freely available for research purposes from http://www.rni.helsinki.fi/~mjs/.
Affiliation(s)
- Mikko J Sillanpää
- Rolf Nevanlinna Institute, University of Helsinki, FIN-00014 Helsinki, Finland.
361
Abstract
The epidemiologic approach enables the systematic evaluation of potential improvements in the safety and efficacy of drug treatment which might result from targeting treatment on the basis of genomic information. The main epidemiologic designs are the randomized control trial, the cohort study, and the case-control study, and derivatives of these proposed for investigating gene-environment interactions. However, no one design is ideal for every situation, and methodological issues, notably selection bias, information bias, confounding and chance, all play a part in determining which study design is best for a given situation. There is also a need to employ a range of different designs to establish a portfolio of evidence about specific gene-drug interactions. In view of the complexity of gene-drug interactions, pooling of data across studies is likely to be needed in order to have adequate statistical power to test hypotheses. We suggest that there may be opportunities (i) to exploit samples from trials already completed to investigate possible gene-drug interactions; (ii) to consider the use of the case-only design nested within randomized controlled trials as a possible means of reducing genotyping costs when dichotomous outcomes are being investigated; and (iii) to make use of population-based disease registries that can be linked with tissue samples, treatment information and death records, to investigate gene-treatment interactions in survival.
Affiliation(s)
- Julian Little
- Department of Epidemiology and Community Medicine, University of Ottawa, 451 Smyth Rd, Ottawa, Ontario K1H 8M5, Canada.
362
McKeigue PM. Prospects for admixture mapping of complex traits. Am J Hum Genet 2005; 76:1-7. [PMID: 15540159 PMCID: PMC1196412 DOI: 10.1086/426949] [Citation(s) in RCA: 141] [Impact Index Per Article: 7.4] [Received: 08/10/2004] [Accepted: 10/07/2004] [Indexed: 01/23/2023]
Abstract
Admixture mapping extends to human populations the principles that underlie linkage analysis of an experimental cross. For detecting genes that contribute to ethnic variation in disease risk, admixture mapping has greater statistical power than family-linkage studies. In comparison with association studies, admixture mapping requires far fewer markers to search the genome and is less affected by allelic heterogeneity. Statistical-analysis programs for admixture mapping are now available, and a genomewide panel of markers for admixture mapping in populations formed by West African-European admixture has been assembled. Some of the remaining technical challenges include the ability to ensure that the statistical methods are robust and to develop marker panels for other admixed populations. Where admixed populations and panels of markers informative for ancestry are available, admixture mapping can be applied to localize genes that contribute to ethnic variation in any measurable trait.
Affiliation(s)
- Paul M McKeigue
- Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland.
363
Parra EJ, Kittles RA, Shriver MD. Implications of correlations between skin color and genetic ancestry for biomedical research. Nat Genet 2004; 36:S54-60. [PMID: 15508005 DOI: 10.1038/ng1440] [Citation(s) in RCA: 166] [Impact Index Per Article: 8.3] [Received: 08/06/2004] [Accepted: 09/09/2004] [Indexed: 02/06/2023]
Abstract
Skin pigmentation is a central element of most discussions on 'race' and genetics. Research on the genetic basis of population variation in this phenotype, which is important in mediating both social experiences and environmental exposures, is sparse. We studied the relationship between pigmentation and ancestry in five populations of mixed ancestry with a wide range of pigmentation and ancestral proportions (African Americans from Washington, DC; African Caribbeans living in England; Puerto Ricans from New York; Mexicans from Guerrero; and Hispanics from San Luis Valley). The strength of the relationship between skin color and ancestry was quite variable, with the correlations ranging in intensity from moderately strong (Puerto Rico, rho = 0.633) to weak (Mexico, rho = 0.212). These results demonstrate the utility of ancestry-informative genetic markers and admixture methods and emphasize the need to be cautious when using pigmentation as a proxy of ancestry or when extrapolating the results from one admixed population to another.
Affiliation(s)
- E J Parra
- Department of Anthropology, University of Toronto at Mississauga, Mississauga, Ontario L5L 1C6, Canada
364
Montana G, Pritchard JK. Statistical tests for admixture mapping with case-control and cases-only data. Am J Hum Genet 2004; 75:771-89. [PMID: 15386213 PMCID: PMC1182107 DOI: 10.1086/425281] [Citation(s) in RCA: 121] [Impact Index Per Article: 6.1] [Received: 06/01/2004] [Accepted: 07/28/2004] [Indexed: 11/03/2022]
Abstract
Admixture mapping is a promising new tool for discovering genes that contribute to complex traits. This mapping approach uses samples from recently admixed populations to detect susceptibility loci at which the risk alleles have different frequencies in the original contributing populations. Although the idea for admixture mapping has been around for more than a decade, the genomic tools are only now becoming available to make this a feasible and attractive option for complex-trait mapping. In this article, we describe new statistical methods for analyzing multipoint data from admixture-mapping studies to detect "ancestry association." The new test statistics do not assume a particular disease model; instead, they are based simply on the extent to which the sample's ancestry proportions at a locus deviate from the genome average. Our power calculations show that, for loci at which the underlying risk-allele frequencies are substantially different in the ancestral populations, the power of admixture mapping can be comparable to that of association mapping but with a far smaller number of markers. We also show that, although "ancestry informative markers" (AIMs) are superior to random single-nucleotide polymorphisms (SNPs), random SNPs can perform quite well when AIMs are not available. Hence, researchers who study admixed populations in which AIMs are not available can perform admixture mapping with the use of modestly higher densities of random markers. Software to perform the gene-mapping calculations, "MALDsoft," is freely available on the Pritchard Lab Web site.
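The core idea of a statistic based on how far locus ancestry deviates from the genome average can be sketched for a cases-only sample as below. This is a deliberately simplified z-score, not the statistic implemented in MALDsoft: it assumes locus ancestry is known without error and that the two alleles of each individual are independent draws with probability equal to that individual's genome-wide ancestry. All names and values are hypothetical.

```python
import math


def ancestry_deviation_z(locus_ancestry, genome_ancestry):
    """Z-score for excess ancestry at one locus among cases.

    locus_ancestry   per-case fraction of alleles at the locus derived
                     from population A (0, 0.5 or 1)
    genome_ancestry  per-case genome-wide ancestry from population A
    Under the null, each case's locus ancestry has mean q and variance
    q * (1 - q) / 2 (two independent alleles; a simplifying assumption).
    """
    if len(locus_ancestry) != len(genome_ancestry):
        raise ValueError("inputs must have one entry per individual")
    deviation = variance = 0.0
    for x, q in zip(locus_ancestry, genome_ancestry):
        deviation += x - q
        variance += q * (1.0 - q) / 2.0
    if variance == 0:
        raise ValueError("no individual has admixed ancestry")
    return deviation / math.sqrt(variance)


# Made-up example: four cases, each with 80% genome-wide ancestry from A
z = ancestry_deviation_z([1.0, 1.0, 0.5, 1.0], [0.8, 0.8, 0.8, 0.8])
```

A large positive or negative value across many cases would flag the locus as a candidate; a real analysis would also need to handle uncertainty in the inferred local ancestry and multiple testing along the genome.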
Affiliation(s)
- Giovanni Montana
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
365
Shi Y, Zhao X, Yu L, Tao R, Tang J, La Y, Duan Y, Gao B, Gu N, Xu Y, Feng G, Zhu S, Liu H, Salter H, He L. Genetic structure adds power to detect schizophrenia susceptibility at SLIT3 in the Chinese Han population. Genome Res 2004; 14:1345-9. [PMID: 15231749 PMCID: PMC442150 DOI: 10.1101/gr.1758204] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.7] [Indexed: 11/25/2022]
Abstract
The Chinese Han population, the largest population in the world, has traditionally been geographically divided into two parts, the Southern Han and Northern Han. In practice, however, these commonly used ethnic labels are both insufficient and inaccurate as descriptors of inferred genetic clustering, and can lead to the observation of "spurious association" as well as the concealment of real association. In this study, we attempted to address this problem by using 14 microsatellite markers to reconstruct the population genetic structure in 768 Han Chinese samples, including 384 Southern Han and 384 Northern Han, and in samples from Chinese minorities including 48 Yao and 48 BouYei subjects. Furthermore, with a dense set of markers around the region 5q34-35, we built fine-scale haplotype networks for each population/subpopulation and tested for association to schizophrenia susceptibility. We found that more variants in SLIT3 tend to associate with schizophrenia susceptibility in the genetically structured samples, compared to geographically structured samples and samples without identified population substructure. Our results imply that identifying the hidden genetic substructure adds power when detecting association, and suggest that SLIT3 or a nearby gene is associated with schizophrenia.
Affiliation(s)
- YongYong Shi
- Bio-X Life Science Research Center, Shanghai Jiao Tong University, Shanghai 200030, China
366
Cuthbert AP, Fisher SA, Lewis CM, Mathew CG, Sanderson J, Forbes A. Genetic association between EPHX1 and Crohn's disease: population stratification, genotyping error, or random chance? Gut 2004; 53:1386. [PMID: 15306604 PMCID: PMC1774173 DOI: 10.1136/gut.2003.032946] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 12/08/2022]
367
Simko I. One potato, two potato: haplotype association mapping in autotetraploids. Trends Plant Sci 2004; 9:441-8. [PMID: 15337494 DOI: 10.1016/j.tplants.2004.07.003] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Indexed: 05/03/2023]
Affiliation(s)
- Ivan Simko
- USDA-ARS, Vegetable Laboratory, Bldg 010A, 10300 Baltimore Avenue, Beltsville, MD 20705, USA.
368
Redden DT, Allison DB. Nonreplication in genetic association studies of obesity and diabetes research. J Nutr 2004; 133:3323-6. [PMID: 14608039 DOI: 10.1093/jn/133.11.3323] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.8] [Indexed: 11/14/2022]
Abstract
The objective of this article is to provide an overview of the existing literature concerning the identification of genetic markers associated with obesity and diabetes. Specifically, this article will review recent association studies of diabetes and obesity with an emphasis on the need for the replication of findings. Unfortunately, a substantial number of the published associations between genetic markers and phenotypes, including diabetes and obesity, have not been replicated. Literature that addresses the potential reasons for the nonreplication of association studies (population stratification, publication bias, effect heterogeneity, Type I errors and lack of statistical power) is summarized. Recommendations to improve future association studies are presented.
Collapse
Affiliation(s)
- David T Redden
- Department of Biostatistics, Section on Statistical Genetics and Clinical Nutrition Research Center, University of Alabama at Birmingham, Birmingham, Alabama, USA.
|
369
|
Bamshad M, Wooding S, Salisbury BA, Stephens JC. Deconstructing the relationship between genetics and race. Nat Rev Genet 2004; 5:598-609. [PMID: 15266342 DOI: 10.1038/nrg1401] [Citation(s) in RCA: 237] [Impact Index Per Article: 11.9] [Indexed: 12/26/2022]
Affiliation(s)
- Michael Bamshad
- Department of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA.
|
370
|
Shriver MD, Kittles RA. Genetic ancestry and the search for personalized genetic histories. Nat Rev Genet 2004; 5:611-8. [PMID: 15266343 DOI: 10.1038/nrg1405] [Citation(s) in RCA: 153] [Impact Index Per Article: 7.7] [Indexed: 11/09/2022]
Affiliation(s)
- Mark D Shriver
- Department of Anthropology, Penn State University, University Park, Pennsylvania 16802, USA.
|
371
|
Koller DL, Peacock M, Lai D, Foroud T, Econs MJ. False positive rates in association studies as a function of degree of stratification. J Bone Miner Res 2004; 19:1291-5. [PMID: 15231016 DOI: 10.1359/jbmr.040409] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 11/14/2003] [Revised: 03/22/2004] [Accepted: 04/19/2004] [Indexed: 11/18/2022]
Abstract
To explore the degree to which stratification can cause spurious positive association results, we tested for association between BMD and 373 genetic markers using 381 white and 126 black females. The rate of positive results doubled as the proportion of stratification increased, showing the importance of controlling for stratification in association studies.
INTRODUCTION: Population-based association studies are commonly used to test the relationship between polymorphisms in a candidate gene and a disease or trait of interest. Although the collection of samples for this type of study design is relatively cost-effective, the statistical analysis may be susceptible to false positive results because of the effects of population stratification. Such results may occur when the underlying populations differ in both the polymorphism allele frequency and mean trait value.
MATERIALS AND METHODS: To explore the degree to which stratification can cause spurious positive association results, we analyzed femoral neck BMD data from an unrelated sample of 381 white and 126 black premenopausal females. As part of a previous genome screen, 373 microsatellite markers had been genotyped for each individual. For simplicity of interpretation, each multiallelic marker was reduced to a biallelic marker, with the most common allele as one allele and all other alleles combined as the second allele. As expected, the black women differed substantially for marker allele frequencies and had significantly higher mean femoral neck BMD than their white counterparts. Random subsets of the white and black samples were sampled, with increasing proportions of stratification (0%, 1%, 2%, 5%, 10%, 15%, and 20% black subjects) in the total analyzed sample. ANOVA was used to test for association between the recoded marker and femoral neck BMD.
RESULTS AND CONCLUSIONS: The rate of positive results for the association test was observed to double as the proportion of stratification increased, with substantial increases in the frequency of false positives even for stratification proportions as small as 2-5%. These results show the importance of controlling for stratification when the trait and the polymorphism allele frequency differ between the races.
Affiliation(s)
- Daniel L Koller
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana 46202-5251, USA.
|
372
|
Bonilla C, Shriver MD, Parra EJ, Jones A, Fernández JR. Ancestral proportions and their association with skin pigmentation and bone mineral density in Puerto Rican women from New York city. Hum Genet 2004; 115:57-68. [PMID: 15118905 DOI: 10.1007/s00439-004-1125-7] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.8] [Received: 10/31/2003] [Accepted: 03/18/2004] [Indexed: 12/01/2022]
Abstract
Hispanic and African American populations exhibit an increased risk of obesity compared with populations of European origin, a feature that may be related to inherited risk alleles from Native American and West African parental populations. However, a relationship between West African ancestry and obesity-related traits, such as body mass index (BMI), fat mass (FM), and fat-free mass (FFM), and with bone mineral density (BMD) in African American women has only recently been reported. In order to evaluate further the influence of ancestry on body composition phenotypes, we studied a Hispanic population with substantial European, West African, and Native American admixture. We ascertained a sample of Puerto Rican women living in New York (n = 64), for whom we measured BMI and body composition variables, such as FM, FFM, percent body fat, and BMD. Additionally, skin pigmentation was measured as the melanin index by reflectance spectroscopy. We genotyped 35 autosomal ancestry informative markers and estimated population and individual ancestral proportions in terms of European, West African, and Native American contributions to this population. The ancestry proportions corresponding to the three parental populations are: 53.3 ± 2.8% European, 29.1 ± 2.3% West African, and 17.6 ± 2.4% Native American. We detected significant genetic structure in this population with a number of different tests. A highly significant correlation was found between skin pigmentation and individual ancestry (R² = 0.597, P < 0.001) that was not attributable to differences in socioeconomic status. A significant association was also found between BMD and European admixture (R² = 0.065, P = 0.042), but no such correlation was evident with BMI or the remaining body composition measurements. We discuss the implications of our findings for the potential use of this Hispanic population for admixture mapping.
Affiliation(s)
- Carolina Bonilla
- National Human Genome Center, Howard University, Washington, DC 20060, USA.
|
373
|
Affiliation(s)
- Mark A Beaumont
- School of Animal and Microbial Sciences, University of Reading, Whiteknights, P.O. Box 228, Reading RG6 6AJ, UK.
|
374
|
Seldin MF, Morii T, Collins-Schramm HE, Chima B, Kittles R, Criswell LA, Li H. Putative ancestral origins of chromosomal segments in individual African Americans: implications for admixture mapping. Genome Res 2004; 14:1076-84. [PMID: 15140829 PMCID: PMC419786 DOI: 10.1101/gr.2165904] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022]
Abstract
Theoretically, markers that distinguish European from West African ancestry can be used to examine the origin of chromosomal segments in individual African Americans. In this study, putative ancestral origin was examined by using haplotypes estimated from genotyping 268 African Americans for 29 ancestry informative markers spaced over a 60-cM segment of chromosome 5. Analyses using a Bayesian algorithm (STRUCTURE) provided evidence that blocks of individual chromosomes derive from one or the other parental population. In addition, modeling studies were performed by using hidden real marker data to simulate patient and control populations under different genotypic risk ratios. Ancestry analysis showed significant results for a genotypic risk ratio of 2.5 in the African American population for modeled susceptibility genes derived from either putative parental population. These studies suggest that admixture mapping in the African American population can provide a powerful approach to defining genetic factors for some disease phenotypes.
Affiliation(s)
- Michael F Seldin
- Rowe Program in Human Genetics, Departments of Biological Chemistry and Medicine, University of California at Davis, Davis, California 95616-8669, USA.
|
375
|
Hoggart CJ, Shriver MD, Kittles RA, Clayton DG, McKeigue PM. Design and analysis of admixture mapping studies. Am J Hum Genet 2004; 74:965-78. [PMID: 15088268 PMCID: PMC1181989 DOI: 10.1086/420855] [Citation(s) in RCA: 242] [Impact Index Per Article: 12.1] [Received: 01/08/2004] [Accepted: 03/02/2004] [Indexed: 01/22/2023] Open
Abstract
Admixture between populations originating on different continents can be exploited to detect disease susceptibility loci at which risk alleles are distributed differentially between these populations. We first examine the statistical power and mapping resolution of this approach in the limiting situation in which gamete admixture and locus ancestry are measured without uncertainty. We show that, for a rare disease, the most efficient design is to study affected individuals only. In a typical African American population (two-way admixture proportions 0.8/0.2, ancestry crossover rate 2 per 100 cM), a study of 800 affected individuals has 90% power to detect at P values < 10⁻⁵ a locus that generates a risk ratio of 2 between populations, with an expected mapping resolution (size of 95% confidence region for the position of the locus) of 4 cM. In practice, to infer locus ancestry from marker data requires Bayesian computationally intensive methods, as implemented in the program ADMIXMAP. Affected-only study designs require strong prior information on the frequencies of each allele given locus ancestry. We show how data from unadmixed and admixed populations can be combined to estimate these ancestry-specific allele frequencies within the admixed population under study, allowing for variation between allele frequencies in unadmixed and admixed populations. Using simulated data based on the genetic structure of the African American population, we show that 60% of information can be extracted in a test for linkage using markers with an ancestry information content of 36% at 3-cM spacing. As in classic linkage studies, the most efficient strategy is to use markers at a moderate density for an initial genome search and then to saturate regions of putative linkage with additional markers, to extract nearly all information about locus ancestry.
Affiliation(s)
- C J Hoggart
- Noncommunicable Disease Epidemiology Unit, London School of Hygiene & Tropical Medicine, London WC1E 7HT, United Kingdom.
|
376
|
Smith MW, Patterson N, Lautenberger JA, Truelove AL, McDonald GJ, Waliszewska A, Kessing BD, Malasky MJ, Scafe C, Le E, De Jager PL, Mignault AA, Yi Z, De The G, Essex M, Sankale JL, Moore JH, Poku K, Phair JP, Goedert JJ, Vlahov D, Williams SM, Tishkoff SA, Winkler CA, De La Vega FM, Woodage T, Sninsky JJ, Hafler DA, Altshuler D, Gilbert DA, O'Brien SJ, Reich D. A high-density admixture map for disease gene discovery in African Americans. Am J Hum Genet 2004; 74:1001-13. [PMID: 15088270 PMCID: PMC1181963 DOI: 10.1086/420856] [Citation(s) in RCA: 379] [Impact Index Per Article: 19.0] [Received: 01/26/2004] [Accepted: 03/03/2004] [Indexed: 11/03/2022] Open
Abstract
Admixture mapping (also known as "mapping by admixture linkage disequilibrium," or MALD) provides a way of localizing genes that cause disease, in admixed ethnic groups such as African Americans, with approximately 100 times fewer markers than are required for whole-genome haplotype scans. However, it has not been possible to perform powerful scans with admixture mapping because the method requires a dense map of validated markers known to have large frequency differences between Europeans and Africans. To create such a map, we screened through databases containing approximately 450000 single-nucleotide polymorphisms (SNPs) for which frequencies had been estimated in African and European population samples. We experimentally confirmed the frequencies of the most promising SNPs in a multiethnic panel of unrelated samples and identified 3011 as a MALD map (1.2 cM average spacing). We estimate that this map is approximately 70% informative in differentiating African versus European origins of chromosomal segments. This map provides a practical and powerful tool, which is freely available without restriction, for screening for disease genes in African American patient cohorts. The map is especially appropriate for those diseases that differ in incidence between the parental African and European populations.
Affiliation(s)
- Michael W Smith
- Laboratory of Genomic Diversity, National Cancer Institute, Frederick, MD, USA
|
377
|
Patterson N, Hattangadi N, Lane B, Lohmueller KE, Hafler DA, Oksenberg JR, Hauser SL, Smith MW, O’Brien SJ, Altshuler D, Daly MJ, Reich D. Methods for high-density admixture mapping of disease genes. Am J Hum Genet 2004; 74:979-1000. [PMID: 15088269 PMCID: PMC1181990 DOI: 10.1086/420871] [Citation(s) in RCA: 369] [Impact Index Per Article: 18.5] [Received: 01/20/2004] [Accepted: 03/03/2004] [Indexed: 01/12/2023] Open
Abstract
Admixture mapping (also known as "mapping by admixture linkage disequilibrium," or MALD) has been proposed as an efficient approach to localizing disease-causing variants that differ in frequency (because of either drift or selection) between two historically separated populations. Near a disease gene, patient populations descended from the recent mixing of two or more ethnic groups should have an increased probability of inheriting the alleles derived from the ethnic group that carries more disease-susceptibility alleles. The central attraction of admixture mapping is that, since gene flow has occurred recently in modern populations (e.g., in African and Hispanic Americans in the past 20 generations), it is expected that admixture-generated linkage disequilibrium should extend for many centimorgans. High-resolution marker sets are now becoming available to test this approach, but progress will require (a) computational methods to infer ancestral origin at each point in the genome and (b) empirical characterization of the general properties of linkage disequilibrium due to admixture. Here we describe statistical methods to estimate the ancestral origin of a locus on the basis of the composite genotypes of linked markers, and we show that this approach accurately estimates states of ancestral origin along the genome. We apply this approach to show that strong admixture linkage disequilibrium extends, on average, for 17 cM in African Americans. Finally, we present power calculations under varying models of disease risk, sample size, and proportions of ancestry. Studying approximately 2500 markers in approximately 2500 patients should provide power to detect many regions contributing to common disease. A particularly important result is that the power of an admixture mapping study to detect a locus will be nearly the same for a wide range of mixture scenarios: the mixture proportion should be 10%-90% from both ancestral populations.
Affiliation(s)
- Nick Patterson
- Program in Medical and Population Genetics, Broad Institute, and Whitehead Institute for Biomedical Research, Cambridge, MA; Department of Genetics and Laboratory of Molecular Immunology, Harvard Medical School, Departments of Medicine and Molecular Biology, Massachusetts General Hospital, and Center for Neurologic Disease, Brigham and Women's Hospital, Boston; Georgetown University, Washington, DC; Department of Neurology, University of California at San Francisco, San Francisco; and Laboratory of Genomic Diversity, National Cancer Institute, and Basic Research Program, Science Applications International Corporation, Frederick, MD
|
378
|
Marchini J, Cardon LR, Phillips MS, Donnelly P. The effects of human population structure on large genetic association studies. Nat Genet 2004; 36:512-7. [PMID: 15052271 DOI: 10.1038/ng1337] [Citation(s) in RCA: 596] [Impact Index Per Article: 29.8] [Received: 12/01/2003] [Accepted: 03/12/2004] [Indexed: 01/21/2023]
Abstract
Large-scale association studies hold substantial promise for unraveling the genetic basis of common human diseases. A well-known problem with such studies is the presence of undetected population structure, which can lead to both false positive results and failures to detect genuine associations. Here we examine approximately 15,000 genome-wide single-nucleotide polymorphisms typed in three population groups to assess the consequences of population structure on the coming generation of association studies. The consequences of population structure on association outcomes increase markedly with sample size. For the size of study needed to detect typical genetic effects in common diseases, even the modest levels of population structure within population groups cannot safely be ignored. We also examine one method for correcting for population structure (Genomic Control). Although it often performs well, it may not correct for structure if too few loci are used and may overcorrect in other settings, leading to substantial loss of power. The results of our analysis can guide the design of large-scale association studies.
Affiliation(s)
- Jonathan Marchini
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
|
379
|
Freedman ML, Reich D, Penney KL, McDonald GJ, Mignault AA, Patterson N, Gabriel SB, Topol EJ, Smoller JW, Pato CN, Pato MT, Petryshen TL, Kolonel LN, Lander ES, Sklar P, Henderson B, Hirschhorn JN, Altshuler D. Assessing the impact of population stratification on genetic association studies. Nat Genet 2004; 36:388-93. [PMID: 15052270 DOI: 10.1038/ng1333] [Citation(s) in RCA: 562] [Impact Index Per Article: 28.1] [Received: 10/15/2003] [Accepted: 02/23/2004] [Indexed: 11/09/2022]
Abstract
Population stratification refers to differences in allele frequencies between cases and controls due to systematic differences in ancestry rather than association of genes with disease. It has been proposed that false positive associations due to stratification can be controlled by genotyping a few dozen unlinked genetic markers. To assess stratification empirically, we analyzed data from 11 case-control and case-cohort association studies. We did not detect statistically significant evidence for stratification but did observe that assessments based on a few dozen markers lack power to rule out moderate levels of stratification that could cause false positive associations in studies designed to detect modest genetic risk factors. After increasing the number of markers and samples in a case-cohort study (the design most immune to stratification), we found that stratification was in fact present. Our results suggest that modest amounts of stratification can exist even in well designed studies.
Affiliation(s)
- Matthew L Freedman
- Department of Medicine and Molecular Biology, Massachusetts General Hospital, Boston, and Program in Medical and Population Genetics, Broad Institute, Cambridge, USA
|
380
|
Abstract
Pharmacogenetics, the inherited basis for interindividual differences in drug response, has rapidly expanded with the advent of new molecular tools and the sequencing of the human genome, yielding pharmacogenomics. We review here recent ideas and findings regarding pharmacogenomics of components of the autonomic nervous system, in particular, neuronal nicotinic acetylcholine receptors, postsynaptic receptors with which the parasympathetic and sympathetic neurotransmitters, acetylcholine (ACh) and norepinephrine, respectively, interact. The receptor subtypes that mediate these responses, M1-3 muscarinic cholinergic receptors (mAChRs), and α1A,B,D-, α2A,B,C-, and β1,2,3-adrenergic receptors (AR), show highly variable expression of genetic variants; variants of mAChRs and α1-ARs are relatively rare, whereas α2-AR and β-AR subtype variants are quite common. The largest amount of data is available regarding variants of the latter ARs and represents efforts to associate certain receptor genotypes, most commonly, single nucleotide polymorphisms, with particular phenotypes (e.g., cardiovascular and metabolic responses). In vitro and in vivo studies have yielded inconsistent results; definitive conclusions are limited. We identify several conceptual and methodological problems with available data: sample size, ethnicity, tissue differences, coding versus noncoding variants, limited studies of haplotypes, and interaction among variants. Thus, although progress has been made in identifying genetic variation that influences drug response of autonomic nervous system components, we are still at the early stages of defining the most critical genetic determinants and their role in human physiology and pharmacology.
Affiliation(s)
- Shelli L Kirstein
- Department of Pharmacology, University of California, San Diego, 9500 Gilman Dr., 0636, La Jolla, CA 92093-0636, USA
|
381
|
Bonilla C, Parra EJ, Pfaff CL, Dios S, Marshall JA, Hamman RF, Ferrell RE, Hoggart CL, McKeigue PM, Shriver MD. Admixture in the Hispanics of the San Luis Valley, Colorado, and its implications for complex trait gene mapping. Ann Hum Genet 2004; 68:139-53. [PMID: 15008793 DOI: 10.1046/j.1529-8817.2003.00084.x] [Citation(s) in RCA: 121] [Impact Index Per Article: 6.1] [Indexed: 11/20/2022]
Abstract
Hispanic populations are a valuable resource that can and should facilitate the identification of complex trait genes by means of admixture mapping (AM). In this paper we focus on a particular Hispanic population living in the San Luis Valley (SLV) in Southern Colorado. We used a set of 22 Ancestry Informative Markers (AIMs) to describe the admixture process and dynamics in this population. AIMs are defined as genetic markers that exhibit allele frequency differences between parental populations ≥30%, and are more informative for studying admixed populations than random markers. The ancestral proportions of the SLV Hispanic population are estimated as 62.7 ± 2.1% European, 34.1 ± 1.9% Native American and 3.2 ± 1.5% West African. We also estimated the ancestral proportions of individuals using these AIMs. Population structure was demonstrated by the excess association of unlinked markers, the correlation between estimates of admixture based on unlinked marker sets, and by a highly significant correlation between individual Native American ancestry and skin pigmentation (R² = 0.082, p < 0.001). We discuss the implications of these findings in disease gene mapping efforts.
Affiliation(s)
- C Bonilla
- National Human Genome Center, Howard University, Washington, DC 20060, USA.
|
382
|
Hinds DA, Stokowski RP, Patil N, Konvicka K, Kershenobich D, Cox DR, Ballinger DG. Matching strategies for genetic association studies in structured populations. Am J Hum Genet 2004; 74:317-25. [PMID: 14740319 PMCID: PMC1181929 DOI: 10.1086/381716] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.5] [Received: 08/29/2003] [Accepted: 11/26/2003] [Indexed: 11/03/2022] Open
Abstract
Association studies in populations that are genetically heterogeneous can yield large numbers of spurious associations if population subgroups are unequally represented among cases and controls. This problem is particularly acute for studies involving pooled genotyping of very large numbers of single-nucleotide-polymorphism (SNP) markers, because most methods for analysis of association in structured populations require individual genotyping data. In this study, we present several strategies for matching case and control pools to have similar genetic compositions, based on ancestry information inferred from genotype data for approximately 300 SNPs tiled on an oligonucleotide-based genotyping array. We also discuss methods for measuring the impact of population stratification on an association study. Results for an admixed population and a phenotype strongly confounded with ancestry show that these simple matching strategies can effectively mitigate the impact of population stratification.
|
383
|
Tian XL, Kadaba R, You SA, Liu M, Timur AA, Yang L, Chen Q, Szafranski P, Rao S, Wu L, Housman DE, DiCorleto PE, Driscoll DJ, Borrow J, Wang Q. Identification of an angiogenic factor that when mutated causes susceptibility to Klippel–Trenaunay syndrome. Nature 2004; 427:640-5. [PMID: 14961121 PMCID: PMC1618873 DOI: 10.1038/nature02320] [Received: 09/26/2003] [Accepted: 12/23/2003]
Abstract
Angiogenic factors are critical to the initiation of angiogenesis and maintenance of the vascular network. Here we use human genetics as an approach to identify an angiogenic factor, VG5Q, and further define two genetic defects of VG5Q in patients with the vascular disease Klippel-Trenaunay syndrome (KTS). One mutation is chromosomal translocation t(5;11), which increases VG5Q transcription. The second is mutation E133K identified in five KTS patients, but not in 200 matched controls. VG5Q protein acts as a potent angiogenic factor in promoting angiogenesis, and suppression of VG5Q expression inhibits vessel formation. E133K is a functional mutation that substantially enhances the angiogenic effect of VG5Q. VG5Q shows strong expression in blood vessels and is secreted as vessel formation is initiated. VG5Q can bind to endothelial cells and promote cell proliferation, suggesting that it may act in an autocrine fashion. We also demonstrate a direct interaction of VG5Q with another secreted angiogenic factor, TWEAK (also known as TNFSF12). These results define VG5Q as an angiogenic factor, establish VG5Q as a susceptibility gene for KTS, and show that increased angiogenesis is a molecular pathogenic mechanism of KTS.
Affiliation(s)
- Xiao-Li Tian
- Center for Molecular Genetics, Department of Molecular Cardiology, Lerner Research Institute, and Center for Cardiovascular Genetics, Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, and Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio 44195, USA
- Rajkumar Kadaba
- Center for Molecular Genetics, Department of Molecular Cardiology, Lerner Research Institute, and Center for Cardiovascular Genetics, Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, and Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio 44195, USA
- Sun-Ah You
- Center for Molecular Genetics, Department of Molecular Cardiology, Lerner Research Institute, and Center for Cardiovascular Genetics, Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, and Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio 44195, USA
- Mugen Liu
- Center for Molecular Genetics, Department of Molecular Cardiology, Lerner Research Institute, and Center for Cardiovascular Genetics, Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, and Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio 44195, USA
- Institute of Genetics, Fudan University, Shanghai 200433, China
- Ayse Anil Timur
- Center for Molecular Genetics, Department of Molecular Cardiology, Lerner Research Institute, and Center for Cardiovascular Genetics, Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, and Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio 44195, USA
- Qiuyun Chen
- Cole Eye Institute, The Cleveland Clinic Foundation, Cleveland, Ohio 44195, USA
- Shaoqi Rao
- Center for Molecular Genetics, Department of Molecular Cardiology, Lerner Research Institute, and Center for Cardiovascular Genetics, Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, and Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio 44195, USA
- Ling Wu
- Center for Molecular Genetics, Department of Molecular Cardiology, Lerner Research Institute, and Center for Cardiovascular Genetics, Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, and Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio 44195, USA
- David E. Housman
- Center for Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- David J. Driscoll
- Division of Pediatric Cardiology, Mayo Clinic, Rochester, Minnesota 55905, USA
- Julian Borrow
- Center for Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Qing Wang
- Center for Molecular Genetics, Department of Molecular Cardiology, Lerner Research Institute, and Center for Cardiovascular Genetics, Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, and Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio 44195, USA
- Correspondence and requests for materials should be addressed to Q.W. (). The GenBank accession numbers are AY500994 for human VG5Q (hVG5Q) mRNA and amino acid sequences; AY500995 for mouse VG5Q (mVG5Q) mRNA and amino acid sequences; and AY500996 for human VG5Q genomic DNA sequence
|
384
|
Abstract
Family-based association studies have gained in popularity for mapping disease-susceptibility gene(s) of complex diseases. However, recruiting family controls is often more difficult than recruiting unrelated controls. The author proposes a case-control study, where the possible biases due to population stratification are controlled by matching in the design stage and by genomic controlling in the data-analytic stage. The matching is based on a set of "stratum-delineating variables," such as race, ethnicity, nationality, ancestry, and birthplace; and the genomic controlling is based on typing a number of null markers across the genome and applying the principle of multiplicative scaling of the chi-square distribution. It pays to match carefully to have a higher proportion of correctly matched sets, as computer simulation showed that this would increase the power of the study. If matching is crude, one loses power but still has the correct type I error rate after genomic controlling. Power studies showed that the numbers of affected subjects required for the pair-matched study are comparable to those required by the case-parents design, if the study was conducted in a homogeneous population. As the (control-to-case) matching ratio increases, the number of affected subjects required decreases. With the matching ratio tending toward infinity, the number required shrinks roughly by half. The case-control study with matching and genomic controlling frees us from family bondage, and a genetic problem as complicated as mapping genes can now be studied using simple epidemiologic methods.
Affiliation(s)
- Wen-Chung Lee
- Graduate Institute of Epidemiology, College of Public Health, National Taiwan University, Taipei, Taiwan.
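The "multiplicative scaling of chi-square" step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common median-based form of genomic control, which may differ from the estimator the author actually used; the function names and the 1-df test statistics are hypothetical and chosen only for illustration.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(null_chi2):
    """Estimate the inflation factor (lambda) from null-marker statistics.

    Assumption: lambda is the observed median divided by the theoretical
    1-df median, floored at 1 so that statistics are never inflated.
    """
    return max(1.0, median(null_chi2) / CHI2_1DF_MEDIAN)


def genomic_control(candidate_chi2, null_chi2):
    """Rescale candidate-marker statistics by the estimated lambda."""
    lam = inflation_factor(null_chi2)
    return [stat / lam for stat in candidate_chi2]
```

For example, if the null markers have a median statistic twice the theoretical value, every candidate statistic is halved before it is compared with the chi-square reference distribution.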
|
385
|
Rosenberg NA, Li LM, Ward R, Pritchard JK. Informativeness of genetic markers for inference of ancestry. Am J Hum Genet 2003; 73:1402-22. [PMID: 14631557 PMCID: PMC1180403 DOI: 10.1086/380416] [Received: 05/29/2003] [Accepted: 10/02/2003]
Abstract
Inference of individual ancestry is useful in various applications, such as admixture mapping and structured-association mapping. Using information-theoretic principles, we introduce a general measure, the informativeness for assignment (I_n), applicable to any number of potential source populations, for determining the amount of information that multiallelic markers provide about individual ancestry. In a worldwide human microsatellite data set, we identify markers of highest informativeness for inference of regional ancestry and for inference of population ancestry within regions; these markers, which are listed in online-only tables in our article, can be useful both in testing for and in controlling the influence of ancestry on case-control genetic association studies. Markers that are informative in one collection of source populations are generally informative in others. Informativeness of random dinucleotides, the most informative class of microsatellites, is five to eight times that of random single-nucleotide polymorphisms (SNPs), but 2%-12% of SNPs have higher informativeness than the median for dinucleotides. Our results can aid in decisions about the type, quantity, and specific choice of markers for use in studies of ancestry.
Affiliation(s)
- Noah A Rosenberg
- Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA.
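As an illustration of the informativeness-for-assignment measure named in the abstract above, the sketch below computes it for a single marker. The abstract does not state the formula; the code assumes the usual mutual-information form, I_n = sum over alleles j of [ -p_j ln p_j + sum over populations i of (p_ij / K) ln p_ij ], where p_j is the mean frequency of allele j across the K equally weighted source populations. The function name and input layout are hypothetical.

```python
import math


def informativeness_for_assignment(freqs):
    """Informativeness for assignment of one marker, in nats.

    freqs[i][j] is the frequency of allele j in source population i.
    Assumption: populations are weighted equally (prior 1/K each).
    """
    k = len(freqs)
    n_alleles = len(freqs[0])
    total = 0.0
    for j in range(n_alleles):
        mean_freq = sum(pop[j] for pop in freqs) / k
        if mean_freq > 0:
            total -= mean_freq * math.log(mean_freq)
        for pop in freqs:
            if pop[j] > 0:  # treat 0 * log(0) as 0
                total += (pop[j] / k) * math.log(pop[j])
    return total
```

Under this form the value ranges from 0, when all populations share the same allele frequencies, up to ln K, when no allele is shared between populations.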
|
386
|
Yonan AL, Palmer AA, Smith KC, Feldman I, Lee HK, Yonan JM, Fischer SG, Pavlidis P, Gilliam TC. Bioinformatic analysis of autism positional candidate genes using biological databases and computational gene network prediction. Genes Brain Behav 2003; 2:303-20. [PMID: 14606695 DOI: 10.1034/j.1601-183x.2003.00041.x]
Abstract
Common genetic disorders are believed to arise from the combined effects of multiple inherited genetic variants acting in concert with environmental factors, such that any given DNA sequence variant may have only a marginal effect on disease outcome. As a consequence, the correlation between disease status and any given DNA marker allele in a genomewide linkage study tends to be relatively weak and the implicated regions typically encompass hundreds of positional candidate genes. Therefore, new strategies are needed to parse relatively large sets of 'positional' candidate genes in search of actual disease-related gene variants. Here we use biological databases to identify 383 positional candidate genes predicted by genomewide genetic linkage analysis of a large set of families, each with two or more members diagnosed with autism, or autism spectrum disorder (ASD). Next, we seek to identify a subset of biologically meaningful, high priority candidates. The strategy is to select autism candidate genes based on prior genetic evidence from the allelic association literature to query the known transcripts within the 1-LOD (logarithm of the odds) support interval for each region. We use recently developed bioinformatic programs that automatically search the biological literature to predict pathways of interacting genes (PATHWAYASSIST and GENEWAYS). To identify gene regulatory networks, we search for coexpression between candidate genes and positional candidates. The studies are intended both to inform studies of autism, and to illustrate and explore the increasing potential of bioinformatic approaches as a complement to linkage analysis.
Affiliation(s)
- A L Yonan
- Columbia Genome Center, Columbia University, New York, NY 10032, USA
|
387
|
Page GP, George V, Go RC, Page PZ, Allison DB. "Are we there yet?": Deciding when one has demonstrated specific genetic causation in complex diseases and quantitative traits. Am J Hum Genet 2003; 73:711-9. [PMID: 13680525 PMCID: PMC1180596 DOI: 10.1086/378900] [Received: 08/05/2003] [Accepted: 08/06/2003]
Abstract
Although mathematical relationships can be proven by deductive logic, biological relationships can only be inferred from empirical observations. This is a distinct disadvantage for those of us who strive to identify the genes involved in complex diseases and quantitative traits. If causation cannot be proven, however, what does constitute sufficient evidence for causation? The philosopher Karl Popper said, "Our belief in a hypothesis can have no stronger basis than our repeated unsuccessful critical attempts to refute it." We believe that to establish causation, as scientists, we must make a serious attempt to refute our own hypotheses and to eliminate all known sources of bias before association becomes causation. In addition, we suggest that investigators must provide sufficient data and evidence of their unsuccessful efforts to find any confounding biases. In this editorial, we discuss what "causation" means in the context of complex diseases and quantitative traits, and we suggest guidelines for steps that may be taken to address possible confounders of association before polymorphisms may be called "causative."
Affiliation(s)
- Grier P. Page
- Section on Statistical Genetics, Department of Biostatistics, Division of Rheumatology, Department of Medicine, Departments of Epidemiology and Genetics, and Clinical Nutrition Research Center, University of Alabama at Birmingham, Birmingham
- Varghese George
- Section on Statistical Genetics, Department of Biostatistics, Division of Rheumatology, Department of Medicine, Departments of Epidemiology and Genetics, and Clinical Nutrition Research Center, University of Alabama at Birmingham, Birmingham
- Rodney C. Go
- Section on Statistical Genetics, Department of Biostatistics, Division of Rheumatology, Department of Medicine, Departments of Epidemiology and Genetics, and Clinical Nutrition Research Center, University of Alabama at Birmingham, Birmingham
- Patricia Z. Page
- Section on Statistical Genetics, Department of Biostatistics, Division of Rheumatology, Department of Medicine, Departments of Epidemiology and Genetics, and Clinical Nutrition Research Center, University of Alabama at Birmingham, Birmingham
- David B. Allison
- Section on Statistical Genetics, Department of Biostatistics, Division of Rheumatology, Department of Medicine, Departments of Epidemiology and Genetics, and Clinical Nutrition Research Center, University of Alabama at Birmingham, Birmingham
|
388
|
Abstract
Population substructure and recent admixture may confound the results of genetic association studies in unrelated individuals, leading to a potential excess of both false positive and false negative results. The possibility of false associations depends on the population sampled, the trait being studied and the marker being tested. Although family based tests of association avoid the possibility of confounding due to population substructure and admixture, association studies in unrelated individuals may be preferred in many situations due to their feasibility. Unlinked genetic markers may be used to detect confounding in association studies. In addition, the information from unlinked markers may be used to adjust genetic associations.
Affiliation(s)
- Elad Ziv
- Division of General Internal Medicine and Division of Pulmonary Medicine, Department of Medicine, University of California, San Francisco, CA 94115, USA.
|