1
|
Abstract
A haplotype is a string of nucleotides or alleles at nearby loci on one chromosome, usually inherited as a unit. Within the major histocompatibility complex (MHC) region on human chromosome 6p, independent population studies of multiple families have identified conserved extended haplotypes (CEHs) that segregate as long stretches (≥1 megabase) of essentially identical DNA sequence at relatively high (≥0.5 %) population frequency ("genetic fixity"). CEHs were first identified through segregation analysis in the early 1980s. In European Caucasian populations, the most frequent 30 CEHs account for at least one-third of all MHC haplotypes. These CEHs provide all of the known individual MHC susceptibility and protective genetic markers within those populations for several complex genetic diseases. Haplotypes are rigorously determined directly by sequencing single chromosomes or by Mendelian segregation analysis using families with informative genotypes. Four parental haplotypes are assigned unambiguously using genotypes from the two parents and from two of their haploidentical (to each other) children. However, the most common current technique to phase haplotypes is probabilistic statistical imputation, using unrelated subjects. Such probabilistic techniques have failed to detect CEHs and are thus of questionable value in identifying long-range haplotype structure and, consequently, genetic structure-function relationships. Finally, with haplotypes rigorously defined, association studies can determine frequencies of alleles among unrelated patient haplotypes vs. those among only unaffected family members (i.e., control alleles/haplotypes). Such studies reduce, as much as possible, the confounding effects of population stratification common to all genetic studies.
Affiliation(s)
- Chester A Alper
- Program in Cellular and Molecular Medicine, Boston Children's Hospital, CLS_03, 3 Blackfan Circle, Boston, MA, 02115, USA.
- Department of Pediatrics, Harvard Medical School, 25 Shattuck Street, Boston, MA, 02115, USA.
- Charles E Larsen
- Program in Cellular and Molecular Medicine, Boston Children's Hospital, CLS_03, 3 Blackfan Circle, Boston, MA, 02115, USA
- Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA, 02115, USA
|
2
|
Greenberg DA, Stewart WCL. How should we be searching for genes for common epilepsy? A critique and a prescription. Epilepsia 2012; 53 Suppl 4:72-80. [PMID: 22946724 DOI: 10.1111/j.1528-1167.2012.03616.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/18/2022]
Abstract
Despite enormous data collection and analysis efforts, the genetic influences on common epilepsies remain mostly unknown. We propose that reasons for the lack of progress can be traced to three factors: (1) A reluctance to consider fine-grained phenotype definitions based on extensive and carefully collected clinical data; (2) the pursuit of genetic analysis methods that are popular but poorly conceived and are inadequate to the task of resolving the problems inherent in common disease studies; (3) preconceived ideas about the genetic mechanisms that cause epilepsy (which we have discussed elsewhere). We propose a paradigm for finding epilepsy-related loci and alleles that has proven successful in other common diseases.
Affiliation(s)
- David A Greenberg
- Battelle Center for Mathematical Medicine, Nationwide Children's Hospital Research Institute, Columbus, Ohio 43215, USA.
|
3
|
Williams KY, Yoo YJ, Patki A, Allison DB. Real data examples in statistical methods papers: Tremendously valuable, and also tremendously misvalued. Stat Interface 2011; 4:267-272. [PMID: 22132253 PMCID: PMC3225256 DOI: 10.4310/sii.2011.v4.n3.a1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/31/2023]
Abstract
When a statistical methods paper is submitted to a journal for publication, examples in which the method is applied to real data are highly encouraged by many journals and in some cases are explicitly demanded. In this commentary, we argue that real data examples serve several useful purposes. However, we also argue that in many cases, particularly in the fields of genetics and genomics, there is an implicit or explicit expectation for examples to support purposes for which they are ill-suited and furthermore that these inappropriate expectations have negative consequences for the field. We conclude by noting that real data examples can be tremendously valuable and should continue to be used where appropriate, but that the demands for, expectations of, and conclusions drawn from them need to be scaled back.
Affiliation(s)
- K. Y. Williams
- Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Yun Joo Yoo
- Department of Mathematics Education, Seoul National University, Kwanak-ro 1, Kwanak-ku, Seoul, South Korea
- Amit Patki
- Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
- David B. Allison
- Corresponding author: Ryals Public Health Building, Suite 327, 1665 University Boulevard, Birmingham, Alabama 35294, USA, Tel: 205-975-9169, Fax: 205-975-2540.
|
4
|
Greenberg DA, Subaran R. Blinders, phenotype, and fashionable genetic analysis: a critical examination of the current state of epilepsy genetic studies. Epilepsia 2011; 52:1-9. [PMID: 21219301 PMCID: PMC3021750 DOI: 10.1111/j.1528-1167.2010.02734.x] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 11/26/2022]
Abstract
Although it is accepted that idiopathic generalized epilepsy (IGE) is strongly, if not exclusively, influenced by genetic factors, there is little consensus on what those genetic influences may be, except for one point of agreement: epilepsy is a "channelopathy." This point of agreement has continued despite the failure of studies investigating channel genes to demonstrate the primacy of their influence on IGE expression. The belief is sufficiently entrenched that the more important issues involving phenotype definition, data collection, methods of analysis, and the interpretation of results have become subordinate to it. The goal of this article is to spark discussion of where the study of epilepsy genetics has been and where it is going, suggesting we may never get there if we continue on the current road. We use the long history of psychiatric genetic studies as a mirror and starting point to illustrate that only when we expand our outlook on how to study the genetics of the epilepsies, consider other mechanisms that could lead to epilepsy susceptibility, and, especially, focus on the critical problem of phenotype definition, will the major influences on common epilepsy begin to be understood.
Affiliation(s)
- David A Greenberg
- Division of Statistical Genetics, Department of Biostatistics, Mailman School of Public Health, New York State Psychiatric Institute, Columbia University Medical Center, New York, New York, USA.
|
5
|
Greenberg DA, Subaran R. Response to comments on the paper “Blinders, phenotype, and fashionable genetic analysis: A critical examination of the current state of epilepsy genetic studies”. Epilepsia 2011. [DOI: 10.1111/j.1528-1167.2010.02944.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
6
|
Pal DK. Fashions come and go. Epilepsia 2011; 52:191-2; discussion 193-6. [DOI: 10.1111/j.1528-1167.2010.02901.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
7
|
Xu H, George V. Assessment of Population Structure and Its Effects on Genome-Wide Association Studies. Commun Stat Theory Methods 2009; 38:2843-2855. [PMID: 19777146 PMCID: PMC2748884 DOI: 10.1080/03610920902947188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/20/2022]
Abstract
Large-scale genome-wide association studies are promising for unraveling the genetic basis of complex diseases. However, population structure is a potential problem, the effects of which on genetic association studies are controversial. Quantification of the effects of population structure on large-scale genetic association studies is needed for valid analysis of data and correct interpretation of results. In this study, we performed an extensive coalescent-based simulation study with varying levels of population structure to investigate the effects of population structure on large-scale genetic association studies. The effects of population structure are measured by the multiplicative changes of the probability of type I error, which is then correlated with the levels of population structure. It is found that at each nominal level of association tests, there is a positive relationship between the level of population structure and its effects, which could be summarized well with a regression function. It is also found that at a specific level of population structure, its effect on association studies increases drastically as the significance level of the test decreases. The type I error is inflated by an amount approximately equal to Wright's F(ST), a measure that is used to quantify the magnitude of population structure. Therefore, in genome-wide association studies, the effects of population structure cannot be safely ignored, and must be accounted for with proper methods. This study provides quantitative guidelines to account for the effects of population structure on genome-wide association studies in admixed populations.
Affiliation(s)
- Hongyan Xu
- Department of Biostatistics, Medical College of Georgia, Augusta, Georgia, USA
|
8
|
Kraemer HC, Shrout PE, Rubio-Stipec M. Developing the diagnostic and statistical manual V: what will "statistical" mean in DSM-V? Soc Psychiatry Psychiatr Epidemiol 2007; 42:259-67. [PMID: 17334899 DOI: 10.1007/s00127-007-0163-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Accepted: 01/17/2007] [Indexed: 10/23/2022]
Abstract
In February of 2004, the American Psychiatric Institute for Research and Education (APIRE) hosted a Launch and Methodology Conference to discuss the role statistics might play in the eventual revision of the Fourth Edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) and the Ninth Edition of the International Classification of Diseases (ICD9). The conference consisted of talks on specific topics by statisticians and epidemiologists from North America and Great Britain, followed by group discussion by experts in nosology and psychopathology. We report here on the development of specific themes related to the future interaction between statisticians and nosologists in DSM-V development that arose as a result of that meeting. The themes are related to (1) the nature of the statistician/nosologist interaction; (2) specific areas of concern in that interaction, and (3) the use of novel and complex statistical methods to challenge and inspire new avenues of thinking among nosologists.
Affiliation(s)
- Helena Chmura Kraemer
- Dept. of Psychiatry and Behavioral Sciences, Stanford University, 401 Quarry Road, MS 5717, Stanford, CA 94305, USA.
|
9
|
Fyer AJ, Hamilton SP, Durner M, Haghighi F, Heiman GA, Costa R, Evgrafov O, Adams P, de Leon AB, Taveras N, Klein DF, Hodge SE, Weissman MM, Knowles JA. A third-pass genome scan in panic disorder: evidence for multiple susceptibility loci. Biol Psychiatry 2006; 60:388-401. [PMID: 16919526 DOI: 10.1016/j.biopsych.2006.04.018] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Received: 02/08/2006] [Revised: 04/24/2006] [Accepted: 04/25/2006] [Indexed: 01/20/2023]
Abstract
BACKGROUND: Panic disorder (PD) is a common illness with a definite but "complex" genetic contribution and estimated heritability of 30-46%. METHODS: We report a genome scan in 120 multiplex PD pedigrees consisting of 1591 individuals of whom 992 were genotyped with 371 markers at an average spacing of 9 cM. Parametric two-point, multipoint, and nonparametric analyses were performed using three PD models (Broad, Intermediate, Narrow) and allowing for homogeneity or heterogeneity. The two-point analyses were also performed allowing for independent male and female recombination fractions (theta). Genome-wide significance was empirically evaluated using simulations of this dataset. RESULTS: Evidence for linkage reached genome-wide significance in one region on chromosome 15q (near GABA-A receptor subunit genes) and was suggestive at loci on 2p, 2q and 9p using an averaged theta. Analyses allowing for sex-specific thetas were consistent except that support at one locus on 2q increased to genome-wide significance and an additional region of suggestive linkage on 12q was identified. However, differences in male and female recombination fractions predicted by the sex-specific approach were not consistent with current physical maps. CONCLUSIONS: These data provide evidence for chromosomal regions on 15q and 2q that may be important in genetic susceptibility to panic disorder. Although we are encouraged by the findings of analyses using sex-specific recombination fractions, we also note that further understanding of this analytic strategy will be important.
Affiliation(s)
- Abby J Fyer
- Department of Psychiatry, College of Physicians and Surgeons, Columbia University and New York State Psychiatric Institute, New York, New York 10032, USA.
|
10
|
Brown CM, Rea TJ, Hamon SC, Hixson JE, Boerwinkle E, Clark AG, Sing CF. The contribution of individual and pairwise combinations of SNPs in the APOA1 and APOC3 genes to interindividual HDL-C variability. J Mol Med (Berl) 2006; 84:561-72. [PMID: 16705465 PMCID: PMC1698872 DOI: 10.1007/s00109-005-0037-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 07/22/2005] [Accepted: 11/17/2005] [Indexed: 02/05/2023]
Abstract
Apolipoproteins (apo) A-I and C-III are components of high-density lipoprotein-cholesterol (HDL-C), a quantitative trait negatively correlated with risk of cardiovascular disease (CVD). We analyzed the contribution of individual and pairwise combinations of single nucleotide polymorphisms (SNPs) in the APOA1/APOC3 genes to HDL-C variability to evaluate (1) consistency of published single-SNP studies with our single-SNP analyses; (2) consistency of single-SNP and two-SNP phenotype-genotype relationships across race-, gender-, and geographical location-dependent contexts; and (3) the contribution of single SNPs and pairs of SNPs to variability beyond that explained by plasma apo A-I concentration. We analyzed 45 SNPs in 3,831 young African-American (N=1,858) and European-American (N=1,973) females and males ascertained by the Coronary Artery Risk Development in Young Adults (CARDIA) study. We found three SNPs that significantly impact HDL-C variability in both the literature and the CARDIA sample. Single-SNP analyses identified only one of five significant HDL-C SNP genotype relationships in the CARDIA study that was consistent across all race-, gender-, and geographical location-dependent contexts. The other four were consistent across geographical locations for a particular race-gender context. The portion of total phenotypic variance explained by single-SNP genotypes and genotypes defined by pairs of SNPs was less than 3%, an amount that is minuscule compared to the contribution explained by variability in plasma apo A-I concentration. Our findings illustrate the impact of context-dependence on SNP selection for prediction of CVD risk factor variability.
Affiliation(s)
- C. M. Brown
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- T. J. Rea
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- S. C. Hamon
- Laboratory of Statistical Genetics, Rockefeller University, New York, NY 10021, USA
- J. E. Hixson
- Human Genetics Center, University of Texas Health Science Center, Houston, TX 77030, USA
- E. Boerwinkle
- Human Genetics Center, University of Texas Health Science Center, Houston, TX 77030, USA
- A. G. Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- C. F. Sing
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
|
11
|
Alper CA, Larsen CE, Dubey DP, Awdeh ZL, Fici DA, Yunis EJ. The Haplotype Structure of the Human Major Histocompatibility Complex. Hum Immunol 2006; 67:73-84. [PMID: 16698428 DOI: 10.1016/j.humimm.2005.11.006] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.2] [Received: 08/01/2005] [Revised: 11/17/2005] [Accepted: 11/22/2005] [Indexed: 11/17/2022]
Abstract
There is great interest in the use of single-nucleotide polymorphisms (SNPs) and linkage disequilibrium (LD) analysis to localize human disease genes. The results suggest that the human genome, including the major histocompatibility complex (MHC), consists largely of 5- to 200-kb blocks of sequence fixity between which random recombination occurs. Direct determination of MHC haplotypes from family studies also demonstrates similar-sized blocks, but otherwise gives a very different picture, with a third to a half of Caucasian haplotypes fixed from HLA-B to HLA-DR/DQ (at least 1 Mb) as conserved extended haplotypes (CEHs), some of which encompass more than 3 Mb. These fixed haplotypes differ in frequency both in different Caucasian subpopulations and in Caucasian patients with HLA-associated diseases, complicating disease susceptibility gene localization. The inherent inability of LD analysis to "see" DNA fixity beyond three markers contributes to the failure of SNP/LD analysis to define in detail or even detect CEHs in the MHC and probably elsewhere in the genome. More importantly, the use of statistical analysis, rather than direct haplotype determination and counting, fails to reveal the details of haplotype structure essential for gene localization. Given the oversimplified picture of the MHC (and probably the rest of the genome) provided only by SNP/LD-defined blocks, it is questionable whether this approach will be of great help in disease susceptibility gene localization or identification.
Affiliation(s)
- Chester A Alper
- CBR Institute for Biomedical Research, and Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA.
|
12
|
Heiman GA, Hodge SE, Gorroochurn P, Zhang J, Greenberg DA. Effect of population stratification on case-control association studies. I. Elevation in false positive rates and comparison to confounding risk ratios (a simulation study). Hum Hered 2005; 58:30-9. [PMID: 15604562 DOI: 10.1159/000081454] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Received: 04/20/2004] [Accepted: 08/06/2004] [Indexed: 11/19/2022]
Abstract
OBJECTIVES: This is the first of two articles discussing the effect of population stratification on the type I error rate (i.e., false positive rate). This paper focuses on the confounding risk ratio (CRR). It is accepted that population stratification (PS) can produce false positive results in case-control genetic association. However, which values of population parameters lead to an increase in type I error rate is unknown. Some believe PS does not represent a serious concern, whereas others believe that PS may contribute to contradictory findings in genetic association. We used computer simulations to estimate the effect of PS on type I error rate over a wide range of disease frequencies and marker allele frequencies, and we compared the observed type I error rate to the magnitude of the confounding risk ratio. METHODS: We simulated two populations and mixed them to produce a combined population, specifying 160 different combinations of input parameters (disease prevalences and marker allele frequencies in the two populations). From the combined populations, we selected 5000 case-control datasets, each with either 50, 100, or 300 cases and controls, and determined the type I error rate. In all simulations, the marker allele and disease were independent (i.e., no association). RESULTS: The type I error rate is not substantially affected by changes in the disease prevalence per se. We found that the CRR provides a relatively poor indicator of the magnitude of the increase in type I error rate. We also derived a simple mathematical quantity, Delta, that is highly correlated with the type I error rate. In the companion article (part II, in this issue), we extend this work to multiple subpopulations and unequal sampling proportions. CONCLUSION: Based on these results, realistic combinations of disease prevalences and marker allele frequencies can substantially increase the probability of finding false evidence of marker disease associations. Furthermore, the CRR does not indicate when this will occur.
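The simulation design described in this abstract can be illustrated with a small sketch. It is not the authors' code: the equal mixing of the two populations, Hardy-Weinberg sampling of two alleles per person, the allele-count chi-square test at the 0.05 level, and all function and parameter names are assumptions made here for illustration. The marker is independent of disease within each population, so any rejection is a false positive.

```python
import random

CHI2_CRIT_1DF_05 = 3.841  # chi-square critical value for 1 df at alpha = 0.05


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def sample_allele_count(rng, n, prob_pop1, freq1, freq2):
    """Count marker alleles among n people drawn from the mixed population."""
    count = 0
    for _ in range(n):
        # Pick the person's source population, then draw two alleles from it.
        freq = freq1 if rng.random() < prob_pop1 else freq2
        count += (rng.random() < freq) + (rng.random() < freq)
    return count


def type1_error(prev1, prev2, freq1, freq2, n=300, reps=500, seed=1):
    """Estimate the false positive rate with n cases and n controls.

    Assumes the two populations are mixed in equal proportions (hypothetical).
    """
    rng = random.Random(seed)
    # Probability that a case (or a control) comes from population 1.
    p_case = prev1 / (prev1 + prev2)
    p_ctrl = (1 - prev1) / ((1 - prev1) + (1 - prev2))
    hits = 0
    for _ in range(reps):
        a = sample_allele_count(rng, n, p_case, freq1, freq2)
        c = sample_allele_count(rng, n, p_ctrl, freq1, freq2)
        if chi2_2x2(a, 2 * n - a, c, 2 * n - c) > CHI2_CRIT_1DF_05:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Populations differ in both prevalence and allele frequency.
    print("stratified: ", type1_error(0.05, 0.20, 0.2, 0.6))
    # Identical populations: the rate should sit near the nominal 0.05.
    print("homogeneous:", type1_error(0.10, 0.10, 0.4, 0.4))
```

The example parameter values are deliberately extreme so the inflation is easy to see; they are not taken from the paper's 160 parameter combinations.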
Affiliation(s)
- Gary A Heiman
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY 10032, USA.
|
13
|
Abstract
This article reconsiders the issue of clinical versus statistical prediction. The term clinical is widely used to denote 1 pole of 2 independent axes: the observer whose data are being aggregated (clinician/expert vs. lay) and the method of aggregating those data (impressionistic vs. statistical). Fifty years of research suggests that when formulas are available, statistical aggregation outperforms informal, subjective aggregation much of the time. However, these data have little bearing on the question of whether, or under what conditions, clinicians can make reliable and valid observations and inferences at a level of generality relevant to practice or useful as data to be aggregated statistically. An emerging body of research suggests that clinical observations, just like lay observations, can be quantified using standard psychometric procedures, so that clinical description becomes statistical prediction.
|
14
|
Mehta T, Tanik M, Allison DB. Towards sound epistemological foundations of statistical methods for high-dimensional biology. Nat Genet 2004; 36:943-7. [PMID: 15340433 DOI: 10.1038/ng1422] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.2] [Received: 08/05/2004] [Accepted: 08/09/2004] [Indexed: 11/10/2022]
Abstract
A sound epistemological foundation for biological inquiry comes, in part, from application of valid statistical procedures. This tenet is widely appreciated by scientists studying the new realm of high-dimensional biology, or 'omic' research, which involves multiplicity at unprecedented scales. Many papers aimed at the high-dimensional biology community describe the development or application of statistical techniques. The validity of many of these is questionable, and a shared understanding about the epistemological foundations of the statistical methods themselves seems to be lacking. Here we offer a framework in which the epistemological foundation of proposed statistical methods can be evaluated.
Affiliation(s)
- Tapan Mehta
- Department of Biostatistics, Section on Statistical Genetics, Ryals Public Health Building, Suite 327, University of Alabama at Birmingham, 1665 University Boulevard, Birmingham, Alabama 35294, USA
|
15
|
Gelernter J, Liu X, Hesselbrock V, Page GP, Goddard A, Zhang H. Results of a genomewide linkage scan: support for chromosomes 9 and 11 loci increasing risk for cigarette smoking. Am J Med Genet B Neuropsychiatr Genet 2004; 128B:94-101. [PMID: 15211640 DOI: 10.1002/ajmg.b.30019] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.7] [Indexed: 11/09/2022]
Abstract
Cigarette smoking is highly destructive to individuals and society, and is moderately heritable. We completed a genomewide linkage scan to map loci increasing risk for cigarette smoking in a set of families originally identified because they segregate panic disorder (PD). One hundred forty-two genotyped individuals in a total of 12 families were studied (214 subjects analyzed, including non-genotyped individuals). Of these individuals, 69 were "affected" with habitual cigarette smoking (i.e., they smoked more than one pack per day for at least a year, or at least 1/2 pack per day for at least 10 years), 49 were "unaffected" (i.e., they smoked less than 1/2 pack per day for less than 1 year), and 24 were scored as "unknown." Nine families from the panic series were excluded from these analyses because they lacked multiple affected individuals with habitual cigarette smoking. In an initial genomewide scan, we genotyped a total of 416 markers (398 autosomal, 18 X-chromosome) with an average spacing of less than 10 cM, spanning the genome. Linkage analysis (pairwise, or single-point, and multi-point) was performed using ALLEGRO. An additional 14 markers were genotyped in a high-density panel to follow up on an identified region of interest on chromosome 11p. The three highest multi-point Zlr scores (3.43, 3.04, and 3.01; P = 0.0003, P = 0.0012, and P = 0.0013, respectively), which each reflect "suggestive" evidence for linkage, were observed in multi-point linkage analyses using ALLEGRO on chromosomes 11p and 9, near markers D11S4046, D9S283, and D9S1677, respectively. D11S4046 is in a region where linkage to alcohol dependence and linkage disequilibrium to substance dependence have previously been identified. The chromosome 9 region we identified as possibly linked to cigarette smoking in anxiety families was previously identified as significantly linked to PD in Icelandic pedigrees. We also identified evidence supporting linkage (Zlr score > 2.3, P < 0.01) to regions of chromosomes 14, 16, and X. There was a significant phenotypic association between PD and cigarette smoking (P < 0.001). CONCLUSIONS: We identified evidence for two loci increasing risk for cigarette smoking that map to chromosomes 9 and 11. There is now evidence supporting linkage or association of chromosome 11 markers with alcohol dependence, illegal drug abuse and dependence, and cigarette smoking. Interestingly, one of our most promising linkage regions includes a region previously identified as linked to PD.
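The P values quoted in this abstract are consistent with reading each Zlr score as a one-sided standard normal deviate. That reading is an assumption made here, since the abstract does not say how the P values were obtained, and the function name below is hypothetical. A minimal check of the arithmetic:

```python
import math


def one_sided_normal_p(z):
    """Upper-tail probability of a standard normal deviate z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # Zlr scores reported in the abstract for chromosomes 11p and 9.
    for zlr in (3.43, 3.04, 3.01):
        print(f"Zlr = {zlr:.2f}  ->  P = {one_sided_normal_p(zlr):.4f}")
```

Under the same assumption, the abstract's threshold of Zlr > 2.3 corresponds to a one-sided P of roughly 0.011, in line with the reported "P < 0.01" only approximately.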
Affiliation(s)
- Joel Gelernter
- Department of Psychiatry, Yale University School of Medicine, VA Psychiatry 116A2, 950 Campbell Avenue, West Haven, CT 06516, USA.
|
16
|
Marchini J, Cardon LR, Phillips MS, Donnelly P. The effects of human population structure on large genetic association studies. Nat Genet 2004; 36:512-7. [PMID: 15052271 DOI: 10.1038/ng1337] [Citation(s) in RCA: 596] [Impact Index Per Article: 29.8] [Received: 12/01/2003] [Accepted: 03/12/2004] [Indexed: 01/21/2023]
Abstract
Large-scale association studies hold substantial promise for unraveling the genetic basis of common human diseases. A well-known problem with such studies is the presence of undetected population structure, which can lead to both false positive results and failures to detect genuine associations. Here we examine approximately 15,000 genome-wide single-nucleotide polymorphisms typed in three population groups to assess the consequences of population structure on the coming generation of association studies. The consequences of population structure on association outcomes increase markedly with sample size. For the size of study needed to detect typical genetic effects in common diseases, even the modest levels of population structure within population groups cannot safely be ignored. We also examine one method for correcting for population structure (Genomic Control). Although it often performs well, it may not correct for structure if too few loci are used and may overcorrect in other settings, leading to substantial loss of power. The results of our analysis can guide the design of large-scale association studies.
Affiliation(s)
- Jonathan Marchini
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
|
17
|
|
18
|
Huizinga TWJ, Pisetsky DS, Kimberly RP. Associations, populations, and the truth: Recommendations for genetic association studies in Arthritis & Rheumatism. Arthritis Rheum 2004; 50:2066-71. [PMID: 15248203 DOI: 10.1002/art.20360] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Indexed: 11/05/2022]
|
19
|
Farrall M. Reports of the death of the epistasis model are greatly exaggerated. Am J Hum Genet 2003; 73:1467-8; author reply 1471-3. [PMID: 14655098 PMCID: PMC1180411 DOI: 10.1086/380310] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/03/2022]
|
20
|
Tahri-Daizadeh N, Tregouet DA, Nicaud V, Manuel N, Cambien F, Tiret L. Automated detection of informative combined effects in genetic association studies of complex traits. Genome Res 2003; 13:1952-60. [PMID: 12902385 PMCID: PMC403788 DOI: 10.1101/gr.1254203] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/25/2022]
Abstract
There is a growing body of evidence suggesting that the relationships between gene variability and common disease are more complex than initially thought and require the exploration of the whole polymorphism of candidate genes as well as several genes belonging to biological pathways. When the number of polymorphisms is relatively large and the structure of the relationships among them complex, the use of data mining tools to extract the relevant information is a necessity. Here, we propose an automated method for the detection of informative combined effects (DICE) among several polymorphisms (and nongenetic covariates) within the framework of association studies. The algorithm combines the advantages of the regressive approaches with those of data exploration tools. Importantly, DICE considers the problem of interaction between polymorphisms as an effect of interest and not as a nuisance effect. We illustrate the method with three applications on the relationship between (1) the P-selectin gene and myocardial infarction, (2) the cholesteryl ester transfer protein gene and plasma high-density-lipoprotein cholesterol concentration, and (3) genes of the renin-angiotensin-aldosterone system and myocardial infarction. The applications demonstrated that the method was able to recover results already found using other approaches, but in addition detected biologically sensible effects not previously described.
Affiliation(s)
- Nadia Tahri-Daizadeh
- INSERM U525, Faculté de Médecine, Hôpital Pitié-Salpêtrière, 75634 Paris, France
|