1
Selecting Closely-Linked SNPs Based on Local Epistatic Effects for Haplotype Construction Improves Power of Association Mapping. G3 (Bethesda) 2019; 9:4115-4126. [PMID: 31604824 PMCID: PMC6893203 DOI: 10.1534/g3.119.400451] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 01/14/2023]
Abstract
Genome-wide association studies (GWAS) have gained central importance for the identification of candidate loci underlying complex traits. Single nucleotide polymorphism (SNP) markers are mostly used as genetic variants for the analysis of genotype-phenotype associations in populations, but closely linked SNPs that are grouped into haplotypes are also exploited. The benefit of haplotype-based GWAS approaches vs. SNP-based approaches is still under debate because SNPs in high linkage disequilibrium provide redundant information. To overcome some constraints of the commonly used haplotype-based GWAS in which only consecutive SNPs are considered for haplotype construction, we propose a new method called functional haplotype-based GWAS (FH GWAS). FH GWAS combines SNPs into haplotypes based on the additive and epistatic effects among SNPs; such haplotypes are termed functional haplotypes (FH). As shown by simulation studies, the FH GWAS approach clearly outperformed the SNP-based approach unless the minor allele frequency of the SNPs making up the haplotypes is low and the linkage disequilibrium between them is high. Applying FH GWAS to flowering time in a large Arabidopsis thaliana population with whole-genome sequencing data demonstrated its potential empirically. FH GWAS identified all candidate regions that were detected by the SNP-based and two other haplotype-based GWAS approaches. In addition, a novel region on chromosome 4 was detected only by FH GWAS. Thus, the results of both our simulation and empirical studies demonstrate that FH GWAS is a promising method and superior to the SNP-based approach even if almost complete genotype information is available.
2
Camp NJ, Lin WY, Bigelow A, Burghel GJ, Mosbruger TL, Parry MA, Waller RG, Rigas SH, Tai PY, Berrett K, Rajamanickam V, Cosby R, Brock IW, Jones B, Connley D, Sargent R, Wang G, Factor RE, Bernard PS, Cannon-Albright L, Knight S, Abo R, Werner TL, Reed MWR, Gertz J, Cox A. Discordant Haplotype Sequencing Identifies Functional Variants at the 2q33 Breast Cancer Risk Locus. Cancer Res 2016; 76:1916-25. [PMID: 26795348 PMCID: PMC4873429 DOI: 10.1158/0008-5472.can-15-1629] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/06/2015] [Accepted: 12/31/2015] [Indexed: 12/30/2022]
Abstract
The findings from genome-wide association studies hold enormous potential for novel insight into disease mechanisms. A major challenge in the field is to map these low-risk association signals to their underlying functional sequence variants (FSV). Simple sequence study designs are insufficient, as the vast numbers of statistically comparable variants and a limited knowledge of noncoding regulatory elements complicate prioritization. Furthermore, large sample sizes are typically required for adequate power to identify the initial association signals. One important question is whether similar sample sizes need to be sequenced to identify the FSVs. Here, we present a proof-of-principle example of an extreme discordant design to map FSVs within the 2q33 low-risk breast cancer locus. Our approach employed DNA sequencing of a small number of discordant haplotypes to efficiently identify candidate FSVs. Our results were consistent with those from a 2,000-fold larger, traditional imputation-based fine-mapping study. To prioritize further, we used expression-quantitative trait locus analysis of RNA sequencing from breast tissues, gene regulation annotations from the ENCODE consortium, and functional assays for differential enhancer activities. Notably, we implicate three regulatory variants at 2q33 that target CASP8 (rs3769823, rs3769821 in CASP8, and rs10197246 in ALS2CR12) as functionally relevant. We conclude that nested discordant haplotype sequencing is a promising approach to aid mapping of low-risk association loci. The ability to include more efficient sequencing designs into mapping efforts presents an opportunity for the field to capitalize on the potential of association loci and accelerate translation of association signals to their underlying FSVs. Cancer Res; 76(7); 1916-25. ©2016 AACR.
Affiliation(s)
- Nicola J Camp: University of Utah School of Medicine, Salt Lake City, Utah
- Wei-Yu Lin: Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- Alex Bigelow: University of Utah School of Medicine, Salt Lake City, Utah; University of Utah School of Computing, Salt Lake City, Utah
- George J Burghel: Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- Marina A Parry: Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- Sushilaben H Rigas: Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- Pei-Yi Tai: University of Utah School of Medicine, Salt Lake City, Utah
- Rachel Cosby: University of Utah School of Medicine, Salt Lake City, Utah
- Ian W Brock: Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- Brandt Jones: University of Utah School of Medicine, Salt Lake City, Utah
- Dan Connley: Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- Robert Sargent: University of Utah School of Medicine, Salt Lake City, Utah
- Guoying Wang: University of Utah School of Medicine, Salt Lake City, Utah
- Stacey Knight: University of Utah School of Medicine, Salt Lake City, Utah
- Ryan Abo: University of Utah School of Medicine, Salt Lake City, Utah
- Malcolm W R Reed: Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
- Jason Gertz: University of Utah School of Medicine, Salt Lake City, Utah
- Angela Cox: Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
3
Knüppel S, Meidtner K, Arregui M, Holzhütter HG, Boeing H. Joint Effect of Unlinked Genotypes: Application to Type 2 Diabetes in the EPIC-Potsdam Case-Cohort Study. Ann Hum Genet 2015; 79:253-63. [DOI: 10.1111/ahg.12115] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/31/2014] [Revised: 03/04/2015] [Accepted: 02/24/2015] [Indexed: 02/03/2023]
Affiliation(s)
- Sven Knüppel: Department of Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, 14558 Nuthetal, Germany
- Karina Meidtner: Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, 14558 Nuthetal, Germany
- Maria Arregui: Research Group Cardiovascular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, 14558 Nuthetal, Germany
- Heiner Boeing: Department of Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, 14558 Nuthetal, Germany
4
Knüppel S, Rohde K, Meidtner K, Drogan D, Holzhütter HG, Boeing H, Fisher E. Evaluation of 41 candidate gene variants for obesity in the EPIC-Potsdam cohort by multi-locus stepwise regression. PLoS One 2013; 8:e68941. [PMID: 23874820 PMCID: PMC3709896 DOI: 10.1371/journal.pone.0068941] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 12/30/2012] [Accepted: 06/04/2013] [Indexed: 12/20/2022]
Abstract
OBJECTIVE Obesity has become a leading preventable cause of morbidity and mortality in many parts of the world. It is thought to originate from multiple genetic and environmental determinants. The aim of the current study was to introduce haplotype-based multi-locus stepwise regression (MSR) as a method to investigate combinations of unlinked single nucleotide polymorphisms (SNPs) for obesity phenotypes. METHODS In 2,122 healthy randomly selected men and women of the EPIC-Potsdam cohort, the association between 41 SNPs from 18 obesity-candidate genes and either body mass index (BMI, mean = 25.9 kg/m², SD = 4.1) or waist circumference (WC, mean = 85.2 cm, SD = 12.6) was assessed. Single SNP analyses were done by using linear regression adjusted for age, sex, and other covariates. Subsequently, MSR was applied to search for the 'best' SNP combinations. Combinations were selected according to specific AICc and p-value criteria. Model uncertainty was accounted for by a permutation test. RESULTS The strongest single SNP effects on BMI were found for TBC1D1 rs637797 (β = -0.33, SE = 0.13), FTO rs9939609 (β = 0.28, SE = 0.13), MC4R rs17700144 (β = 0.41, SE = 0.15), and MC4R rs10871777 (β = 0.34, SE = 0.14). All these SNPs showed similar effects on waist circumference. The two 'best' six-SNP combinations for BMI (global p-value = 3.45 × 10⁻⁶ and 6.82 × 10⁻⁶) showed effects ranging from -1.70 (SE = 0.34) to 0.74 kg/m² (SE = 0.21) per allele combination. We selected two six-SNP combinations on waist circumference (global p-value = 7.80 × 10⁻⁶ and 9.76 × 10⁻⁶) with an allele combination effect of -2.96 cm (SE = 0.76) at maximum. Additional adjustment for BMI revealed 15 three-SNP combinations (global p-values ranged from 3.09 × 10⁻⁴ to 1.02 × 10⁻²). However, after carrying out the permutation test all SNP combinations lost significance, indicating that the statistical associations might have occurred by chance.
CONCLUSION MSR provides a tool to search for risk-related SNP combinations of common traits or diseases. However, the search process does not always find meaningful SNP combinations in a dataset.
Affiliation(s)
- Sven Knüppel: Department of Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany
5
Karinen S, Saarinen S, Lehtonen R, Rastas P, Vahteristo P, Aaltonen LA, Hautaniemi S. Rule-based induction method for haplotype comparison and identification of candidate disease loci. Genome Med 2012; 4:21. [PMID: 22429919 PMCID: PMC3446271 DOI: 10.1186/gm320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2011] [Accepted: 03/19/2012] [Indexed: 11/21/2022]
Abstract
There is a need for methods that are able to identify rare variants that cause low or moderate penetrance disease susceptibility. To answer this need, we introduce a rule-based haplotype comparison method, Haplous, which identifies haplotypes within multiple samples from phased genotype data and compares them within and between sample groups. We demonstrate that Haplous is able to accurately identify haplotypes that are identical by descent, exclude common haplotypes in the studied population and select rare haplotypes from the data. Our analysis of three families with multiple individuals affected by lymphoma identified several interesting haplotypes shared by distantly related patients.
Affiliation(s)
- Sirkku Karinen: Research Programs Unit, Genome-Scale Biology, and Institute of Biomedicine, Biochemistry and Developmental Biology, University of Helsinki, Haartmaninkatu 8, Helsinki, FIN-00014, Finland
6
Knüppel S, Esparza-Gordillo J, Marenholz I, Holzhütter HG, Bauerfeind A, Ruether A, Weidinger S, Lee YA, Rohde K. Multi-locus stepwise regression: a haplotype-based algorithm for finding genetic associations applied to atopic dermatitis. BMC Med Genet 2012; 13:8. [PMID: 22284537 PMCID: PMC3398269 DOI: 10.1186/1471-2350-13-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 08/24/2011] [Accepted: 01/27/2012] [Indexed: 11/11/2022]
Abstract
Background Genome-wide association studies (GWAS) provide an increasing number of single nucleotide polymorphisms (SNPs) associated with diseases. Our aim is to exploit those closely spaced SNPs in candidate regions for a deeper analysis of association beyond single SNP analysis, combining the classical stepwise regression approach with haplotype analysis to identify risk haplotypes for complex diseases. Methods Our proposed multi-locus stepwise regression starts with an evaluation of all pair-wise SNP combinations and then extends each SNP combination stepwise by one SNP from the region, carrying out haplotype regression in each step. The best associated haplotype patterns are kept for the next step and must be corrected for multiple testing at the end. These haplotypes should also be replicated in an independent data set. We applied the method to a region of 259 SNPs from the epidermal differentiation complex (EDC) on chromosome 1q21 of a German GWAS using a case control set (1,914 individuals) and to 268 families with at least two affected children as replication. Results A 4-SNP haplotype pattern with high statistical significance in the case control set (p = 4.13 × 10⁻⁷ after Bonferroni correction) could be identified which remained significant in the family set after Bonferroni correction (p = 0.0398). Further analysis revealed that this pattern reflects mainly the effect of the well-known FLG gene; however, a FLG-independent haplotype in case control set (OR = 1.71, 95% CI: 1.32-2.23, p = 5.6 × 10⁻⁵) and family set (OR = 1.68, 95% CI: 1.18-2.38, p = 2.19 × 10⁻³) could be found in addition. Conclusion Our approach is a useful tool for finding allele combinations associated with diseases beyond single SNP analysis in chromosomal candidate regions.
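The search loop described in the Methods above (score all SNP pairs, keep the best, extend each kept combination by one SNP) can be illustrated with a minimal Python sketch. This is not the authors' implementation. As a stand-in for haplotype regression it scores each SNP set by the largest 2x2 chi-square over the haplotype patterns observed on phased chromosomes, and it omits the multiple-testing correction and replication steps the abstract requires. The function names, the `keep` and `max_size` parameters, and the toy data are all hypothetical.

```python
from itertools import combinations

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]]; 0.0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def best_pattern_score(snps, chromosomes, is_case):
    """Largest chi-square over the haplotype patterns seen at SNP indices `snps`.

    Stand-in for haplotype regression: each pattern is tested as
    'carries this pattern' vs 'does not', in cases vs controls.
    """
    patterns = [tuple(ch[i] for i in snps) for ch in chromosomes]
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    best = 0.0
    for pat in set(patterns):
        a = sum(1 for p, s in zip(patterns, is_case) if p == pat and s)
        c = sum(1 for p, s in zip(patterns, is_case) if p == pat and not s)
        best = max(best, chi_square_2x2(a, n_case - a, c, n_ctrl - c))
    return best

def stepwise_search(chromosomes, is_case, max_size=3, keep=2):
    """Return the kept SNP combinations at each size, starting from pairs."""
    n_snps = len(chromosomes[0])

    def score(snps):
        return best_pattern_score(snps, chromosomes, is_case)

    # Step 1: evaluate all pairwise SNP combinations and keep the best ones.
    current = sorted(combinations(range(n_snps), 2), key=score, reverse=True)[:keep]
    history = [current]
    # Later steps: extend every kept combination by one further SNP.
    for _ in range(max_size - 2):
        extended = sorted({tuple(sorted(combo + (s,)))
                           for combo in current
                           for s in range(n_snps) if s not in combo})
        current = sorted(extended, key=score, reverse=True)[:keep]
        history.append(current)
    return history

# Toy phased data (hypothetical): one string per chromosome, one character per SNP.
chromosomes = ["1010", "1110", "1011", "1111",   # case chromosomes
               "1000", "0010", "0100", "0011"]   # control chromosomes
is_case = [True] * 4 + [False] * 4
history = stepwise_search(chromosomes, is_case)
```

In the toy data the pattern "1" at SNP 0 together with "1" at SNP 2 occurs only on case chromosomes, so the pair (0, 2) ranks first among the pairs. In practice the score of the final pattern would still need the correction for multiple testing that the abstract calls for.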
Affiliation(s)
- Sven Knüppel: Max Delbrück Center for Molecular Medicine Berlin-Buch, Berlin, Germany
7
Camp NJ, Parry M, Knight S, Abo R, Elliott G, Rigas SH, Balasubramanian SP, Reed MWR, McBurney H, Latif A, Newman WG, Cannon-Albright LA, Evans DG, Cox A. Fine-mapping CASP8 risk variants in breast cancer. Cancer Epidemiol Biomarkers Prev 2011; 21:176-81. [PMID: 22056502 DOI: 10.1158/1055-9965.epi-11-0845] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
Abstract
BACKGROUND Multiple genome-wide and candidate gene association studies have been conducted in search of common risk variants for breast cancer. Recent large meta analyses, consolidating evidence from these studies, have been consistent in highlighting the caspase-8 (CASP8) gene as important in this regard. To define a risk haplotype and map the CASP8 gene region with respect to underlying susceptibility variant/s, we screened four genes in the CASP8 region on 2q33-q34 for breast cancer risk. METHODS Two independent data sets from the United Kingdom and the United States, including 3,888 breast cancer cases and controls, were genotyped for 45 tagging single nucleotide polymorphisms (tSNP) in the expanded CASP8 region. SNP and haplotype association tests were carried out using Monte Carlo-based methods. RESULTS We identified a three-SNP haplotype across rs3834129, rs6723097, and rs3817578 that was significantly associated with breast cancer (P < 5 × 10⁻⁶), with a dominant risk ratio and 95% CI of 1.28 (1.21-1.35) and frequency of 0.29 in controls. Evidence for this risk haplotype was extremely consistent across the two study sites and also consistent with previous data. CONCLUSION This three-SNP risk haplotype represents the best characterization so far of the chromosome upon which the susceptibility variant resides. IMPACT Characterization of the risk haplotype provides a strong foundation for resequencing efforts to identify the underlying risk variant, which may prove useful for individual-level risk prediction, and provide novel insights into breast carcinogenesis.
Affiliation(s)
- Nicola J Camp: Division of Genetic Epidemiology, University of Utah School of Medicine, Salt Lake City, Utah, USA
8
Abo R, Wong J, Thomas A, Camp NJ. Haplotype association analyses in resources of mixed structure using Monte Carlo testing. BMC Bioinformatics 2010; 11:592. [PMID: 21143908 PMCID: PMC3016409 DOI: 10.1186/1471-2105-11-592] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/01/2010] [Accepted: 12/09/2010] [Indexed: 01/16/2023]
Abstract
Background Genomewide association studies have resulted in a great many genomic regions that are likely to harbor disease genes. Thorough interrogation of these specific regions is the logical next step, including regional haplotype studies to identify risk haplotypes upon which the underlying critical variants lie. Pedigrees ascertained for disease can be powerful for genetic analysis due to the cases being enriched for genetic disease. Here we present a Monte Carlo based method to perform haplotype association analysis. Our method, hapMC, allows for the analysis of full-length and sub-haplotypes, including imputation of missing data, in resources of nuclear families, general pedigrees, case-control data or mixtures thereof. Both traditional association statistics and transmission/disequilibrium statistics can be performed. The method includes a phasing algorithm that can be used in large pedigrees and optional use of pseudocontrols. Results Our new phasing algorithm substantially outperformed the standard expectation-maximization algorithm that is ignorant of pedigree structure, and hence is preferable for resources that include pedigree structure. Through simulation we show that our Monte Carlo procedure maintains the correct type 1 error rates for all resource types. Power comparisons suggest that transmission-disequilibrium statistics are superior for performing association in resources of only nuclear families. For mixed structure resources, however, the newly implemented pseudocontrol approach appears to be the best choice. Results also indicated the value of large high-risk pedigrees for association analysis, which, in the simulations considered, were comparable in power to case-control resources of the same sample size. Conclusions We propose hapMC as a valuable new tool to perform haplotype association analyses, particularly for resources of mixed structure. 
The availability of meta-association and haplotype-mining modules in our suite of Monte Carlo haplotype procedures adds further value to the approach.
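The general idea of Monte Carlo significance testing can be shown with a small Python sketch. The abstract does not specify how hapMC simulates its null distribution, so this is not its procedure. The sketch assumes unrelated cases and controls and uses plain label permutation, which would not be valid for pedigree data. The statistic, function names, and data are hypothetical.

```python
import random

def monte_carlo_p_value(statistic, values, labels, n_sims=999, seed=0):
    """One-sided empirical p-value: share of simulated statistics >= observed.

    The null distribution is built by shuffling case/control labels, which
    assumes independent (unrelated) individuals.
    """
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_sims):
        rng.shuffle(shuffled)
        if statistic(values, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_sims + 1)

def carrier_difference(carriers, is_case):
    """Haplotype carrier frequency in cases minus that in controls."""
    case = [c for c, s in zip(carriers, is_case) if s]
    ctrl = [c for c, s in zip(carriers, is_case) if not s]
    return sum(case) / len(case) - sum(ctrl) / len(ctrl)

# Hypothetical data: 1 = carries the candidate haplotype, 0 = does not.
carriers = [1] * 9 + [0] * 1 + [1] * 2 + [0] * 8
is_case = [True] * 10 + [False] * 10
p_value = monte_carlo_p_value(carrier_difference, carriers, is_case)
```

Because the p-value is estimated from simulations, its resolution is limited to 1 / (n_sims + 1); more simulations are needed to resolve very small p-values.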
Affiliation(s)
- Ryan Abo: Department of Biomedical Informatics, University of Utah, Salt Lake City, USA
9
Curtin K, Wolff RK, Herrick JS, Abo R, Slattery ML. Exploring multilocus associations of inflammation genes and colorectal cancer risk using hapConstructor. BMC Med Genet 2010; 11:170. [PMID: 21129206 PMCID: PMC3006374 DOI: 10.1186/1471-2350-11-170] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 11/13/2009] [Accepted: 12/03/2010] [Indexed: 02/05/2023]
Abstract
BACKGROUND In candidate-gene association studies of single nucleotide polymorphisms (SNPs), multilocus analyses are frequently of high dimensionality when considering haplotypes or haplotype pairs (diplotypes) and differing modes of expression. Often, while candidate genes are selected based on their biological involvement in a given pathway, little is known about the functionality of SNPs to guide association studies. Investigators face the challenge of exploring multiple SNP models to elucidate which variants, independently or in combination, might be associated with a disease of interest. A data mining module, hapConstructor (freely-available in Genie software) performs systematic construction and association testing of multilocus genotype data in a Monte Carlo framework. Our objective was to assess its utility to guide statistical analyses of haplotypes within a candidate region (or combined genotypes across candidate genes) beyond that offered by a standard logistic regression approach. METHODS We applied the hapConstructor method to a multilocus investigation of candidate genes involved in pro-inflammatory cytokine IL6 production, IKBKB, IL6, and NFKB1 (16 SNPs total) hypothesized to operate together to alter colorectal cancer risk. Data come from two U.S. multicenter studies, one of colon cancer (1,556 cases and 1,956 matched controls) and one of rectal cancer (754 cases and 959 matched controls). RESULTS hapConstructor enabled us to identify important associations that were further analyzed in logistic regression models to simultaneously adjust for confounders. The most significant finding (nominal P = 0.0004; false discovery rate q = 0.037) was a combined genotype association across IKBKB SNP rs5029748 (1 or 2 variant alleles), IL6 rs1800797 (1 or 2 variant alleles), and NFKB1 rs4648110 (2 variant alleles) which conferred an ~80% decreased risk of colon cancer. 
CONCLUSIONS Strengths of hapConstructor were: systematic identification of multiple loci within and across genes important in CRC risk; false discovery rate assessment; and efficient guidance of subsequent logistic regression analyses.
Affiliation(s)
- Karen Curtin: Epidemiology, Department of Internal Medicine, University of Utah Health Sciences Center, Salt Lake City, Utah, USA
10
Abo R, Knight S, Thomas A, Camp NJ. Automated construction and testing of multi-locus gene-gene associations. Bioinformatics 2010; 27:134-6. [PMID: 21076150 PMCID: PMC3008644 DOI: 10.1093/bioinformatics/btq616] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/06/2023]
Abstract
Summary: It has been argued that the missing heritability in common diseases may be in part due to rare variants and gene–gene effects. Haplotype analyses provide more power for rare variants and joint analyses across genes can address multi-gene effects. Currently, methods are lacking to perform joint multi-locus association analyses across more than one gene/region. Here, we present a haplotype-mining gene–gene analysis method, which considers multi-locus data for two genes/regions simultaneously. This approach extends our single region haplotype-mining algorithm, hapConstructor, to two genes/regions. It allows construction of multi-locus SNP sets at both genes and tests joint gene–gene effects and interactions between single variants or haplotype combinations. A Monte Carlo framework is used to provide statistical significance assessment of the joint and interaction statistics, thus the method can also be used with related individuals. This tool provides a flexible data-mining approach to identifying gene–gene effects that otherwise is currently unavailable. Availability: http://bioinformatics.med.utah.edu/Genie/hapConstructor.html Contact: ryan.abo@hsc.utah.edu
Affiliation(s)
- Ryan Abo: Department of Biomedical Informatics, University of Utah School of Medicine, UT, USA
11
Hauser E, Cremer N, Hein R, Deshmukh H. Haplotype-based analysis: a summary of GAW16 Group 4 analysis. Genet Epidemiol 2010; 33 Suppl 1:S24-8. [PMID: 19924718 DOI: 10.1002/gepi.20468] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/31/2022]
Abstract
In this summary article, we describe the contributions included in the haplotype-based analysis group (Group 4) at the Genetic Analysis Workshop 16, which was held September 17-20, 2008. Our group applied a large number of haplotype-based methods in the context of genome-wide association studies. Two general approaches were applied: a two-stage approach that selected significant single-nucleotide polymorphisms (SNPs) in the first stage and then created haplotypes in the second stage, and genome-wide analysis of smaller sets of SNPs selected by sliding windows or by estimating haplotype blocks. Genome-wide haplotype analyses performed in these ways were feasible. The presence of the very strong chromosome 6 association in the North American Rheumatoid Arthritis Consortium data was detected by every method, and additional analyses attempted to control for this strong result to allow detection of additional haplotype associations.
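The sliding-window selection of SNP sets mentioned above can be sketched in a few lines of Python. This is an illustration only, not code from any GAW16 contribution: it assumes already-phased chromosomes stored as equal-length strings, and the function name, `window` and `step` parameters, and data are hypothetical.

```python
from collections import Counter

def sliding_window_haplotypes(phased, window, step=1):
    """Yield (start_index, Counter of haplotype strings) for each window.

    `phased` is a list of chromosomes, each a string with one allele
    character per SNP; all strings must have the same length.
    """
    n_snps = len(phased[0])
    for start in range(0, n_snps - window + 1, step):
        yield start, Counter(chrom[start:start + window] for chrom in phased)

# Hypothetical phased chromosomes covering four consecutive SNPs.
chromosomes = ["ACGT", "ACGA", "TCGT", "ACGT"]
windows = dict(sliding_window_haplotypes(chromosomes, window=3))
```

Each window's haplotype counts would then be passed to whatever association test is being used; a larger `step` reduces the overlap between neighbouring windows and the number of tests.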
Affiliation(s)
- Elizabeth Hauser: Center for Human Genetics, Duke University, Durham, North Carolina 27710, USA
12
Piccolo SR, Abo RP, Allen-Brady K, Camp NJ, Knight S, Anderson JL, Horne BD. Evaluation of genetic risk scores for lipid levels using genome-wide markers in the Framingham Heart Study. BMC Proc 2009; 3 Suppl 7:S46. [PMID: 20018038 PMCID: PMC2795945 DOI: 10.1186/1753-6561-3-s7-s46] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
BACKGROUND Multiple single-nucleotide polymorphisms have been associated with low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglyceride (TG) levels. In this paper, we evaluate a weighted and an unweighted approach for estimating the combined effect of multiple markers (using genotypes and haplotypes) on lipid levels for a given individual. METHODS Using data from the Framingham Heart Study SHARe genome-wide association study, we tested genome-wide genotypes and haplotypes for association with lipid levels and constructed genetic risk scores (GRS) based on multiple markers that were weighted according to their estimated effects on LDL-C, HDL-C, and TG. These scores (GRS-LDL, GRS-HDL, and GRS-TG) were then evaluated for associations with LDL-C, HDL-C, and TG, and compared with results of an unweighted method based on risk-allele counts. For comparability of metrics, GRS variables were divided into quartiles. RESULTS GRS-LDL quartiles were associated with LDL-C levels (p = 2.1 × 10⁻²⁴), GRS-HDL quartiles with HDL-C (p = 5.9 × 10⁻²²), and GRS-TG quartiles with TG (p = 5.4 × 10⁻²⁵). In comparison, these p-values were considerably lower than those for the associations of the unweighted GRS quartiles for LDL-C (p = 3.6 × 10⁻⁷), HDL-C (p = 6.4 × 10⁻¹⁶), and TG (p = 4.1 × 10⁻¹⁰). CONCLUSION GRS variables were highly predictive of LDL-C, HDL-C, and TG measurements, especially when weighted based on each marker's individual association with those intermediate risk phenotypes. The allele-count GRS approach that does not weight the GRS by individual marker associations was considerably less predictive of lipid and lipoprotein measures when the same genetic markers were utilized, suggesting that substantially more risk-associated genetic marker information is encapsulated by the weighted GRS variables.
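The difference between the weighted and the unweighted (allele-count) score can be made concrete with a short Python sketch. It is a generic illustration under simple assumptions, not the study's pipeline: genotypes are coded as risk-allele counts of 0, 1, or 2, the weights stand for per-marker effect estimates, and all names and numbers are hypothetical.

```python
from statistics import quantiles

def genetic_risk_score(allele_counts, weights=None):
    """Sum risk-allele counts, optionally weighted by per-marker effect size.

    With weights=None every marker counts equally (unweighted score).
    """
    if weights is None:
        weights = [1.0] * len(allele_counts)
    return sum(count * w for count, w in zip(allele_counts, weights))

def quartile_of(value, cut_points):
    """Return the quartile (1 to 4) of `value` given three ascending cut points."""
    return 1 + sum(value > cut for cut in cut_points)

# Hypothetical data: one row per individual, one column per marker.
genotypes = [
    [0, 1, 2],
    [2, 0, 0],
    [1, 1, 1],
    [2, 2, 0],
    [0, 0, 1],
]
effects = [0.10, 0.40, 0.05]  # hypothetical per-allele effect estimates

unweighted = [genetic_risk_score(g) for g in genotypes]
weighted = [genetic_risk_score(g, effects) for g in genotypes]

# Divide the weighted scores into quartiles, as the abstract describes.
cuts = quantiles(weighted, n=4)
weighted_quartiles = [quartile_of(score, cuts) for score in weighted]
```

The first and third individuals have the same allele count but different weighted scores, because the weighted score distinguishes which markers carry the risk alleles.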
Affiliation(s)
- Stephen R Piccolo: Department of Biomedical Informatics, School of Medicine, University of Utah, 26 South 2000 East, Salt Lake City, Utah 84112, USA
13
Curtin K, Lin WY, George R, Katory M, Shorto J, Cannon-Albright LA, Smith G, Bishop DT, Cox A, Camp NJ. Genetic variants in XRCC2: new insights into colorectal cancer tumorigenesis. Cancer Epidemiol Biomarkers Prev 2009; 18:2476-84. [PMID: 19690184 PMCID: PMC2742634 DOI: 10.1158/1055-9965.epi-09-0187] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 12/16/2022]
Abstract
Polymorphisms in DNA double-strand break repair gene XRCC2 may play an important role in colorectal cancer etiology, specifically in disease subtypes. Associations of XRCC2 variants and colorectal cancer were investigated by tumor site and tumor instability status in a four-center collaboration including three U.K. case-control studies (Sheffield, Leeds, and Dundee) and a U.S. case-control study of cases from high-risk Utah pedigrees (total: 1,252 cases and 1,422 controls). The 14 variants studied were tagging single nucleotide polymorphisms (SNP) selected from National Institute of Environmental Health Sciences/HapMap data supplemented with SNPs identified from sequencing of 125 cases chosen to represent multiple colorectal cancer groups (familial, metastatic disease, and tumor subsite). Monte Carlo significance testing using Genie software provided valid meta-analyses of the total resource that includes family-based data. Similar to reports of colorectal cancer and other cancer sites, the rs3218536 R188H allele was not associated with increased risk. However, we observed a novel, highly significant association of a common SNP, rs3218499G>C, with increased risk of rectal tumors (odds ratio, 2.1; 95% confidence interval, 1.3-3.3; P(χ²) = 0.0006) versus controls, with the largest risk found for female rectal cases (odds ratio, 3.1; 95% confidence interval, 1.6-6.1; P(χ²) = 0.0006). This risk differed significantly from that for proximal and distal colon cancers (P(χ²) = 0.02). Our investigation supports a role for XRCC2 in colorectal cancer tumorigenesis, conferring susceptibility to rectal tumors.
Affiliation(s)
- Karen Curtin: Genetic Epidemiology, University of Utah School of Medicine, 391 Chipeta Way Suite D2, Salt Lake City, UT 84108, USA
14
Shephard ND, Abo R, Rigas SH, Frank B, Lin WY, Brock IW, Shippen A, Balasubramanian SP, Reed MWR, Bartram CR, Meindl A, Schmutzler RK, Engel C, Burwinkel B, Cannon-Albright LA, Allen-Brady K, Camp NJ, Cox A. A breast cancer risk haplotype in the caspase-8 gene. Cancer Res 2009; 69:2724-8. [PMID: 19318553 PMCID: PMC2730164 DOI: 10.1158/0008-5472.can-08-4266] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/16/2022]
Abstract
Recent large-scale studies have been successful in identifying common, low-penetrance variants associated with common cancers. One such variant in the caspase-8 (CASP8) gene, D302H (rs1045485), has been confirmed to be associated with breast cancer risk, although the functional effect of this polymorphism (if any) is not yet clear. In order to further map the CASP8 gene with respect to breast cancer susceptibility, we performed extensive haplotype analyses using single nucleotide polymorphisms (SNP) chosen to tag all common variations in the gene (tSNP). We used a staged study design based on 3,200 breast cancer and 3,324 control subjects from the United Kingdom, Utah, and Germany. Using a haplotype-mining algorithm in the UK cohort, we identified a four-SNP haplotype that was significantly associated with breast cancer and that was superior to any other single or multi-locus combination (P = 8.0 × 10⁻⁵), with a per allele odds ratio and 95% confidence interval of 1.30 (1.12-1.49). The result remained significant after adjustment for the multiple testing inherent in mining techniques (false discovery rate, q = 0.044). As expected, this haplotype includes the D302H locus. Multicenter analyses on a subset of the tSNPs yielded consistent results. This risk haplotype is likely to carry one or more underlying breast cancer susceptibility alleles, making it an excellent candidate for resequencing in homozygous individuals. An understanding of the mode of action of these alleles will aid risk assessment and may lead to the identification of novel treatment targets in breast cancer.
Affiliation(s)
- Neil Duncan Shephard
- Institute for Cancer Studies, School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield S10 2RX, UK
- Ryan Abo
- Genetic Epidemiology, Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah 84108-1266, USA
- Sushila Harkisandas Rigas
- Institute for Cancer Studies, School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield S10 2RX, UK
- Bernd Frank
- Division of Molecular Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Heidelberg, Germany
- Wei-Yu Lin
- Institute for Cancer Studies, School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield S10 2RX, UK
- Ian Wallace Brock
- Institute for Cancer Studies, School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield S10 2RX, UK
- Adam Shippen
- Institute for Cancer Studies, School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield S10 2RX, UK
- Malcolm Walter Ronald Reed
- Academic Unit of Surgical Oncology, School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield S10 2RX, UK
- Alfons Meindl
- Department of Gynaecology and Obstetrics, Klinikum rechts der Isar, Technical University, Munich, Germany
- Rita Katharina Schmutzler
- Division of Molecular Gynaeco-Oncology, Department of Gynaecology and Obstetrics and Center of Molecular Medicine Cologne, University Hospital of Cologne, Germany
- Christoph Engel
- Institute for Medical Informatics, Statistics and Epidemiology (IMISE), Leipzig, Germany
- Barbara Burwinkel
- Division of Molecular Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Division Molecular Biology of Breast Cancer, Department of Gynaecology and Obstetrics, Heidelberg, Germany
- Lisa Anne Cannon-Albright
- Genetic Epidemiology, Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah 84108-1266, USA
- Kristina Allen-Brady
- Genetic Epidemiology, Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah 84108-1266, USA
- Nicola Jane Camp
- Genetic Epidemiology, Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah 84108-1266, USA
- Angela Cox
- Institute for Cancer Studies, School of Medicine and Biomedical Sciences, University of Sheffield, Sheffield S10 2RX, UK
15
Curtin K, Lin WY, George R, Katory M, Shorto J, Cannon-Albright LA, Bishop DT, Cox A, Camp NJ. Meta association of colorectal cancer confirms risk alleles at 8q24 and 18q21. Cancer Epidemiol Biomarkers Prev 2009; 18:616-21. [PMID: 19155440 PMCID: PMC2729170 DOI: 10.1158/1055-9965.epi-08-0690] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Indexed: 01/12/2023]
Abstract
BACKGROUND: Genome-wide association studies of colorectal cancer (CRC) have identified genetic variants that reproducibly associate with CRC. Associations of 12 single nucleotide polymorphisms at 8q24, 9p24, and 18q21 (SMAD7) and CRC were investigated in a three-center collaborative study including two U.K. case-control cohorts (Sheffield and Leeds) and a U.S. case-control study of CRC cases from high-risk Utah pedigrees. METHODS: Our combined resource included 1,092 CRC case subjects and 1,060 age- and sex-matched controls. Meta statistics and Monte Carlo significance testing using Genie software provided a valid combined analysis of our mixed independent and related case-control resource. We also evaluated whether these associations differed by sex, age at diagnosis, family history, or tumor site. RESULTS: At 8q24, we observed two independent significant associations at single nucleotide polymorphisms located in two different risk regions of 8q24: rs6983267 in region 3 [P(trend) = 0.01; per allele odds ratio (OR), 1.17; 95% confidence interval (95% CI), 1.03-1.32] and rs10090154 in region 5 (P(trend) = 0.05; per allele OR, 1.24; 95% CI, 1.01-1.51). At 18q21, associations were observed in distal colon tumors but not in proximal or rectal cancers: rs4939827 (P(trend) = 0.007; per allele OR, 0.77; 95% CI, 0.64-0.93; case-case P(diff) = 0.03) and rs12953717 (P(trend) = 0.01; per allele OR, 1.27; 95% CI, 1.06-1.52). We were unable to detect any associations at 9p24 with CRC. CONCLUSIONS: Our investigation confirms that variants across multiple risk regions of 8q24 are associated with CRC, and that associations at 18q21 differ by tumor site.
Affiliation(s)
- Karen Curtin
- Genetic Epidemiology, University of Utah School of Medicine, Salt Lake City, UT 84109, USA.