201
|
Mirabello L, Yu K, Kraft P, De Vivo I, Hunter DJ, Prescott J, Wong JYY, Chatterjee N, Hayes RB, Savage SA. The association of telomere length and genetic variation in telomere biology genes. Hum Mutat 2010; 31:1050-8. [PMID: 20597107 DOI: 10.1002/humu.21314] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.5] [Indexed: 01/24/2023]
Abstract
Telomeres cap chromosome ends and are critical for genomic stability. Many telomere-associated proteins are important for telomere length maintenance. Recent genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) in genes encoding telomere-associated proteins (RTEL1 and TERT-CLPTM1L) as markers of cancer risk. We conducted an association study of telomere length and 743 SNPs in 43 telomere biology genes. Telomere length in peripheral blood DNA was determined by Q-PCR in 3,646 participants from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial and Nurses' Health Study. We investigated associations by SNP, gene, and pathway (functional group). We found no associations between telomere length and SNPs in TERT-CLPTM1L or RTEL1. Telomere length was not significantly associated with specific functional groups. Thirteen SNPs from four genes (MEN1, MRE11A, RECQL5, and TNKS) were significantly associated with telomere length. The strongest findings were in MEN1 (gene-based P=0.006); its product, menin, associates with the telomerase promoter and may negatively regulate telomerase. This large association study did not find strong associations with telomere length. The combination of limited diversity and evolutionary conservation suggests that these genes may be under selective pressure. More work is needed to explore the role of genetic variants in telomere length regulation.
Affiliation(s)
- Lisa Mirabello
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland 20892, USA.
|
202
|
Wang K, Li M, Hakonarson H. Analysing biological pathways in genome-wide association studies. Nat Rev Genet 2010; 11:843-54. [PMID: 21085203 DOI: 10.1038/nrg2884] [Citation(s) in RCA: 581] [Impact Index Per Article: 38.7] [Indexed: 12/14/2022]
Abstract
Genome-wide association (GWA) studies have typically focused on the analysis of single markers, which often lacks the power to uncover the relatively small effect sizes conferred by most genetic variants. Recently, pathway-based approaches have been developed, which use prior biological knowledge on gene function to facilitate more powerful analysis of GWA study data sets. These approaches typically examine whether a group of related genes in the same functional pathway are jointly associated with a trait of interest. Here we review the development of pathway-based approaches for GWA studies, discuss their practical use and caveats, and suggest that pathway-based approaches may also be useful for future GWA studies with sequencing data.
Affiliation(s)
- Kai Wang
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Pennsylvania 19104, USA
|
203
|
Freudenberg J, Lee AT, Siminovitch KA, Amos CI, Ballard D, Li W, Gregersen PK. Locus category based analysis of a large genome-wide association study of rheumatoid arthritis. Hum Mol Genet 2010; 19:3863-72. [PMID: 20639398 PMCID: PMC2935861 DOI: 10.1093/hmg/ddq304] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 03/01/2010] [Accepted: 07/13/2010] [Indexed: 11/14/2022] Open
Abstract
To pinpoint true positive single-nucleotide polymorphism (SNP) associations in a genome-wide association study (GWAS) of rheumatoid arthritis (RA), we categorize genetic loci by external knowledge. We test both the 'enrichment of associated loci' in a locus category and the 'combined association' of a locus category. The former is quantified by the odds ratio for the presence of SNP associations at the loci of a category, whereas the latter is quantified by the number of loci in a category that have SNP associations. These measures are compared with their expected values as obtained from the permutation of the affection status. To account for linkage disequilibrium (LD) among SNPs, we view each LD block as a genetic locus. Positional candidates were defined as loci implicated by earlier GWAS results, whereas functional candidates were defined by annotations regarding the molecular roles of genes, such as gene ontology categories. As expected, immune-related categories show the largest enrichment signal, although it is not very strong. The intersection of positional and functional candidate information predicts novel RA loci near the genes TEC/TXK, MBL2 and PIK3R1/CD180. Notably, a combined association signal is not only produced by immune-related categories, but also by most other categories and even randomly defined categories. The unspecific quality of these signals limits the possible conclusions from combined association tests. It also reduces the magnitude of enrichment test results. These unspecific signals might result from common variants of small effect that are hardly concentrated in candidate categories, or from an inflated size of associated regions due to weak LD with infrequent mutations.
Affiliation(s)
- Jan Freudenberg
- Robert S. Boas Center for Human Genetics and Genomics, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA.
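The two category-level measures named in the abstract above (the enrichment odds ratio and the combined association count) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the boolean per-locus inputs, and the optional pseudocount for empty cells are all assumptions made for illustration, and the permutation of affection status that the abstract uses to obtain expected values is not shown because it requires genotype data.

```python
def enrichment_odds_ratio(in_category, associated, pseudocount=0.5):
    """Odds ratio for a locus carrying a SNP association inside vs. outside a category.

    in_category and associated are equal-length sequences of booleans, one per locus.
    The pseudocount (an assumption, not from the source) avoids division by zero.
    """
    a = b = c = d = 0
    for cat, assoc in zip(in_category, associated):
        if cat and assoc:
            a += 1          # in category, associated
        elif cat:
            b += 1          # in category, not associated
        elif assoc:
            c += 1          # outside category, associated
        else:
            d += 1          # outside category, not associated
    a, b, c, d = (x + pseudocount for x in (a, b, c, d))
    return (a * d) / (b * c)


def combined_association(in_category, associated):
    """Number of loci in the category that carry a SNP association."""
    return sum(1 for cat, assoc in zip(in_category, associated) if cat and assoc)


# Hypothetical toy data: 8 loci, 3 of them in the category of interest.
in_cat = [True, True, True, False, False, False, False, False]
assoc = [True, True, False, True, False, False, False, False]
odds_ratio = enrichment_odds_ratio(in_cat, assoc)
n_associated = combined_association(in_cat, assoc)
```

In practice both numbers would be compared against their distribution under permuted case/control labels rather than interpreted on their own.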
|
204
|
Menashe I, Maeder D, Garcia-Closas M, Figueroa JD, Bhattacharjee S, Rotunno M, Kraft P, Hunter DJ, Chanock SJ, Rosenberg PS, Chatterjee N. Pathway analysis of breast cancer genome-wide association study highlights three pathways and one canonical signaling cascade. Cancer Res 2010; 70:4453-9. [PMID: 20460509 DOI: 10.1158/0008-5472.can-09-4502] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.0] [Indexed: 11/16/2022]
Abstract
Genome-wide association studies (GWAS) focus on relatively few highly significant loci, whereas less attention is given to other genotyped markers. Using pathway analysis to study existing GWAS data may shed light on relevant biological processes and illuminate new candidate genes. We applied a pathway-based approach to the breast cancer GWAS data of the National Cancer Institute (NCI) Cancer Genetic Markers of Susceptibility project that includes 1,145 cases and 1,142 controls. Pathways were retrieved from three databases: KEGG, BioCarta, and NCI Protein Interaction Database. Genes were represented by their most strongly associated SNP, and an enrichment score reflecting the overrepresentation of gene-based association signals in each pathway was calculated by using a weighted Kolmogorov-Smirnov procedure. Finally, hierarchical clustering was used to identify pathways with overlapping genes, and clusters with an excess of association signals were determined by the adaptive rank-truncated product (ARTP) method. A total of 421 pathways containing 3,962 genes was included in our study. Of these, three pathways (syndecan-1-mediated signaling, signaling of hepatocyte growth factor receptor, and growth hormone signaling) were highly enriched with association signals [P(ES) < 0.001, false discovery rate (FDR) = 0.118]. Our clustering analysis revealed that pathways containing key components of the RAS/RAF/mitogen-activated protein kinase canonical signaling cascade were significantly more likely to have an excess of association signals than expected by chance (P(ARTP) = 0.0051, FDR = 0.07). These results suggest that genetic alterations associated with these three pathways and one canonical signaling cascade may contribute to breast cancer susceptibility.
Affiliation(s)
- Idan Menashe
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Department of Health and Human Services, Bethesda, MD 20852-7244, USA.
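The weighted Kolmogorov-Smirnov enrichment score mentioned in the abstract above can be sketched in the generic GSEA style. The code below is a hedged illustration, not the study's pipeline: the function name, the dictionary input, the use of a per-gene score such as -log10 of the best SNP P-value, and the default weight of 1 are all assumptions.

```python
def enrichment_score(gene_stats, pathway, weight=1.0):
    """GSEA-style weighted Kolmogorov-Smirnov running-sum statistic.

    gene_stats maps gene -> association score (assumed here to be something like
    -log10 of the gene's most significant SNP P-value); pathway is a set of genes.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    ranked = sorted(gene_stats.items(), key=lambda kv: kv[1], reverse=True)
    hits = [gene in pathway for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    hit_total = sum(abs(s) ** weight for (_, s), h in zip(ranked, hits) if h)
    if hit_total == 0 or n_miss == 0:
        return 0.0  # undefined case: no pathway signal or no background genes

    running = 0.0
    best = 0.0
    for (_, score), is_hit in zip(ranked, hits):
        if is_hit:
            running += abs(score) ** weight / hit_total   # step up, weighted by score
        else:
            running -= 1.0 / n_miss                       # step down uniformly
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical toy data: four genes, two of them in the pathway.
toy_stats = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}
es_top = enrichment_score(toy_stats, {"A", "B"})
es_bottom = enrichment_score(toy_stats, {"C", "D"})
```

A pathway whose genes sit at the top of the ranking gets a score near 1, and one whose genes sit at the bottom gets a score near -1; significance would still have to come from a permutation scheme.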
|
205
|
Zhang K, Cui S, Chang S, Zhang L, Wang J. i-GSEA4GWAS: a web server for identification of pathways/gene sets associated with traits by applying an improved gene set enrichment analysis to genome-wide association study. Nucleic Acids Res 2010; 38:W90-5. [PMID: 20435672 PMCID: PMC2896119 DOI: 10.1093/nar/gkq324] [Citation(s) in RCA: 154] [Impact Index Per Article: 10.3] [Indexed: 02/07/2023] Open
Abstract
Genome-wide association studies (GWAS) are now widely used to identify genes involved in complex human disease. The standard GWAS analysis examines SNPs/genes independently and identifies only a number of the most significant SNPs. It ignores the combined effect of weaker SNPs/genes, which makes it difficult to explore biological function and mechanism from a systems point of view. Although gene set enrichment analysis (GSEA) has been introduced to GWAS to overcome these limitations by identifying the correlation between pathways/gene sets and traits, its heavy dependence on genotype data, which are not easily available for most published GWAS investigations, has limited its application. In order to perform GSEA on a simple list of GWAS SNP P-values, we implemented GSEA by using SNP label permutation. We further improved GSEA (i-GSEA) by focusing on pathways/gene sets with a high proportion of significant genes. To provide researchers an open platform to analyze GWAS data, we developed the i-GSEA4GWAS (improved GSEA for GWAS) web server. i-GSEA4GWAS implements the i-GSEA approach and aims to provide new insights in complex disease studies. i-GSEA4GWAS is freely available at http://gsea4gwas.psych.ac.cn/.
Affiliation(s)
- Kunlin Zhang
- Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 100101, Beijing, China
|
206
|
Zhong H, Yang X, Kaplan LM, Molony C, Schadt EE. Integrating pathway analysis and genetics of gene expression for genome-wide association studies. Am J Hum Genet 2010; 86:581-91. [PMID: 20346437 DOI: 10.1016/j.ajhg.2010.02.020] [Citation(s) in RCA: 178] [Impact Index Per Article: 11.9] [Received: 10/15/2009] [Revised: 01/15/2010] [Accepted: 02/10/2010] [Indexed: 12/24/2022] Open
Abstract
Genome-wide association studies (GWAS) have achieved great success identifying common genetic variants associated with common human diseases. However, to date, the massive amounts of data generated from GWAS have not been maximally leveraged and integrated with other types of data to identify associations beyond those associations that meet the stringent genome-wide significance threshold. Here, we present a novel approach that leverages information from genetics of gene expression studies to identify biological pathways enriched for expression-associated genetic loci associated with disease in publicly available GWAS results. Specifically, we first identify SNPs in population-based human cohorts that associate with the expression of genes (eSNPs) in the metabolically active tissues liver, subcutaneous adipose, and omental adipose. We then use this functionally annotated set of SNPs to investigate pathways enriched for eSNPs associated with disease in publicly available GWAS data. As an example, we tested 110 pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and identified 16 pathways enriched for genes corresponding to eSNPs that show evidence of association with type 2 diabetes (T2D) in the Wellcome Trust Case Control Consortium (WTCCC) T2D GWAS. We then replicated these findings in the Diabetes Genetics Replication and Meta-analysis (DIAGRAM) study. Many of the pathways identified have been proposed as important candidate pathways for T2D, including the calcium signaling pathway, the PPAR signaling pathway, and TGF-beta signaling. Importantly, we identified other pathways not previously associated with T2D, including the tight junction, complement and coagulation pathway, and antigen processing and presentation pathway.
The integration of pathways and eSNPs provides putative functional bridges between GWAS and candidate genes or pathways, thus serving as a potential powerful approach to identifying biological mechanisms underlying GWAS findings.
|
207
|
Zhang H, Olschwang S, Yu K. Statistical inference on the penetrances of rare genetic mutations based on a case-family design. Biostatistics 2010; 11:519-32. [PMID: 20179148 DOI: 10.1093/biostatistics/kxq009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Abstract
We propose a formal statistical inference framework for the evaluation of the penetrance of a rare genetic mutation using family data generated under a kin-cohort type of design, where phenotype and genotype information from first-degree relatives (sibs and/or offspring) of case probands carrying the targeted mutation are collected. Our approach is built upon a likelihood model with some minor assumptions, and it can be used for age-dependent penetrance estimation that permits adjustment for covariates. Furthermore, the derived likelihood allows unobserved risk factors that are correlated within family members. The validity of the approach is confirmed by simulation studies. We apply the proposed approach to estimating the age-dependent cancer risk among carriers of the MSH2 or MLH1 mutation.
Affiliation(s)
- Hong Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
|
208
|
Pattin KA, Moore JH. Role for protein-protein interaction databases in human genetics. Expert Rev Proteomics 2010; 6:647-59. [PMID: 19929610 DOI: 10.1586/epr.09.86] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 12/30/2022]
Abstract
Proteomics and the study of protein-protein interactions are becoming increasingly important in our effort to understand human diseases on a system-wide level. Thanks to the development and curation of protein-interaction databases, up-to-date information on these interaction networks is accessible and publicly available to the scientific community. As our knowledge of protein-protein interactions increases, it is important to give thought to the different ways that these resources can impact biomedical research. In this article, we highlight the importance of protein-protein interactions in human genetics and genetic epidemiology. Since protein-protein interactions demonstrate one of the strongest functional relationships between genes, combining genomic data with available proteomic data may provide us with a more in-depth understanding of common human diseases. In this review, we will discuss some of the fundamentals of protein interactions, the databases that are publicly available and how information from these databases can be used to facilitate genome-wide genetic studies.
Affiliation(s)
- Kristine A Pattin
- Computational Genetics Laboratory and Department of Genetics, Dartmouth Medical School, Lebanon, NH, USA.
|
209
|
Wang SS, Gonzalez P, Yu K, Porras C, Li Q, Safaeian M, Rodriguez AC, Sherman ME, Bratti C, Schiffman M, Wacholder S, Burk RD, Herrero R, Chanock SJ, Hildesheim A. Common genetic variants and risk for HPV persistence and progression to cervical cancer. PLoS One 2010; 5:e8667. [PMID: 20084279 PMCID: PMC2801608 DOI: 10.1371/journal.pone.0008667] [Citation(s) in RCA: 92] [Impact Index Per Article: 6.1] [Received: 12/01/2009] [Accepted: 12/18/2009] [Indexed: 11/19/2022] Open
Abstract
HPV infrequently persists and progresses to cervical cancer. We examined host genetic factors hypothesized to play a role in determining which subset of individuals infected with oncogenic human papillomavirus (HPV) have persistent infection and further develop cervical pre-cancer/cancer compared to the majority of infected individuals who will clear infection. We evaluated 7140 tag single nucleotide polymorphisms (SNPs) from 305 candidate genes hypothesized to be involved in DNA repair, viral infection and cell entry in 416 cervical intraepithelial neoplasia 3 (CIN3)/cancer cases, 356 HPV persistent women (median: 25 months), and 425 random controls (RC) from the 10,049 women Guanacaste Costa Rica Natural History study. We used logistic regression to compute odds ratios and p-trend for CIN3/cancer and HPV persistence in relation to SNP genotypes and haplotypes (adjusted for age). We obtained pathway and gene-level summary of associations by computing the adaptive combination of p-values. Genes/regions statistically significantly associated with CIN3/cancer included the viral infection and cell entry genes 2′,5′ oligoadenylate synthetase gene 3 (OAS3), sulfatase 1 (SULF1), and interferon gamma (IFNG); the DNA repair genes deoxyuridine triphosphate (DUT), dosage suppressor of mck 1 homolog (DMC1), and general transcription factor IIH, polypeptide 3 (GTF2H4); and the EVER1 and EVER2 genes (p<0.01). From each region, the single most significant SNPs associated with CIN3/cancer were OAS3 rs12302655, SULF1 rs4737999, IFNG rs11177074, DUT rs3784621, DMC1 rs5757133, GTF2H4 rs2894054, EVER1/EVER2 rs9893818 (p-trends≤0.001). SNPs for OAS3, SULF1, DUT, and GTF2H4 were associated with HPV persistence whereas IFNG and EVER1/EVER2 SNPs were associated with progression to CIN3/cancer. We note that the associations observed were less than two-fold. We identified variations in DNA repair and viral binding and cell entry genes associated with CIN3/cancer.
Our results require replication but suggest that different genes may be responsible for modulating risk in the two critical transition steps important for cervical carcinogenesis: HPV persistence and disease progression.
Affiliation(s)
- Sophia S Wang
- Division of Etiology, City of Hope National Medical Center, Duarte, California, USA.
|
210
|
Abstract
Motivation: The sequencing of the human genome has made it possible to identify an informative set of >1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time, thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype–phenotype relationship that is characterized by significant heterogeneity and gene–gene and gene–environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. The goal of this review is to identify and discuss those GWAS challenges that will require computational methods. Contact: jason.h.moore@dartmouth.edu
Affiliation(s)
- Jason H Moore
- Department of Genetics, Department of Community and Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA.
|
211
|
Yang HP, Gonzalez Bosquet J, Li Q, Platz EA, Brinton LA, Sherman ME, Lacey JV, Gaudet MM, Burdette LA, Figueroa JD, Ciampa JG, Lissowska J, Peplonska B, Chanock SJ, Garcia-Closas M. Common genetic variation in the sex hormone metabolic pathway and endometrial cancer risk: pathway-based evaluation of candidate genes. Carcinogenesis 2010; 31:827-33. [PMID: 20053928 DOI: 10.1093/carcin/bgp328] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Estrogen plays a major role in endometrial carcinogenesis, suggesting that common variants of genes in the sex hormone metabolic pathway may be related to endometrial cancer risk. In support of this view, variants in CYP19A1 [cytochrome P450 (CYP), family 19, subfamily A, polypeptide 1] have been associated with both circulating estrogen levels and endometrial cancer risk. Associations with variants in other genes have been suggested, but findings have been inconsistent. METHODS We examined 36 sex hormone-related genes using a tagging approach in a population-based case-control study of 417 endometrial cancer cases and 407 controls conducted in Poland. We evaluated common variation in these genes in relation to endometrial cancer risk using sequential haplotype scan, variable-sized sliding window and adaptive rank-truncated product (ARTP) methods. RESULTS In our case-control study, the strongest association with endometrial cancer risk was for AR (androgen receptor; ARTP P = 0.006). Multilocus analyses also identified boundaries for a region of interest in AR and in CYP19A1 around a previously identified susceptibility locus. We did not find evidence for consistent associations between previously reported candidate single-nucleotide polymorphisms in this pathway and endometrial cancer risk. DISCUSSION In summary, we identified regions in AR and CYP19A1 that are of interest for further evaluation in relation to endometrial cancer risk in future haplotype and subsequent fine mapping studies in larger study populations.
Affiliation(s)
- Hannah P Yang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, MD 20852, USA.
|
212
|
Holmans P. Statistical methods for pathway analysis of genome-wide data for association with complex genetic traits. Adv Genet 2010; 72:141-79. [PMID: 21029852 DOI: 10.1016/b978-0-12-380862-2.00007-2] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Indexed: 12/12/2022]
Abstract
A number of statistical methods have been developed to test for associations between pathways (collections of genes related biologically) and complex genetic traits. Pathway analysis methods were originally developed for analyzing gene expression data, but recently methods have been developed to perform pathway analysis on genome-wide association study (GWAS) data. The purpose of this review is to give an overview of these methods, enabling the reader to gain an understanding of what pathway analysis involves, and to select the method most suited to their purposes. This review describes the various types of statistical methods for pathway analysis, detailing the strengths and weaknesses of each. Factors influencing the power of pathway analyses, such as gene coverage and choice of pathways to analyze, are discussed, as well as various unresolved statistical issues. Finally, a list of computer programs for performing pathway analysis on genome-wide association data is provided.
Affiliation(s)
- Peter Holmans
- Biostatistics and Bioinformatics Unit, MRC Centre for Neuropsychiatric Genetics and Genomics, Department of Psychological Medicine and Neurology, Cardiff University School of Medicine, Heath Park, Cardiff, United Kingdom
|
213
|
Cantor RM, Lange K, Sinsheimer JS. Prioritizing GWAS results: A review of statistical methods and recommendations for their application. Am J Hum Genet 2010; 86:6-22. [PMID: 20074509 DOI: 10.1016/j.ajhg.2009.11.017] [Citation(s) in RCA: 422] [Impact Index Per Article: 28.1] [Received: 08/18/2009] [Revised: 11/10/2009] [Accepted: 11/20/2009] [Indexed: 12/27/2022] Open
Abstract
Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery. A substantial number of recent GWAS indicate that for most disorders, only a few common variants are implicated and the associated SNPs explain only a small fraction of the genetic risk. This review is written from the viewpoint that findings from the GWAS provide preliminary genetic information that is available for additional analysis by statistical procedures that accumulate evidence, and that these secondary analyses are very likely to provide valuable information that will help prioritize the strongest constellations of results. We review and discuss three analytic methods to combine preliminary GWAS statistics to identify genes, alleles, and pathways for deeper investigations. Meta-analysis seeks to pool information from multiple GWAS to increase the chances of finding true positives among the false positives and provides a way to combine associations across GWAS, even when the original data are unavailable. Testing for epistasis within a single GWAS study can identify the stronger results that are revealed when genes interact. Pathway analysis of GWAS results is used to prioritize genes and pathways within a biological context. Following a GWAS, association results can be assigned to pathways and tested in aggregate with computational tools and pathway databases. Reviews of published methods with recommendations for their application are provided within the framework for each approach.
|
214
|
Guo YF, Li J, Chen Y, Zhang LS, Deng HW. A new permutation strategy of pathway-based approach for genome-wide association study. BMC Bioinformatics 2009; 10:429. [PMID: 20021635 PMCID: PMC2809078 DOI: 10.1186/1471-2105-10-429] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 06/10/2009] [Accepted: 12/18/2009] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND The recently introduced pathway-based approach is a promising way to improve the efficiency of analyzing genome-wide association scan (GWAS) data to identify disease variants, because it jointly considers variants of the genes that belong to the same biological pathway. However, the currently available pathway-based approaches for analyzing GWAS have limited power and efficiency. RESULTS We proposed a new and efficient permutation strategy based on SNP randomization for determining significance in pathway analysis of GWAS. The developed permutation strategy was evaluated and compared to two previously available methods, i.e. sample permutation and gene permutation, through simulation studies and a study on a real dataset. Results showed that the proposed permutation strategy is more powerful and efficient while greatly reducing the computational complexity. CONCLUSION Our findings indicate the improved performance of SNP permutation and thus render pathway-based analysis of GWAS more applicable and attractive.
Affiliation(s)
- Yan-Fang Guo
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, PR China.
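The general idea of SNP-based permutation described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the gene score (best -log10 P among a gene's SNPs), the pathway statistic (mean gene score), the function names, and the rsID-style labels are all hypothetical choices made for the example.

```python
import math
import random


def pathway_stat(snp_p, snp_to_gene, pathway_genes):
    """Mean over pathway genes of the best -log10(P) among each gene's SNPs.

    Assumes every P-value lies in (0, 1] and pathway_genes is non-empty.
    Genes with no mapped SNP contribute a score of zero.
    """
    best = {}
    for snp, p in snp_p.items():
        gene = snp_to_gene[snp]
        score = -math.log10(p)
        if gene in pathway_genes and score > best.get(gene, 0.0):
            best[gene] = score
    return sum(best.values()) / len(pathway_genes)


def snp_permutation_pvalue(snp_p, snp_to_gene, pathway_genes, n_perm=1000, seed=0):
    """Empirical pathway P-value obtained by shuffling P-values across SNP labels."""
    rng = random.Random(seed)
    observed = pathway_stat(snp_p, snp_to_gene, pathway_genes)
    snps = list(snp_p)
    pvals = [snp_p[s] for s in snps]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pvals)                      # break the SNP-to-P-value link
        permuted = dict(zip(snps, pvals))
        if pathway_stat(permuted, snp_to_gene, pathway_genes) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)          # add-one correction avoids P = 0


# Hypothetical toy data: four SNPs mapped to three genes, one gene in the pathway.
toy_p = {"rs1": 0.001, "rs2": 0.01, "rs3": 0.5, "rs4": 0.1}
toy_map = {"rs1": "G1", "rs2": "G1", "rs3": "G2", "rs4": "G3"}
observed_stat = pathway_stat(toy_p, toy_map, {"G1"})
empirical_p = snp_permutation_pvalue(toy_p, toy_map, {"G1"})
```

Because only a list of SNP P-values is shuffled, this kind of scheme needs no genotype data and is cheap compared with permuting case/control labels; the trade-off is that plain shuffling, as written here, ignores linkage disequilibrium and gene size.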
|
215
|
Chatterjee N, Chen YH, Luo S, Carroll RJ. Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes. Stat Sci 2009; 24:489-502. [PMID: 20543902 DOI: 10.1214/09-sts297] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
Abstract
Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article, we review these modern methods and contrast them with the more classical approaches through two types of applications (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data.
Affiliation(s)
- Nilanjan Chatterjee
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, DHHS, Rockville, MD 20852, USA.
|
216
|
|
217
|
Epistasis and its implications for personal genetics. Am J Hum Genet 2009; 85:309-20. [PMID: 19733727 DOI: 10.1016/j.ajhg.2009.08.006] [Citation(s) in RCA: 241] [Impact Index Per Article: 15.1] [Received: 06/18/2009] [Revised: 07/31/2009] [Accepted: 08/10/2009] [Indexed: 12/22/2022] Open
Abstract
The widespread availability of high-throughput genotyping technology has opened the door to the era of personal genetics, which brings to consumers the promise of using genetic variations to predict individual susceptibility to common diseases. Despite easy access to commercial personal genetics services, our knowledge of the genetic architecture of common diseases is still very limited and has not yet fulfilled the promise of accurately predicting most people at risk. This is partly because of the complexity of the mapping relationship between genotype and phenotype that is a consequence of epistasis (gene-gene interaction) and other phenomena such as gene-environment interaction and locus heterogeneity. Unfortunately, these aspects of genetic architecture have not been addressed in most of the genetic association studies that provide the knowledge base for interpreting large-scale genetic association results. We provide here an introductory review of how epistasis can affect human health and disease and how it can be detected in population-based studies. We provide some thoughts on the implications of epistasis for personal genetics and some recommendations for improving personal genetics in light of this complexity.
|