Reference Citation Analysis

For: He H, Oetting WS, Brott MJ, Basu S. Power of multifactor dimensionality reduction and penalized logistic regression for detecting gene-gene interaction in a case-control study. BMC Med Genet 2009;10:127. [PMID: 19961594 PMCID: PMC2800840 DOI: 10.1186/1471-2350-10-127] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 01/15/2009] [Accepted: 12/04/2009] [Indexed: 11/13/2022]

Cited by Other Article(s)

Luyapan J, Ji X, Li S, Xiao X, Zhu D, Duell EJ, Christiani DC, Schabath MB, Arnold SM, Zienolddiny S, Brunnström H, Melander O, Thornquist MD, MacKenzie TA, Amos CI, Gui J. A new efficient method to detect genetic interactions for lung cancer GWAS. BMC Med Genomics 2020;13:162. [PMID: 33126877 PMCID: PMC7596958 DOI: 10.1186/s12920-020-00807-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/16/2019] [Accepted: 10/11/2020] [Indexed: 01/01/2023]

Abstract

BACKGROUND

Genome-wide association studies (GWAS) have proven successful in predicting genetic risk of disease using single-locus models; however, identifying single nucleotide polymorphism (SNP) interactions at the genome-wide scale is limited due to computational and statistical challenges. We addressed the computational burden encountered when detecting SNP interactions for survival analysis, such as age of disease-onset. To confront this problem, we developed a novel algorithm, called the Efficient Survival Multifactor Dimensionality Reduction (ES-MDR) method, which used Martingale Residuals as the outcome parameter to estimate survival outcomes, and implemented the Quantitative Multifactor Dimensionality Reduction method to identify significant interactions associated with age of disease-onset.

METHODS

To demonstrate efficacy, we evaluated this method on two simulation data sets to estimate the type I error rate and power. Simulations showed that ES-MDR identified interactions using less computational workload and allowed for adjustment of covariates. We applied ES-MDR on the OncoArray-TRICL Consortium data with 14,935 cases and 12,787 controls for lung cancer (SNPs = 108,254) to search over all two-way interactions to identify genetic interactions associated with lung cancer age-of-onset. We tested the best model in an independent data set from the OncoArray-TRICL data.

RESULTS

Our experiment on the OncoArray-TRICL data identified many one-way and two-way models with a single-base deletion in the noncoding region of BRCA1 (HR 1.24, P = 3.15 × 10⁻¹⁵) as the top marker to predict age of lung cancer onset.

CONCLUSIONS

From the results of our extensive simulations and analysis of a large GWAS study, we demonstrated that our method is an efficient algorithm that identified genetic interactions to include in our models to predict survival outcomes.
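As a rough illustration of the pooling step this abstract alludes to, the Python sketch below scores one SNP pair in a quantitative-trait MDR style. It assumes that martingale residuals have already been computed by a survival model (not shown), that a genotype cell is labeled high-risk when its mean residual exceeds the overall mean, and that a Welch t-statistic compares the two pooled groups. The function name, the toy data, and the choice of Welch's statistic are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def qmdr_style_score(genotypes, residuals, snp_pair):
    """Score one SNP pair by pooling genotype cells on their mean residual.

    genotypes: list of tuples of genotype codes, one tuple per subject.
    residuals: precomputed residuals (assumed to come from a survival model).
    snp_pair:  indices of the two SNPs to combine.
    """
    overall = mean(residuals)
    cells = defaultdict(list)
    for row, r in zip(genotypes, residuals):
        cells[(row[snp_pair[0]], row[snp_pair[1]])].append(r)

    # Pool cells: above-average mean residual -> high-risk group.
    high, low = [], []
    for values in cells.values():
        (high if mean(values) > overall else low).extend(values)

    # A t-statistic needs at least two observations per group.
    if len(high) < 2 or len(low) < 2:
        return 0.0
    se = sqrt(variance(high) / len(high) + variance(low) / len(low))
    return (mean(high) - mean(low)) / se if se > 0 else float("inf")


# Hypothetical toy data: SNPs 0 and 1 interact, SNP 2 is noise.
genotypes = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1),
             (0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
residuals = [-0.5, 0.6, 0.4, -0.4, -0.6, 0.5, 0.5, -0.5]

score_01 = qmdr_style_score(genotypes, residuals, (0, 1))
score_02 = qmdr_style_score(genotypes, residuals, (0, 2))
print(score_01, score_02)
```

On this toy data the interacting pair scores far higher than the pair that includes the noise SNP, which is the behavior a scan over all two-way combinations would rely on.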

Affiliation(s)

Jennifer Luyapan: Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
Xuemei Ji: Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
Siting Li: Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
Xiangjun Xiao: Institute for Clinical and Translational Research, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
Dakai Zhu: Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA; Institute for Clinical and Translational Research, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
Eric J Duell: Unit of Nutrition and Cancer, Catalan Institute of Oncology (ICO-IDIBELL), 08908, Barcelona, Spain
David C Christiani: Department of Environmental Health, Harvard School of Public Health, Boston, MA, 02115, USA; Department of Medicine, Massachusetts General Hospital, Boston, MA, 02115, USA
Matthew B Schabath: Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
Susanne M Arnold: Markey Cancer Center, University of Kentucky, First Floor, 800 Rose Street, Lexington, KY, 40508, USA
Shanbeh Zienolddiny: National Institute of Occupational Health, 0033 Gydas vei 8, 0033, Oslo, Norway
Hans Brunnström: Laboratory Medicine Region Skåne, Department of Clinical Sciences Lund, Pathology, Lund University, Lund, Sweden
Olle Melander: Department of Clinical Sciences, Lund University, Malmö, Sweden
Mark D Thornquist: Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
Todd A MacKenzie: Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
Christopher I Amos: Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA; Institute for Clinical and Translational Research, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
Jiang Gui: Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA

Tessier F, Fontaine-Bisson B, Lefebvre JF, El-Sohemy A, Roy-Gagnon MH. Investigating Gene-Gene and Gene-Environment Interactions in the Association Between Overnutrition and Obesity-Related Phenotypes. Front Genet 2019;10:151. [PMID: 30886629 PMCID: PMC6409307 DOI: 10.3389/fgene.2019.00151] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/22/2018] [Accepted: 02/12/2019] [Indexed: 01/12/2023]

Yang CH, Lin YD, Yen CY, Chuang LY, Chang HW. A systematic gene-gene and gene-environment interaction analysis of DNA repair genes XRCC1, XRCC2, XRCC3, XRCC4, and oral cancer risk. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2016;19:238-47. [PMID: 25831063 DOI: 10.1089/omi.2014.0121] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]

Abstract

Oral cancer is the sixth most common cancer worldwide with a high mortality rate. Biomarkers that anticipate susceptibility, prognosis, or response to treatments are much needed. Oral cancer is a polygenic disease involving complex interactions among genetic and environmental factors, which require multifaceted analyses. Here, we examined in a dataset of 103 oral cancer cases and 98 controls from Taiwan the association between oral cancer risk and the DNA repair genes X-ray repair cross-complementing group (XRCCs) 1-4, and the environmental factors of smoking, alcohol drinking, and betel quid (BQ) chewing. We employed logistic regression, multifactor dimensionality reduction (MDR), and hierarchical interaction graphs for analyzing gene-gene (G×G) and gene-environment (G×E) interactions. We identified a significantly elevated risk of the XRCC2 rs2040639 heterozygous variant among smokers [adjusted odds ratio (OR) 3.7, 95% confidence interval (CI)=1.1-12.1] and alcohol drinkers [adjusted OR=5.7, 95% CI=1.4-23.2]. The best two-factor based G×G interaction of oral cancer included the XRCC1 rs1799782 and XRCC2 rs2040639 [OR=3.13, 95% CI=1.66-6.13]. For the G×E interaction, the estimated OR of oral cancer for two (drinking-BQ chewing), three (XRCC1-XRCC2-BQ chewing), four (XRCC1-XRCC2-age-BQ chewing), and five factors (XRCC1-XRCC2-age-drinking-BQ chewing) were 32.9 [95% CI=14.1-76.9], 31.0 [95% CI=14.0-64.7], 49.8 [95% CI=21.0-117.7] and 82.9 [95% CI=31.0-221.5], respectively. Taken together, the genotypes of XRCC1 rs1799782 and XRCC2 rs2040639 DNA repair genes appear to be significantly associated with oral cancer. These were enhanced by exposure to certain environmental factors. The observations presented here warrant further research in larger study samples to examine their relevance for routine clinical care in oncology.

Acikel C, Aydin Son Y, Celik C, Gul H. Evaluation of potential novel variations and their interactions related to bipolar disorders: analysis of genome-wide association study data. Neuropsychiatr Dis Treat 2016;12:2997-3004. [PMID: 27920536 PMCID: PMC5127431 DOI: 10.2147/ndt.s112558] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]

Abstract

BACKGROUND

Multifactor dimensionality reduction (MDR) is a nonparametric approach that can be used to detect relevant interactions between single-nucleotide polymorphisms (SNPs). The aim of this study was to build the best genomic model based on SNP associations and to identify candidate polymorphisms that are the underlying molecular basis of the bipolar disorders.

METHODS

This study was performed on Whole-Genome Association Study of Bipolar Disorder (dbGaP [database of Genotypes and Phenotypes] study accession number: phs000017.v3.p1) data. After preprocessing of the genotyping data, three classification-based data mining methods (ie, random forest, naïve Bayes, and k-nearest neighbor) were performed. Additionally, as a nonparametric, model-free approach, the MDR method was used to evaluate the SNP profiles. The validity of these methods was evaluated using true classification rate, recall (sensitivity), precision (positive predictive value), and F-measure.

RESULTS

Random forests, naïve Bayes, and k-nearest neighbors identified 16, 13, and ten candidate SNPs, respectively. Surprisingly, the top six SNPs were reported by all three methods. Random forests and k-nearest neighbors were more successful than naïve Bayes, with recall values >0.95. On the other hand, MDR generated a model with comparable predictive performance based on five SNPs. Although different SNP profiles were identified in MDR compared to the classification-based models, all models mapped SNPs to the DOCK10 gene.

CONCLUSION

Three classification-based data mining approaches, random forests, naïve Bayes, and k-nearest neighbors, have prioritized similar SNP profiles as predictors of bipolar disorders, in contrast to MDR, which has found different SNPs through analysis of two-way and three-way interactions. The reduced number of associated SNPs discovered by MDR, without loss in the classification performance, would facilitate validation studies and decision support models, and would reduce the cost to develop predictive and diagnostic tests. Nevertheless, we need to emphasize that translation of genomic models to the clinical setting requires models with higher classification performance.
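The validation measures named in the Methods paragraph of this abstract can be made concrete with a short Python sketch. It assumes that "true classification rate" means overall accuracy and that the F-measure is the balanced F1 score; the function name and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, recall, precision and F-measure from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0          # true classification rate (assumed)
    recall = tp / (tp + fn) if (tp + fn) else 0.0           # sensitivity
    precision = tp / (tp + fp) if (tp + fp) else 0.0        # positive predictive value
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0  # harmonic mean (F1)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=95, fp=10, fn=5, tn=90)
print(metrics)
```

With these made-up counts the recall is 0.95, which matches the kind of threshold the Results paragraph uses to compare classifiers.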

Huh I, Kwon MS, Park T. An Efficient Stepwise Statistical Test to Identify Multiple Linked Human Genetic Variants Associated with Specific Phenotypic Traits. PLoS One 2015;10:e0138700. [PMID: 26406920 PMCID: PMC4583484 DOI: 10.1371/journal.pone.0138700] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/04/2015] [Accepted: 09/02/2015] [Indexed: 11/19/2022]

Gola D, Mahachie John JM, van Steen K, König IR. A roadmap to multifactor dimensionality reduction methods. Brief Bioinform 2015;17:293-308. [PMID: 26108231 PMCID: PMC4793893 DOI: 10.1093/bib/bbv038] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 03/12/2015] [Indexed: 02/02/2023]

Li P, Guo M, Wang C, Liu X, Zou Q. An overview of SNP interactions in genome-wide association studies. Brief Funct Genomics 2014;14:143-55. [PMID: 25241224 DOI: 10.1093/bfgp/elu036] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.0] [Indexed: 12/28/2022]

Introduction to statistical methods for microRNA analysis. Methods Mol Biol 2014;1107:129-55. [PMID: 24272435 DOI: 10.1007/978-1-62703-748-8_8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/20/2023]

Gui J, Moore JH, Williams SM, Andrews P, Hillege HL, van der Harst P, Navis G, Van Gilst WH, Asselbergs FW, Gilbert-Diamond D. A Simple and Computationally Efficient Approach to Multifactor Dimensionality Reduction Analysis of Gene-Gene Interactions for Quantitative Traits. PLoS One 2013;8:e66545. [PMID: 23805232 PMCID: PMC3689797 DOI: 10.1371/journal.pone.0066545] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 10/26/2012] [Accepted: 05/07/2013] [Indexed: 12/03/2022]

Affiliation(s)

Jiang Gui: Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America; Section of Biostatistics and Epidemiology, Departments of Community and Family Medicine, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
Jason H. Moore: Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America; Section of Biostatistics and Epidemiology, Departments of Community and Family Medicine, Geisel School of Medicine, Lebanon, New Hampshire, United States of America; Department of Genetics, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
Scott M. Williams: Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America; Department of Genetics, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
Peter Andrews: Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America
Hans L. Hillege: Department of Cardiology, University Medical Center Groningen, Groningen, The Netherlands
Pim van der Harst: Department of Cardiology, University Medical Center Groningen, Groningen, The Netherlands
Gerjan Navis: Department of Nephrology, University Medical Center Groningen, Groningen, The Netherlands
Wiek H. Van Gilst: Department of Cardiology, University Medical Center Groningen, Groningen, The Netherlands
Folkert W. Asselbergs: Department of Cardiology, Division of Heart and Lungs, University Medical Center Utrecht, Utrecht, The Netherlands
Diane Gilbert-Diamond: Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Lebanon, New Hampshire, United States of America; Section of Biostatistics and Epidemiology, Departments of Community and Family Medicine, Geisel School of Medicine, Lebanon, New Hampshire, United States of America

Fang YH, Chiu YF. SVM-based generalized multifactor dimensionality reduction approaches for detecting gene-gene interactions in family studies. Genet Epidemiol 2013;36:88-98. [PMID: 22851472 DOI: 10.1002/gepi.21602] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 12/30/2022]

High-order SNP combinations associated with complex diseases: efficient discovery, statistical power and functional interactions. PLoS One 2012;7:e33531. [PMID: 22536319 PMCID: PMC3334940 DOI: 10.1371/journal.pone.0033531] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 12/07/2011] [Accepted: 02/10/2012] [Indexed: 11/19/2022]

Abstract

There has been increased interest in discovering combinations of single-nucleotide polymorphisms (SNPs) that are strongly associated with a phenotype even if each SNP has little individual effect. Efficient approaches have been proposed for searching two-locus combinations from genome-wide datasets. However, for high-order combinations, existing methods either adopt a brute-force search which only handles a small number of SNPs (up to a few hundred), or use heuristic search that may miss informative combinations. In addition, existing approaches lack statistical power because of the use of statistics with high degrees-of-freedom and the huge number of hypotheses tested during combinatorial search. Due to these challenges, functional interactions in high-order combinations have not been systematically explored. We leverage discriminative-pattern-mining algorithms from the data-mining community to search for high-order combinations in case-control datasets. The substantially improved efficiency and scalability demonstrated on synthetic and real datasets with several thousands of SNPs allows the study of several important mathematical and statistical properties of SNP combinations with order as high as eleven. We further explore functional interactions in high-order combinations and reveal a general connection between the increase in discriminative power of a combination over its subsets and the functional coherence among the genes comprising the combination, supported by multiple datasets. Finally, we study several significant high-order combinations discovered from a lung-cancer dataset and a kidney-transplant-rejection dataset in detail to provide novel insights on the complex diseases. Interestingly, many of these associations involve combinations of common variations that occur in small fractions of the population. Thus, our approach is an alternative methodology for exploring the genetics of rare diseases for which the current focus is on individually rare variations.

Shang J, Zhang J, Sun Y, Liu D, Ye D, Yin Y. Performance analysis of novel methods for detecting epistasis. BMC Bioinformatics 2011;12:475. [PMID: 22172045 PMCID: PMC3259123 DOI: 10.1186/1471-2105-12-475] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 04/28/2011] [Accepted: 12/15/2011] [Indexed: 02/03/2023]

Abstract

Background

Epistasis is recognized as fundamentally important for understanding the mechanism of disease-causing genetic variation. Though many novel methods for detecting epistasis have been proposed, few studies focus on their comparison. Undertaking a comprehensive comparison study is an urgent task and a necessary step toward applying the methods in practice.

Results

This paper presents a comparison study of epistasis detection methods by applying the related software packages to datasets. For this purpose, we categorize methods according to their search strategies, and select five representative methods (TEAM, BOOST, SNPRuler, AntEpiSeeker and epiMODE) originating from different underlying techniques for comparison. The methods are tested on simulated datasets of different sizes, with various epistasis models, and with/without noise. The types of noise include missing data, genotyping error and phenocopy. Performance is evaluated by detection power (three forms are introduced), robustness, sensitivity and computational complexity.

Conclusions

None of the selected methods is perfect in all scenarios and each has its own merits and limitations. In terms of detection power, AntEpiSeeker performs best on detecting epistasis displaying marginal effects (eME) and BOOST performs best on identifying epistasis displaying no marginal effects (eNME). In terms of robustness, AntEpiSeeker is robust to all types of noise on eME models, BOOST is robust to genotyping error and phenocopy on eNME models, and SNPRuler is robust to phenocopy on eME models and missing data on eNME models. In terms of sensitivity, AntEpiSeeker is the winner on eME models and both SNPRuler and BOOST perform well on eNME models. In terms of computational complexity, BOOST is the fastest among the methods. In terms of overall performance, AntEpiSeeker and BOOST are recommended as the efficient and effective methods. This comparison study may provide guidelines for applying the methods and further clues for epistasis detection.

Oki NO, Motsinger-Reif AA. Multifactor dimensionality reduction as a filter-based approach for genome wide association studies. Front Genet 2011;2:80. [PMID: 22303374 PMCID: PMC3268633 DOI: 10.3389/fgene.2011.00080] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 06/26/2011] [Accepted: 10/26/2011] [Indexed: 11/13/2022]

Abstract

Advances in genotyping technology and the multitude of genetic data available now provide a vast amount of data that is proving to be useful in the quest for a better understanding of human genetic diseases through the study of genetic variation. This has led to the development of approaches such as genome wide association studies (GWAS) designed specifically for interrogating variants across the genome for association with disease, typically by testing single locus, univariate associations. More recently it has been accepted that epistatic (interaction) effects may also be great contributors to these genetic effects, and GWAS methods are now being applied to find epistatic effects. The challenge for these methods still remains in prioritization and interpretation of results, as it has also become standard for initial findings to be independently investigated in replication cohorts or functional studies. This is motivating the development and implementation of filter-based approaches to prioritize variants found to be significant in a discovery stage for follow-up for replication. Such filters must be able to detect both univariate and interactive effects. In the current study we present and evaluate the use of multifactor dimensionality reduction (MDR) as such a filter, with simulated data and a wide range of effect sizes. Additionally, we compare the performance of the MDR filter to a similar filter approach using logistic regression (LR), the more traditional approach used in GWAS analysis, as well as evaporative cooling (EC), another prominent machine learning filtering method. The results of our simulation study show that MDR is an effective method for such prioritization, and that it can detect main effects, and interactions with or without marginal effects. Importantly, it performed as well as EC and LR for main effect models. It also significantly outperforms LR for various two-locus epistatic models, while its results are equivalent to those of EC for the epistatic models. The results of this study demonstrate the potential of MDR as a filter to detect gene-gene interactions in GWAS studies.

LI FG, WANG ZP, HU G, LI H. Current status of SNPs interaction in genome-wide association study. YI CHUAN = HEREDITAS 2011;33:901-10. [DOI: 10.3724/sp.j.1005.2011.00901] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022]

Pereira TV, Mingroni-Netto RC, Yamada Y. ADRB2 and LEPR gene polymorphisms: synergistic effects on the risk of obesity in Japanese. Obesity (Silver Spring) 2011;19:1523-7. [PMID: 21233812 DOI: 10.1038/oby.2010.322] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/08/2022]

A comparison of multifactor dimensionality reduction and L1-penalized regression to identify gene-gene interactions in genetic association studies. Stat Appl Genet Mol Biol 2011;10:Article 4. [PMID: 21291414 DOI: 10.2202/1544-6115.1613] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]

Gui J, Andrew AS, Andrews P, Nelson HM, Kelsey KT, Karagas MR, Moore JH. A robust multifactor dimensionality reduction method for detecting gene-gene interactions with application to the genetic analysis of bladder cancer susceptibility. Ann Hum Genet 2010;75:20-8. [PMID: 21091664 DOI: 10.1111/j.1469-1809.2010.00624.x] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Indexed: 01/27/2023]

Gui J, Moore JH, Kelsey KT, Marsit CJ, Karagas MR, Andrew AS. A novel survival multifactor dimensionality reduction method for detecting gene-gene interactions with application to bladder cancer prognosis. Hum Genet 2010;129:101-10. [PMID: 20981448 DOI: 10.1007/s00439-010-0905-5] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Received: 07/13/2010] [Accepted: 10/17/2010] [Indexed: 11/30/2022]

Abstract

The widespread use of high-throughput methods of single nucleotide polymorphism (SNP) genotyping has created a number of computational and statistical challenges. The problem of identifying SNP-SNP interactions in case-control studies has been studied extensively, and a number of new techniques have been developed. Little progress has been made, however, in the analysis of SNP-SNP interactions in relation to time-to-event data, such as patient survival time or time to cancer relapse. We present an extension of the two class multifactor dimensionality reduction (MDR) algorithm that enables detection and characterization of epistatic SNP-SNP interactions in the context of survival analysis. The proposed Survival MDR (Surv-MDR) method handles survival data by modifying MDR's constructive induction algorithm to use the log-rank test. Surv-MDR replaces balanced accuracy with log-rank test statistics as the score to determine the best models. We simulated datasets with a survival outcome related to two loci in the absence of any marginal effects. We compared Surv-MDR with Cox-regression for their ability to identify the true predictive loci in these simulated data. We also used this simulation to construct the empirical distribution of Surv-MDR's testing score. We then applied Surv-MDR to genetic data from a population-based epidemiologic study to find prognostic markers of survival time following a bladder cancer diagnosis. We identified several two-loci SNP combinations that have strong associations with patients' survival outcome. Surv-MDR is capable of detecting interaction models with weak main effects. These epistatic models tend to be dropped by traditional Cox regression approaches to evaluating interactions. With improved efficiency to handle genome wide datasets, Surv-MDR will play an important role in a research strategy that embraces the complexity of the genotype-phenotype mapping relationship since epistatic interactions are an important component of the genetic basis of disease.
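For readers unfamiliar with the two-class MDR step that this abstract builds on, the Python sketch below pools the genotype cells of a SNP pair into high-risk and low-risk groups and scores the pooling by balanced accuracy. It is a simplified illustration under stated assumptions: both classes are present, the threshold is the overall case:control ratio, accuracy is computed on the training data only (the cross-validation normally used with MDR is omitted), and the log-rank scoring that Surv-MDR substitutes is not implemented. Function names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def mdr_balanced_accuracy(genotypes, labels, snp_pair):
    """Balanced accuracy of the high/low-risk pooling for one SNP pair.

    genotypes: list of tuples of genotype codes, one tuple per subject.
    labels:    1 for case, 0 for control (both classes assumed present).
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    threshold = n_cases / n_controls  # overall case:control ratio

    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for row, y in zip(genotypes, labels):
        counts[(row[snp_pair[0]], row[snp_pair[1]])][y] += 1

    true_pos = true_neg = 0
    for controls, cases in counts.values():
        if cases > threshold * controls:   # cell pooled as high-risk
            true_pos += cases
        else:                              # cell pooled as low-risk
            true_neg += controls
    return 0.5 * (true_pos / n_cases + true_neg / n_controls)


def best_pair(genotypes, labels):
    """Exhaustively search all two-way SNP combinations."""
    n_snps = len(genotypes[0])
    return max(combinations(range(n_snps), 2),
               key=lambda pair: mdr_balanced_accuracy(genotypes, labels, pair))


# Hypothetical toy data: disease status is the XOR of SNPs 0 and 1,
# so neither SNP has a marginal effect; SNP 2 is noise.
genotypes = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1),
             (0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
labels = [a ^ b for a, b, _ in genotypes]

top_pair = best_pair(genotypes, labels)
print(top_pair, mdr_balanced_accuracy(genotypes, labels, top_pair))
```

The toy example mirrors the simulation setting described in the abstract, where the outcome depends on two loci without marginal effects; a survival variant would replace the balanced-accuracy score with a statistic suited to time-to-event data.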
