1
Yasmeen S, Burger P, Friedrichs S, Papiol S, Bickeböller H. Relating drug response to epigenetic and genetic markers using a region-based kernel score test. BMC Proc 2018; 12:47. [PMID: 30275895 PMCID: PMC6157113 DOI: 10.1186/s12919-018-0154-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
2
Friedrichs S, Manitz J, Burger P, Amos CI, Risch A, Chang-Claude J, Wichmann HE, Kneib T, Bickeböller H, Hofner B. Pathway-Based Kernel Boosting for the Analysis of Genome-Wide Association Studies. Comput Math Methods Med 2017; 2017:6742763. [PMID: 28785300 PMCID: PMC5530424 DOI: 10.1155/2017/6742763] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/10/2017] [Revised: 04/15/2017] [Accepted: 05/10/2017] [Indexed: 01/24/2023]
Abstract
The analysis of genome-wide association studies (GWAS) benefits from the investigation of biologically meaningful gene sets, such as gene-interaction networks (pathways). We propose an extension to a successful kernel-based pathway analysis approach by integrating kernel functions into a powerful algorithmic framework for variable selection, to enable investigation of multiple pathways simultaneously. We employ genetic similarity kernels from the logistic kernel machine test (LKMT) as base-learners in a boosting algorithm. A model to explain case-control status is created iteratively by selecting pathways that improve its prediction ability. We evaluated our method in simulation studies adopting 50 pathways for different sample sizes and genetic effect strengths. Additionally, we included an exemplary application of kernel boosting to a rheumatoid arthritis and a lung cancer dataset. Simulations indicate that kernel boosting outperforms the LKMT in certain genetic scenarios. Applications to GWAS data on rheumatoid arthritis and lung cancer resulted in sparse models which were based on pathways interpretable in a clinical sense. Kernel boosting is highly flexible in terms of considered variables and overcomes the problem of multiple testing. Additionally, it enables the prediction of clinical outcomes. Thus, kernel boosting constitutes a new, powerful tool in the analysis of GWAS data and towards the understanding of biological processes involved in disease susceptibility.
Affiliation(s)
- Stefanie Friedrichs
  - Institute of Genetic Epidemiology, University Medical Centre, Georg-August University Göttingen, Göttingen, Germany
- Juliane Manitz
  - Department of Statistics and Econometrics, Georg-August University Göttingen, Göttingen, Germany
  - Department of Mathematics and Statistics, Boston University, Boston, MA, USA
- Patricia Burger
  - Institute of Genetic Epidemiology, University Medical Centre, Georg-August University Göttingen, Göttingen, Germany
- Christopher I. Amos
  - Department of Community and Family Medicine, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA
- Angela Risch
  - Division of Molecular Biology, University of Salzburg, Salzburg, Austria
  - Translational Lung Research Center Heidelberg (TLRC-H), Member of the German Center for Lung Research (DZL), Heidelberg, Germany
  - Division of Epigenomics and Cancer Risk Factors, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Jenny Chang-Claude
  - Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Heinz-Erich Wichmann
  - Institute of Medical Informatics, Biometry and Epidemiology, Chair of Epidemiology, Ludwig-Maximilians University, Munich, Germany
  - Helmholtz Center Munich, Institute of Epidemiology II, Munich, Germany
  - Institute of Medical Statistics and Epidemiology, Technical University Munich, Munich, Germany
- Thomas Kneib
  - Department of Statistics and Econometrics, Georg-August University Göttingen, Göttingen, Germany
- Heike Bickeböller
  - Institute of Genetic Epidemiology, University Medical Centre, Georg-August University Göttingen, Göttingen, Germany
- Benjamin Hofner
  - Department of Medical Informatics, Biometry and Epidemiology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  - Section Biostatistics, Paul-Ehrlich-Institut, Langen, Germany
3
Rosenberger A, Sohns M, Friedrichs S, Hung RJ, Fehringer G, McLaughlin J, Amos CI, Brennan P, Risch A, Brüske I, Caporaso NE, Landi MT, Christiani DC, Wei Y, Bickeböller H. Gene-set meta-analysis of lung cancer identifies pathway related to systemic lupus erythematosus. PLoS One 2017; 12:e0173339. [PMID: 28273134 PMCID: PMC5342225 DOI: 10.1371/journal.pone.0173339] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 09/30/2016] [Accepted: 02/20/2017] [Indexed: 02/03/2023]
Abstract
INTRODUCTION: Gene-set analysis (GSA) is an approach using the results of single-marker genome-wide association studies when investigating pathways as a whole with respect to the genetic basis of a disease. METHODS: We performed a meta-analysis of seven GSAs for lung cancer, applying the method META-GSA. Overall, the information taken from 11,365 cases and 22,505 controls from within the TRICL/ILCCO consortia was used to investigate a total of 234 pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. RESULTS: META-GSA reveals the systemic lupus erythematosus KEGG pathway hsa05322, driven by the gene region 6p21-22, as also implicated in lung cancer (p = 0.0306). This gene region is known to be associated with squamous cell lung carcinoma. The most important genes driving the significance of this pathway belong to the genomic areas HIST1-H4L, -1BN, -2BN, -H2AK, -H4K and C2/C4A/C4B. Within these areas, the markers most significantly associated with LC are rs13194781 (located within HIST12BN) and rs1270942 (located between C2 and C4A). CONCLUSIONS: We have discovered a pathway currently marked as specific to systemic lupus erythematosus as being significantly implicated in lung cancer. The gene region 6p21-22 in this pathway appears to be more extensively associated with lung cancer than previously assumed. Given wide-stretched linkage disequilibrium to the area APOM/BAG6/MSH5, there is currently simply not enough information or evidence to conclude whether the potential pleiotropy of lung cancer and systemic lupus erythematosus is spurious, biological, or mediated. Further research into this pathway and gene region will be necessary.
Affiliation(s)
- Albert Rosenberger
  - Department of Genetic Epidemiology, University Medical Center, Georg-August-University Göttingen, Göttingen, Germany
- Melanie Sohns
  - Department of Genetic Epidemiology, University Medical Center, Georg-August-University Göttingen, Göttingen, Germany
- Stefanie Friedrichs
  - Department of Genetic Epidemiology, University Medical Center, Georg-August-University Göttingen, Göttingen, Germany
- Rayjean J. Hung
  - Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Canada
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
- Gord Fehringer
  - Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Canada
- Christopher I. Amos
  - Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire, United States of America
- Paul Brennan
  - International Agency for Research on Cancer, Lyon, France
- Angela Risch
  - Division of Molecular Biology, University Salzburg, Salzburg, Austria
- Irene Brüske
  - Institute of Epidemiology I, Helmholtz Center Munich, Munich, Germany
- Neil E. Caporaso
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland, United States of America
- Maria Teresa Landi
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland, United States of America
- David C. Christiani
  - Harvard University School of Public Health, Boston, Massachusetts, United States of America
- Yongyue Wei
  - Harvard University School of Public Health, Boston, Massachusetts, United States of America
- Heike Bickeböller
  - Department of Genetic Epidemiology, University Medical Center, Georg-August-University Göttingen, Göttingen, Germany
4
Friedrichs S, Malzahn D, Pugh EW, Almeida M, Liu XQ, Bailey JN. Filtering genetic variants and placing informative priors based on putative biological function. BMC Genet 2016; 17 Suppl 2:8. [PMID: 26866982 PMCID: PMC4895695 DOI: 10.1186/s12863-015-0313-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
High-density genetic marker data, especially sequence data, imply an immense multiple testing burden. This can be ameliorated by filtering genetic variants, exploiting or accounting for correlations between variants, jointly testing variants, and by incorporating informative priors. Priors can be based on biological knowledge or predicted variant function, or even be used to integrate gene expression or other omics data. Based on Genetic Analysis Workshop (GAW) 19 data, this article discusses diversity and usefulness of functional variant scores provided, for example, by PolyPhen2, SIFT, or RegulomeDB annotations. Incorporating functional scores into variant filters or weights and adjusting the significance level for correlations between variants yielded significant associations with blood pressure traits in a large family study of Mexican Americans (GAW19 data set). Marker rs218966 in gene PHF14 and rs9836027 in MAP4 significantly associated with hypertension; additionally, rare variants in SNUPN significantly associated with systolic blood pressure. Variant weights strongly influenced the power of kernel methods and burden tests. Apart from variant weights in test statistics, prior weights may also be used when combining test statistics or to informatively weight p values while controlling false discovery rate (FDR). Indeed, power improved when gene expression data for FDR-controlled informative weighting of association test p values of genes was used. Finally, approaches exploiting variant correlations included identity-by-descent mapping and the optimal strategy for joint testing rare and common variants, which was observed to depend on linkage disequilibrium structure.
Affiliation(s)
- Stefanie Friedrichs
  - Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Göttingen, Germany.
- Dörthe Malzahn
  - Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Göttingen, Germany.
- Elizabeth W Pugh
  - Center for Inherited Disease Research, Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA.
- Marcio Almeida
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, TX, USA.
- Xiao Qing Liu
  - Department of Obstetrics, Gynecology, and Reproductive Sciences, Department of Biochemistry and Medical Genetics, Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada.
  - Children's Hospital Research Institute of Manitoba, Winnipeg, MB, Canada.
- Julia N Bailey
  - Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles, Los Angeles, CA, USA.
  - Epilepsy Genetics/Genomics Laboratory, West Los Angeles Veterans Administration, Los Angeles, CA, USA.