1
|
Das Adhikari S, Cui Y, Wang J. BayesKAT: bayesian optimal kernel-based test for genetic association studies reveals joint genetic effects in complex diseases. Brief Bioinform 2024; 25:bbae182. [PMID: 38653490 PMCID: PMC11036342 DOI: 10.1093/bib/bbae182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/10/2024] [Accepted: 04/05/2024] [Indexed: 04/25/2024]
Abstract
Genome-wide Association Studies (GWAS) methods have identified individual single-nucleotide polymorphisms (SNPs) significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These genetic variants are marginally less effective and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, choosing different kernel functions in KBT can significantly influence the type I error control and power, and selecting the optimal kernel remains a statistically challenging task. A few existing methods suffer from inflated type I errors, limited scalability, inferior power or issues of ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied to a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.
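The "composite kernel" mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes that a composite kernel is a convex combination of candidate kernels, here a linear kernel and an identity-by-state (IBS) kernel on genotypes coded 0/1/2. The abstract does not say which candidate kernels BayesKAT uses or how it chooses the weights, so the kernel choice, the function names and the hand-picked weights below are all illustrative assumptions, not the authors' implementation.

```python
def linear_kernel(g1, g2):
    """Linear kernel: inner product of two genotype vectors."""
    return sum(a * b for a, b in zip(g1, g2))


def ibs_kernel(g1, g2):
    """IBS kernel for genotypes coded 0/1/2: average shared-allele proportion."""
    return sum(2 - abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))


def composite_kernel_matrix(genotypes, weights):
    """Return K = w_lin * K_lin + w_ibs * K_ibs for all pairs of individuals.

    `weights` is a hypothetical pair of non-negative numbers summing to 1;
    a data-adaptive method would estimate them rather than fix them by hand.
    """
    w_lin, w_ibs = weights
    n = len(genotypes)
    return [
        [
            w_lin * linear_kernel(genotypes[i], genotypes[j])
            + w_ibs * ibs_kernel(genotypes[i], genotypes[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Three individuals genotyped at three SNPs (toy data).
toy_genotypes = [[0, 1, 2], [0, 1, 2], [2, 1, 0]]
K = composite_kernel_matrix(toy_genotypes, weights=(0.5, 0.5))
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the resulting matrix can be used wherever a single kernel matrix would be, which is what makes adaptive weighting possible in principle.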
Affiliation(s)
- Sikta Das Adhikari
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
| | - Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| |
|
2
|
Das Adhikari S, Cui Y, Wang J. BayesKAT: Bayesian Optimal Kernel-based Test for genetic association studies reveals joint genetic effects in complex diseases. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.18.562824. [PMID: 37905124 PMCID: PMC10614916 DOI: 10.1101/2023.10.18.562824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
GWAS methods have identified individual SNPs significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These genetic variants are marginally less effective and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, choosing different kernel functions in KBT can significantly influence the type I error control and power, and selecting the optimal kernel remains a statistically challenging task. A few existing methods suffer from inflated type I errors, limited scalability, inferior power, or issues of ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied to a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules, and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.
|
3
|
Lee E, Ibrahim JG, Zhu H. Bayesian bi-level variable selection for genome-wide survival study. Genomics Inform 2023; 21:e28. [PMID: 37813624 PMCID: PMC10584651 DOI: 10.5808/gi.23047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 06/26/2023] [Accepted: 06/27/2023] [Indexed: 10/11/2023]
Abstract
Mild cognitive impairment (MCI) is a clinical syndrome characterized by the onset and evolution of cognitive impairments, often considered a transitional stage to Alzheimer's disease (AD). The genetic traits of MCI patients who experience a rapid progression to AD can enhance early diagnosis capabilities and facilitate drug discovery for AD. While a genome-wide association study (GWAS) is a standard tool for identifying single nucleotide polymorphisms (SNPs) related to a disease, it fails to detect SNPs with small effect sizes due to stringent control for multiple testing. Additionally, the method does not consider the group structures of SNPs, such as genes or linkage disequilibrium blocks, which can provide valuable insights into the genetic architecture. To address the limitations, we propose a Bayesian bi-level variable selection method that detects SNPs associated with time of conversion from MCI to AD. Our approach integrates group inclusion indicators into an accelerated failure time model to identify important SNP groups. Additionally, we employ data augmentation techniques to impute censored time values using a predictive posterior. We adapt Dirichlet-Laplace shrinkage priors to incorporate the group structure for SNP-level variable selection. In the simulation study, our method outperformed other competing methods regarding variable selection. The analysis of Alzheimer's Disease Neuroimaging Initiative (ADNI) data revealed several genes directly or indirectly related to AD, whereas a classical GWAS did not identify any significant SNPs.
Affiliation(s)
- Eunjee Lee
- Department of Information and Statistics, Chungnam National University, Daejeon 34134, Korea
| | - Joseph G. Ibrahim
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
| | | |
|
4
|
Xu H, Shao Z, Zhang S, Liu X, Zeng P. How can childhood maltreatment affect post-traumatic stress disorder in adult: Results from a composite null hypothesis perspective of mediation analysis. Front Psychiatry 2023; 14:1102811. [PMID: 36970281 PMCID: PMC10033829 DOI: 10.3389/fpsyt.2023.1102811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Accepted: 02/20/2023] [Indexed: 03/11/2023]
Abstract
Background: A rapidly growing body of literature has revealed the mediating role of DNA methylation in the influence path from childhood maltreatment to psychiatric disorders such as post-traumatic stress disorder (PTSD) in adults. However, the statistical method is challenging, and powerful mediation analyses regarding this issue are lacking. Methods: To study how maltreatment in childhood induces long-lasting DNA methylation changes that further affect PTSD in adults, we carried out a gene-based mediation analysis from the perspective of a composite null hypothesis in the Grady Trauma Project (352 participants and 16,565 genes), with childhood maltreatment as the exposure, multiple DNA methylation sites as mediators, and PTSD or its relevant scores as the outcome. We effectively addressed the challenging issue of gene-based mediation analysis by taking its composite null hypothesis testing nature into consideration and fitting a weighted test statistic. Results: We discovered that childhood maltreatment could substantially affect PTSD or PTSD-related scores, and that childhood maltreatment was associated with DNA methylation, which in turn had significant roles in PTSD and these scores. Furthermore, using the proposed mediation method, we identified multiple genes within which DNA methylation sites exhibited mediating roles in the influence path from childhood maltreatment to PTSD-relevant scores in adults, with 13 for the Beck Depression Inventory and 6 for the modified PTSD Symptom Scale, respectively. Conclusion: Our results have the potential to confer meaningful insights into the biological mechanism for the impact of early adverse experience on adult diseases, and our proposed mediation methods can be applied to other similar analysis settings.
Affiliation(s)
- Haibo Xu
- Center for Mental Health Education and Research, Xuzhou Medical University, Xuzhou, China
- School of Management, Xuzhou Medical University, Xuzhou, China
- *Correspondence: Haibo Xu
| | - Zhonghe Shao
- Department of Epidemiology and Biostatistics, Ministry of Education Key Laboratory of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Shuo Zhang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Xin Liu
- Center for Mental Health Education and Research, Xuzhou Medical University, Xuzhou, China
- School of Management, Xuzhou Medical University, Xuzhou, China
| | - Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Ping Zeng,
| |
|
5
|
van Puffelen JH, Novakovic B, van Emst L, Kooper D, Zuiverloon TCM, Oldenhof UTH, Witjes JA, Galesloot TE, Vrieling A, Aben KKH, Kiemeney LALM, Oosterwijk E, Netea MG, Boormans JL, van der Heijden AG, Joosten LAB, Vermeulen SH. Intravesical BCG in patients with non-muscle invasive bladder cancer induces trained immunity and decreases respiratory infections. J Immunother Cancer 2023; 11:jitc-2022-005518. [PMID: 36693678 PMCID: PMC9884868 DOI: 10.1136/jitc-2022-005518] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Accepted: 12/29/2022] [Indexed: 01/26/2023]
Abstract
BACKGROUND BCG is recommended as intravesical immunotherapy to reduce the risk of tumor recurrence in patients with non-muscle invasive bladder cancer (NMIBC). Currently, it is unknown whether intravesical BCG application induces trained immunity. METHODS The aim of this research was to determine whether BCG immunotherapy induces trained immunity in NMIBC patients. We conducted a prospective observational cohort study in 17 NMIBC patients scheduled for BCG therapy and measured trained immunity parameters at 9 time points before and during a 1-year BCG maintenance regimen. Ex vivo cytokine production by peripheral blood mononuclear cells, epigenetic modifications, and changes in the monocyte transcriptome were measured. The frequency of respiratory infections was investigated in two larger cohorts of BCG-treated and non-BCG-treated NMIBC patients as a surrogate measurement of trained immunity. Gene-based association analysis of genetic variants in candidate trained immunity genes and their association with recurrence-free survival and progression-free survival after BCG therapy was performed to investigate the hypothesized link between trained immunity and clinical response. RESULTS We found that intravesical BCG does induce trained immunity, based on an increased production of TNF and IL-1β after heterologous ex vivo stimulation of circulating monocytes 6-12 weeks after intravesical BCG treatment, and a 37% decreased risk (OR 0.63; 95% CI 0.40 to 1.01) for respiratory infections in BCG-treated versus non-BCG-treated NMIBC patients. An epigenomics approach combining chromatin immunoprecipitation sequencing and RNA sequencing with in vitro trained immunity experiments identified enhanced inflammasome activity in BCG-treated individuals. Finally, germline variation in genes that affect trained immunity was associated with recurrence and progression after BCG therapy in NMIBC.
CONCLUSION We conclude that BCG immunotherapy induces trained immunity in NMIBC patients and this may account for the protective effects against respiratory infections. The data of our gene-based association analysis suggest that a link between trained immunity and oncological outcome may exist. Future studies should further investigate how trained immunity affects the antitumor immune responses in BCG-treated NMIBC patients.
Affiliation(s)
- Jelmer H van Puffelen
- Department of Internal Medicine, Radboudumc, Nijmegen, The Netherlands; Department for Health Evidence, Radboudumc, Nijmegen, The Netherlands
| | - Boris Novakovic
- Department of Paediatrics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
| | - Liesbeth van Emst
- Department of Internal Medicine, Radboudumc, Nijmegen, The Netherlands
| | - Denise Kooper
- Department of Urology, Erasmus MC Cancer Centre, Rotterdam, The Netherlands
| | | | | | - J Alfred Witjes
- Department of Urology, Radboudumc, Nijmegen, The Netherlands
| | | | - Alina Vrieling
- Department for Health Evidence, Radboudumc, Nijmegen, The Netherlands
| | - Katja K H Aben
- Department for Health Evidence, Radboudumc, Nijmegen, The Netherlands; IKNL, Utrecht, The Netherlands
| | | | | | - Mihai G Netea
- Department of Internal Medicine, Radboudumc, Nijmegen, The Netherlands; Department of Immunology and Metabolism, University of Bonn, Life & Medical Sciences Institute, Bonn, Germany
| | - Joost L Boormans
- Department of Urology, Erasmus MC Cancer Centre, Rotterdam, The Netherlands
| | | | - Leo A B Joosten
- Department of Internal Medicine, Radboudumc, Nijmegen, The Netherlands; Department of Medical Genetics, Iuliu Hațieganu University of Medicine and Pharmacy, Cluj-Napoca, Romania
| | - Sita H Vermeulen
- Department for Health Evidence, Radboudumc, Nijmegen, The Netherlands
| |
|
6
|
Shao Z, Wang T, Qiao J, Zhang Y, Huang S, Zeng P. A comprehensive comparison of multilocus association methods with summary statistics in genome-wide association studies. BMC Bioinformatics 2022; 23:359. [PMID: 36042399 PMCID: PMC9429742 DOI: 10.1186/s12859-022-04897-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2022] [Accepted: 08/22/2022] [Indexed: 02/07/2023]
Abstract
BACKGROUND Multilocus analysis on a set of single nucleotide polymorphisms (SNPs) pre-assigned within a gene constitutes a valuable complement to single-marker analysis by aggregating data on complex traits in a biologically meaningful way. However, despite the existence of a wide variety of SNP-set methods, few comprehensive comparison studies have been previously performed to evaluate the effectiveness of these methods. RESULTS We herein sought to fill this knowledge gap by conducting a comprehensive empirical comparison of 22 commonly used summary-statistics-based SNP-set methods. We showed that only seven methods could effectively control the type I error, and that these well-calibrated approaches had varying power performance under the simulation scenarios. Overall, we confirmed that the burden test was generally underpowered and that score-based variance component tests (e.g., the sequence kernel association test) were much more powerful under the polygenic genetic architecture in both common and rare variant association analyses. We further revealed that two linkage-disequilibrium-free P value combination methods (i.e., the harmonic mean P value method and the aggregated Cauchy association test) behaved very well under the sparse genetic architecture in simulations and real-data applications to common and rare variant association analyses as well as in expression quantitative trait loci weighted integrative analysis. We also assessed the scalability of these approaches by recording computational time and found that all these methods can scale to biobank-scale data, although some might be relatively slow. CONCLUSION In conclusion, we hope that our findings can offer important guidance on how to choose appropriate multilocus association analysis methods in the post-GWAS era. All the SNP-set methods are implemented in the R package called MCA, which is freely available at https://github.com/biostatpzeng/ .
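The two linkage-disequilibrium-free P value combination rules named in the abstract above have simple closed forms. The following is a minimal Python sketch based on their standard published definitions, with equal weights by default. The function names `acat` and `hmp` are illustrative and are not taken from the MCA package, the inputs are assumed to lie strictly between 0 and 1, and `hmp` returns only the raw weighted harmonic mean, without the calibration step needed to treat it as an exact P value.

```python
import math


def acat(pvalues, weights=None):
    """Aggregated Cauchy association test (Cauchy combination of P values)."""
    if weights is None:
        weights = [1.0] * len(pvalues)
    total = sum(weights)
    # Map each P value to a Cauchy variate and take the weighted average;
    # under the null this average is approximately standard Cauchy in the tail.
    stat = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues)
    ) / total
    # Convert the statistic back to a P value with the Cauchy survival function.
    return 0.5 - math.atan(stat) / math.pi


def hmp(pvalues, weights=None):
    """Raw weighted harmonic mean of P values (uncalibrated)."""
    if weights is None:
        weights = [1.0] * len(pvalues)
    return sum(weights) / sum(w / p for w, p in zip(weights, pvalues))


# Toy SNP-level P values for one gene: one strong signal among null SNPs.
snp_pvalues = [0.001, 0.5, 0.9]
gene_p_acat = acat(snp_pvalues)
gene_p_hmp = hmp(snp_pvalues)
```

Both rules are dominated by the smallest P values in the set, which is consistent with the abstract's observation that they perform well when only a few variants carry signal.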
Affiliation(s)
- Zhonghe Shao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Jiahao Qiao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Yuchen Zhang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Shuiping Huang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
| |
|
7
|
Lu H, Wei Y, Jiang Z, Zhang J, Wang T, Huang S, Zeng P. Integrative eQTL-weighted hierarchical Cox models for SNP-set based time-to-event association studies. J Transl Med 2021; 19:418. [PMID: 34627275 PMCID: PMC8502405 DOI: 10.1186/s12967-021-03090-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/09/2021] [Accepted: 09/26/2021] [Indexed: 01/23/2023]
Abstract
BACKGROUND Integrating functional annotations into SNP-set association studies has proven to be a powerful analysis strategy. Statistical methods for such integration have been developed for continuous and binary phenotypes; however, SNP-set integrative approaches for time-to-event or survival outcomes are lacking. METHODS We here propose IEHC, an integrative eQTL (expression quantitative trait loci) hierarchical Cox regression, for SNP-set based survival association analysis by modeling effect sizes of genetic variants as a function of eQTL in a hierarchical manner. Three p-value combination tests are developed to examine the joint effects of eQTL and genetic variants after a novel decorrelated modification of statistics for the two components. An omnibus test (IEHC-ACAT) is further adapted to aggregate the strengths of all available tests. RESULTS Simulations demonstrated that the IEHC joint tests were more powerful if both eQTL and genetic variants contributed to the association signal, while IEHC-ACAT was robust and often outperformed other approaches across various simulation scenarios. When applying IEHC to ten TCGA cancers by incorporating eQTL from relevant tissues of GTEx, we revealed that substantial correlations existed between the two types of effect sizes of genetic variants from TCGA and GTEx, and identified 21 (9 unique) cancer-associated genes which would otherwise be missed by approaches not incorporating eQTL. CONCLUSION IEHC represents a flexible, robust, and powerful approach to integrate functional omics information to enhance the power of identifying association signals for the survival risk of complex human cancers.
Affiliation(s)
- Haojie Lu
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Yongyue Wei
- Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, 211166, Jiangsu, China
| | - Zhou Jiang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Jinhui Zhang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Shuiping Huang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
| |
|
8
|
Shao Z, Wang T, Zhang M, Jiang Z, Huang S, Zeng P. IUSMMT: Survival mediation analysis of gene expression with multiple DNA methylation exposures and its application to cancers of TCGA. PLoS Comput Biol 2021; 17:e1009250. [PMID: 34464378 PMCID: PMC8437300 DOI: 10.1371/journal.pcbi.1009250] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/13/2021] [Revised: 09/13/2021] [Accepted: 07/06/2021] [Indexed: 02/07/2023]
Abstract
Effective and powerful survival mediation models are currently lacking. To partly fill this knowledge gap, we focus on mediation analysis that includes multiple DNA methylations acting as exposures, one gene expression as the mediator and one survival time as the outcome. We proposed IUSMMT (intersection-union survival mixture-adjusted mediation test) to effectively examine the existence of a mediation effect by fitting an empirical three-component mixture null distribution. With extensive simulation studies, we demonstrated the advantage of IUSMMT over existing methods. We applied IUSMMT to ten TCGA cancers and identified multiple genes that exhibited mediating effects. We further revealed that most of the identified regions, in which genes behaved as active mediators, were cancer type-specific and exhibited a full mediation from DNA methylation CpG sites to the survival risk of various types of cancers. Overall, IUSMMT represents an effective and powerful alternative for survival mediation analysis; our results also provide new insights into the functional role of DNA methylation and gene expression in cancer progression/prognosis and demonstrate potential therapeutic targets for future clinical practice.
Affiliation(s)
- Zhonghe Shao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Meng Zhang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Zhou Jiang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Shuiping Huang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, Jiangsu, China
| | - Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, Jiangsu, China
| |
|
9
|
Lakhal-Chaieb L, Simard J, Bull S. Sequence kernel association test for survival outcomes in the presence of a non-susceptible fraction. Biostatistics 2021; 21:518-530. [PMID: 30590388 DOI: 10.1093/biostatistics/kxy075] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/11/2017] [Revised: 10/23/2018] [Accepted: 10/25/2018] [Indexed: 11/13/2022]
Abstract
In this work, we propose a single nucleotide polymorphism set association test for survival phenotypes in the presence of a non-susceptible fraction. We consider a mixture model with a logistic regression for the susceptibility indicator and a proportional hazards regression to model survival in the susceptible group. We propose a joint test to assess the significance of the genetic variant in both logistic and survival regressions simultaneously. We adopt the spirit of SKAT and conduct a variance-component test treating the genetic effects of multiple variants as random. We derive score-type test statistics, and we investigate several approaches to compute their $p$-values. The finite-sample properties of the proposed tests are assessed and compared to existing approaches by simulations and their use is illustrated through an application to ovarian cancer data from the Consortium of Investigators of Modifiers of BRCA1 and BRCA2.
Affiliation(s)
- Lajmi Lakhal-Chaieb
- Département de mathématiques et de statistique, Université Laval, 1045 de la médecine, Québec G1V 0A6, Canada
| | - Jacques Simard
- Département de médecine moléculaire, Chaire de recherche du Canada en oncogénétique, Université Laval, Québec G1V 0A6, Canada
| | - Shelley Bull
- Dalla Lana School of Public Health, University of Toronto, 6th floor, Health Sciences Building, 155 College Street, Toronto, Ontario M5T 3M7, Canada; The Lunenfeld-Tanenbaum Research Institute, Sinai Health System, 60 Murray Street, Toronto, Ontario M5T 3L9, Canada
| |
|
10
|
Zhang L, Kim I. Finite mixtures of semiparametric Bayesian survival kernel machine regressions: Application to breast cancer gene pathway subgroup analysis. J R Stat Soc Ser C Appl Stat 2020. [DOI: 10.1111/rssc.12457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Lin Zhang
- Department of Statistics, Virginia Tech, Blacksburg, VA, USA
| | - Inyoung Kim
- Department of Statistics, Virginia Tech, Blacksburg, VA, USA
| |
|
11
|
Bi W, Fritsche LG, Mukherjee B, Kim S, Lee S. A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank. Am J Hum Genet 2020; 107:222-233. [PMID: 32589924 DOI: 10.1016/j.ajhg.2020.06.003] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 02/07/2020] [Accepted: 06/03/2020] [Indexed: 12/09/2022]
Abstract
With increasing biobanking efforts connecting electronic health records and national registries to germline genetics, the time-to-event data analysis has attracted increasing attention in the genetics studies of human diseases. In time-to-event data analysis, the Cox proportional hazards (PH) regression model is one of the most used approaches. However, existing methods and tools are not scalable when analyzing a large biobank with hundreds of thousands of samples and endpoints, and they are not accurate when testing low-frequency and rare variants. Here, we propose a scalable and accurate method, SPACox (a saddlepoint approximation implementation based on the Cox PH regression model), that is applicable for genome-wide scale time-to-event data analysis. SPACox requires fitting a Cox PH regression model only once across the genome-wide analysis and then uses a saddlepoint approximation (SPA) to calibrate the test statistics. Simulation studies show that SPACox is 76-252 times faster than other existing alternatives, such as gwasurvivr, 185-511 times faster than the standard Wald test, and more than 6,000 times faster than the Firth correction and can control type I error rates at the genome-wide significance level regardless of minor allele frequencies. Through the analysis of UK Biobank inpatient data of 282,871 white British European ancestry samples, we show that SPACox can efficiently analyze large sample sizes and accurately control type I error rates. We identified 611 loci associated with time-to-event phenotypes of 12 common diseases, of which 38 loci would be missed within a logistic regression framework with a binary phenotype defined as event occurrence status during the follow-up period.
|
12
|
Zhao N, Zhang H, Clark JJ, Maity A, Wu MC. Composite kernel machine regression based on likelihood ratio test for joint testing of genetic and gene–environment interaction effect. Biometrics 2019; 75:625-637. [DOI: 10.1111/biom.13003] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/08/2018] [Accepted: 10/09/2018] [Indexed: 12/17/2022]
Affiliation(s)
- Ni Zhao
- Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland
| | - Haoyu Zhang
- Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland
| | - Jennifer J. Clark
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Arnab Maity
- Department of Statistics, North Carolina State University, Raleigh, North Carolina
| | - Michael C. Wu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
|
13
|
Marceau West R, Lu W, Rotroff DM, Kuenemann MA, Chang SM, Wu MC, Wagner MJ, Buse JB, Motsinger-Reif AA, Fourches D, Tzeng JY. Identifying individual risk rare variants using protein structure guided local tests (POINT). PLoS Comput Biol 2019; 15:e1006722. [PMID: 30779729 PMCID: PMC6396946 DOI: 10.1371/journal.pcbi.1006722] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/28/2018] [Revised: 03/01/2019] [Accepted: 12/17/2018] [Indexed: 01/08/2023] Open
Abstract
Rare variants are of increasing interest to genetic association studies because of their etiological contributions to human complex diseases. Due to the rarity of the mutant events, rare variants are routinely analyzed on an aggregate level. While aggregation analyses improve the detection of global-level signal, they are not able to pinpoint causal variants within a variant set. To perform inference on a localized level, additional information, e.g., biological annotation, is often needed to boost the information content of a rare variant. Following the observation that important variants are likely to cluster together on functional domains, we propose a protein structure guided local test (POINT) to provide variant-specific association information using structure-guided aggregation of signal. Constructed under a kernel machine framework, POINT performs local association testing by borrowing information from neighboring variants in the 3-dimensional protein space in a data-adaptive fashion. Besides merely providing a list of promising variants, POINT assigns each variant a p-value to permit variant ranking and prioritization. We assess the selection performance of POINT using simulations and illustrate how it can be used to prioritize individual rare variants in PCSK9, ANGPTL4 and CETP in the Action to Control Cardiovascular Risk in Diabetes (ACCORD) clinical trial data.
Affiliation(s)
- Rachel Marceau West
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Wenbin Lu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Daniel M. Rotroff
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Melaine A. Kuenemann
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
- Sheng-Mao Chang
  - Department of Statistics, National Cheng-Kung University, Tainan, Taiwan
- Michael C. Wu
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Michael J. Wagner
  - Center for Pharmacogenomics and Individualized Therapy, University of North Carolina, Chapel Hill, North Carolina, United States of America
- John B. Buse
  - Department of Medicine, University of North Carolina School of Medicine, Chapel Hill, North Carolina, United States of America
- Alison A. Motsinger-Reif
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
- Denis Fourches
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina, United States of America
- Jung-Ying Tzeng
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
  - Department of Statistics, National Cheng-Kung University, Tainan, Taiwan
  - Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
|
14
|
Qi W, Allen AS, Li YJ. Family-based association tests for rare variants with censored traits. PLoS One 2019; 14:e0210870. [PMID: 30682063 PMCID: PMC6347269 DOI: 10.1371/journal.pone.0210870] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/30/2018] [Accepted: 12/27/2018] [Indexed: 11/30/2022] Open
Abstract
We propose a set of family-based burden and kernel tests for censored traits (FamBAC and FamKAC). Here, censored traits refer to time-to-event outcomes, for instance, age-at-onset of a disease. To model censored traits in family-based designs, we used the frailty model, which incorporated not only fixed genetic effects of rare variants in a region of interest but also random polygenic effects shared within families. We first partitioned genotype scores of rare variants into orthogonal between- and within-family components, and then derived their corresponding efficient score statistics from the frailty model. Finally, FamBAC and FamKAC were constructed by aggregating the weighted efficient scores of the within-family components across rare variants and subjects. FamBAC collapsed rare variants within subject first to form a burden test that followed a chi-squared distribution, whereas FamKAC was a variance component test following a mixture of chi-squared distributions. For FamKAC, p-values can be computed by permutation tests or, for computational efficiency, by approximation methods. Through simulation studies, we showed that type I error was correctly controlled by FamBAC for various variant weighting schemes (0.0371 to 0.0527). However, FamKAC type I error rates based on approximation methods were deflated (max 0.0376) but improved by permutation tests. Our simulations also demonstrated that the burden test FamBAC had higher power than the kernel test FamKAC when a high proportion (e.g. ≥ 80%) of causal variants had effects in the same direction. In contrast, when the effects of causal variants on the censored trait were in mixed directions, FamKAC outperformed FamBAC and had comparable or higher power than an existing method, RVFam. Our proposed framework has the flexibility to accommodate general nuclear families, and can be used to analyze sequence data for censored traits such as age-at-onset of a complex disease of interest.
Affiliation(s)
- Wenjing Qi
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States of America
  - Duke Molecular Physiology Institute, Duke University, Durham, NC, United States of America
- Andrew S. Allen
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States of America
  - Center for Statistical Genetics and Genomics, Duke University, Durham, NC, United States of America
- Yi-Ju Li
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States of America
  - Duke Molecular Physiology Institute, Duke University, Durham, NC, United States of America
|
15
|
Larson NB, Chen J, Schaid DJ. A review of kernel methods for genetic association studies. Genet Epidemiol 2019; 43:122-136. [PMID: 30604442 DOI: 10.1002/gepi.22180] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 08/27/2018] [Revised: 11/09/2018] [Accepted: 11/26/2018] [Indexed: 12/17/2022]
Abstract
Evaluating the association of multiple genetic variants with a trait of interest by use of kernel-based methods has made a significant impact on how genetic association analyses are conducted. An advantage of kernel methods is that they tend to be robust when the genetic variants have effects that are a mixture of positive and negative effects, as well as when there is a small fraction of causal variants. Another advantage is that kernel methods fit within the framework of mixed models, providing flexible ways to adjust for additional covariates that influence traits. Herein, we review the basic ideas behind the use of kernel methods for genetic association analysis as well as recent methodological advancements for different types of traits, multivariate traits, pedigree data, and longitudinal data. Finally, we discuss opportunities for future research.
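The robustness to mixed effect directions mentioned in this abstract can be illustrated with a minimal Python sketch. It compares a burden-style statistic with a linear-kernel statistic on toy data in which two variants act in opposite directions. The function names, the toy genotypes and the permutation p-value are illustrative assumptions; they are not taken from the review or from any published package.

```python
import random


def center(values):
    """Subtract the mean so the trait acts as a residual under the null."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def variant_scores(genotypes, trait):
    """Per-variant score S_j = sum_i G_ij * (y_i - mean(y))."""
    resid = center(trait)
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * r for row, r in zip(genotypes, resid))
        for j in range(n_variants)
    ]


def kernel_statistic(genotypes, trait):
    """Linear-kernel statistic Q = r' G G' r = sum_j S_j^2 (signs cannot cancel)."""
    return sum(s * s for s in variant_scores(genotypes, trait))


def burden_statistic(genotypes, trait):
    """Burden-style statistic (sum_j S_j)^2 (opposite signs can cancel)."""
    return sum(variant_scores(genotypes, trait)) ** 2


def permutation_p_value(stat_fn, genotypes, trait, n_perm=999, seed=0):
    """Permutation p-value: shuffle the trait and recompute the statistic."""
    rng = random.Random(seed)
    observed = stat_fn(genotypes, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat_fn(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): variant 1 raises the trait, variant 2 lowers it.
toy_genotypes = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [0, 0]]
toy_trait = [2.0, 2.0, -2.0, -2.0, 0.0, 0.0]

print("kernel Q:", kernel_statistic(toy_genotypes, toy_trait))
print("burden  :", burden_statistic(toy_genotypes, toy_trait))
print("kernel p:", permutation_p_value(kernel_statistic, toy_genotypes, toy_trait))
```

On this toy data the per-variant scores are +4 and -4, so the burden-style statistic cancels to 0 while the kernel statistic equals 32. Real analyses would adjust for covariates and use weighted kernels, which this sketch omits.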
Affiliation(s)
- Nicholas B Larson
  - Department of Health Sciences Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
- Jun Chen
  - Department of Health Sciences Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
- Daniel J Schaid
  - Department of Health Sciences Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
|
16
|
An adaptive microbiome α-diversity-based association analysis method. Sci Rep 2018; 8:18026. [PMID: 30575793 PMCID: PMC6303306 DOI: 10.1038/s41598-018-36355-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 08/11/2018] [Accepted: 11/19/2018] [Indexed: 12/12/2022] Open
Abstract
To relate microbial diversity with various host traits of interest (e.g., phenotypes, clinical interventions, environmental factors) is a critical step for generic assessments about the disparity in human microbiota among different populations. The performance of the current item-by-item α-diversity-based association tests is sensitive to the choice of α-diversity metric and unpredictable due to the unknown nature of the true association. The approach of cherry-picking a test for the smallest p-value or the largest effect size among multiple item-by-item analyses is not even statistically valid due to the inherent multiplicity issue. Investigators have recently introduced microbial community-level association tests while blustering statistical power increase of their proposed methods. However, they are purely a test for significance which does not provide any estimation facilities on the effect direction and size of a microbial community; hence, they are not in practical use. Here, I introduce a novel microbial diversity association test, namely, adaptive microbiome α-diversity-based association analysis (aMiAD). aMiAD simultaneously tests the significance and estimates the effect score of the microbial diversity on a host trait, while robustly maintaining high statistical power and accurate estimation with no issues in validity.
|
17
|
He T, Li S, Zhong PS, Cui Y. An optimal kernel-based U-statistic method for quantitative gene-set association analysis. Genet Epidemiol 2018; 43:137-149. [DOI: 10.1002/gepi.22170] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/04/2018] [Revised: 08/19/2018] [Accepted: 09/26/2018] [Indexed: 11/09/2022]
Affiliation(s)
- Tao He
  - Department of Mathematics, San Francisco State University, San Francisco, California
- Shaoyu Li
  - Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, North Carolina
- Ping-Shou Zhong
  - Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, Illinois
- Yuehua Cui
  - Department of Statistics & Probability, Michigan State University, East Lansing, Michigan
  - School of Public Health, Zhengzhou University, Zhengzhou, China
|
18
|
Goodman MO, Chibnik L, Cai T. Variance components genetic association test for zero-inflated count outcomes. Genet Epidemiol 2018; 43:82-101. [PMID: 30353568 DOI: 10.1002/gepi.22162] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/15/2018] [Revised: 05/15/2018] [Accepted: 06/12/2018] [Indexed: 01/27/2023]
Abstract
Commonly in biomedical research, studies collect data in which an outcome measure contains informative excess zeros; for example, when observing the burden of neuritic plaques (NPs) in brain pathology studies, those who show none contribute to our understanding of neurodegenerative disease. The outcome may be characterized by a mixture distribution with one component being the "structural zero" and the other component being a Poisson distribution. We propose a novel variance components score test of genetic association between a set of genetic markers and a zero-inflated count outcome from a mixture distribution. This test shares advantageous properties with single-nucleotide polymorphism (SNP)-set tests which have been previously devised for standard continuous or binary outcomes, such as the sequence kernel association test. In particular, our method has superior statistical power compared to competing methods, especially when there is correlation within the group of markers, and when the SNPs are associated with both the mixing proportion and the rate of the Poisson distribution. We apply the method to Alzheimer's data from the Rush University Religious Orders Study and Memory and Aging Project, where as proof of principle we find highly significant associations with the APOE gene, in both the "structural zero" and "count" parameters, when applied to a zero-inflated NPs count outcome.
Affiliation(s)
- Matthew O Goodman
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Lori Chibnik
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Tianxi Cai
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
|
19
|
Zhang L, Kim I. Semiparametric Bayesian kernel survival model for evaluating pathway effects. Stat Methods Med Res 2018; 28:3301-3317. [PMID: 30289021 DOI: 10.1177/0962280218797360] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/07/2023]
Abstract
Massive amounts of high-dimensional data have been accumulated over the past two decades, which has cultured increasing interests in identifying gene pathways related to certain biological processes. In particular, since pathway-based analysis has the ability to detect subtle changes of differentially expressed genes that could be missed when using gene-based analysis, detecting the gene pathways that regulate certain diseases can provide new strategies for medical procedures and new targets for drug discovery. Limited work has been carried out, primarily in regression settings, to study the effects of pathways on survival outcomes. Motivated by a breast cancer gene-pathway data set, which exhibits the "small n, large p" characteristics, we propose a semiparametric Bayesian kernel survival model (s-BKSurv) to study the effects of both clinical covariates and gene expression levels within a pathway on survival time. We model the unknown high-dimensional functions of pathways via Gaussian kernel machine to consider the possibility that genes within the same pathway interact with each other. To address the multiple comparisons problem under a full Bayesian setting, we propose a similarity-dependent procedure based on Bayes factor to control the family-wise error rate. We demonstrate the outperformance of our approach under various simulation settings and pathways data.
Affiliation(s)
- Lin Zhang
  - Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Inyoung Kim
  - Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
|
20
|
Fang YH, Wang JH, Hsiung CA. TSGSIS: a high-dimensional grouped variable selection approach for detection of whole-genome SNP-SNP interactions. Bioinformatics 2018. [PMID: 28651334 DOI: 10.1093/bioinformatics/btx409] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
Motivation: Identification of single nucleotide polymorphism (SNP) interactions is an important and challenging topic in genome-wide association studies (GWAS). Many approaches have been applied to detecting whole-genome interactions. However, these approaches to interaction analysis tend to miss causal interaction effects when the individual marginal effects are uncorrelated with the trait while their interaction effects are highly associated with the trait. Results: A grouped variable selection technique, called two-stage grouped sure independence screening (TS-GSIS), is developed to study interactions that may not have marginal effects. The proposed TS-GSIS is shown to be very helpful in identifying not only causal SNP effects that are uncorrelated with the trait but also their corresponding SNP-SNP interaction effects. The benefits of TS-GSIS are gains in detecting interaction effects by using the joint information among the SNPs and by determining the size of candidate sets in the model. Simulation studies under various scenarios are performed to compare the performance of TS-GSIS and current approaches. We also apply our approach to a real rheumatoid arthritis (RA) dataset. Both the simulation and real data studies show that TS-GSIS performs very well in detecting SNP-SNP interactions. Availability and implementation: The R package is delivered through CRAN and is available at: https://cran.r-project.org/web/packages/TSGSIS/index.html. Contact: hsiung@nhri.org.tw. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yao-Hwei Fang
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
- Jie-Huei Wang
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
- Chao A Hsiung
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
|
21
|
Wang K. Conditional asymptotic inference for the kernel association test. Bioinformatics 2018; 33:3733-3739. [PMID: 28961861 DOI: 10.1093/bioinformatics/btx511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/07/2017] [Accepted: 08/08/2017] [Indexed: 11/14/2022] Open
Abstract
Motivation: The kernel association test (KAT) is popular in biological studies for its ability to combine weak effects potentially of opposite direction. Its P-value is typically assessed via its (unconditional) asymptotic distribution. However, such an asymptotic distribution is known only for continuous traits and for dichotomous traits. Furthermore, the derived P-values are known to be conservative when sample size is small, especially for the important case of dichotomous traits. One alternative is the permutation test, a widely accepted approximation to the exact finite sample conditional inference. But it is time-consuming to use in practice due to stringent significance criteria commonly seen in these analyses. Results: Based on a previous theoretical result, a conditional asymptotic distribution for the KAT is introduced. This distribution provides an alternative approximation to the exact distribution of the KAT. An explicit expression of this distribution is provided from which P-values can be easily computed. This method applies to any type of traits. The usefulness of this approach is demonstrated via extensive simulation studies using real genotype data and an analysis of genetic data from the Ocular Hypertension Treatment Study. Numerical results showed that the new method can control the type I error rate and is a bit conservative when compared to the permutation method. Nevertheless, the proposed method may be used as a fast screening method. A time-consuming permutation procedure may be conducted at locations that show signals of association. Availability and implementation: An implementation of the proposed method is provided in the R package iGasso. Contact: kai-wang@uiowa.edu.
Affiliation(s)
- Kai Wang
  - Department of Biostatistics, University of Iowa, Iowa City, IA 52246, USA
|
22
|
Agniel D, Hejblum BP. Variance component score test for time-course gene set analysis of longitudinal RNA-seq data. Biostatistics 2018; 18:589-604. [PMID: 28334305 DOI: 10.1093/biostatistics/kxx005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/18/2016] [Accepted: 01/04/2017] [Indexed: 01/28/2023] Open
Abstract
As gene expression measurement technology is shifting from microarrays to sequencing, the statistical tools available for their analysis must be adapted since RNA-seq data are measured as counts. It has been proposed to model RNA-seq counts as continuous variables using nonparametric regression to account for their inherent heteroscedasticity. In this vein, we propose tcgsaseq, a principled, model-free, and efficient method for detecting longitudinal changes in RNA-seq gene sets defined a priori. The method identifies those gene sets whose expression varies over time, based on an original variance component score test accounting for both covariates and heteroscedasticity without assuming any specific parametric distribution for the (transformed) counts. We demonstrate that despite the presence of a nonparametric component, our test statistic has a simple form and limiting distribution, and both may be computed quickly. A permutation version of the test is additionally proposed for very small sample sizes. Applied to both simulated data and two real datasets, tcgsaseq is shown to exhibit very good statistical properties, with an increase in stability and power when compared to state-of-the-art methods ROAST (rotation gene set testing), edgeR, and DESeq2, which can fail to control the type I error under certain realistic settings. We have made the method available for the community in the R package tcgsaseq.
Affiliation(s)
- Denis Agniel
  - Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck St, Boston, MA 02115, USA
- Boris P Hejblum
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - University of Bordeaux, ISPED, INSERM U1219, INRIA SISTM, 146 rue Léo Saignat, 33076 Bordeaux, France
  - Vaccine Research Institute, Créteil, France
|
23
|
Koh H, Livanos AE, Blaser MJ, Li H. A highly adaptive microbiome-based association test for survival traits. BMC Genomics 2018; 19:210. [PMID: 29558893 PMCID: PMC5859547 DOI: 10.1186/s12864-018-4599-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 08/09/2017] [Accepted: 03/13/2018] [Indexed: 01/15/2023] Open
Abstract
BACKGROUND: There has been increasing interest in discovering microbial taxa that are associated with human health or disease, gathering momentum through the advances in next-generation sequencing technologies. Investigators have also increasingly employed prospective study designs to survey survival (i.e., time-to-event) outcomes, but current item-by-item statistical methods have limitations due to the unknown true association pattern. Here, we propose a new adaptive microbiome-based association test for survival outcomes, namely, optimal microbiome-based survival analysis (OMiSA). OMiSA approximates to the most powerful association test in two domains: 1) microbiome-based survival analysis using linear and non-linear bases of OTUs (MiSALN) which weighs rare, mid-abundant, and abundant OTUs, respectively, and 2) microbiome regression-based kernel association test for survival traits (MiRKAT-S) which incorporates different distance metrics (e.g., unique fraction (UniFrac) distance and Bray-Curtis dissimilarity), respectively. RESULTS: We illustrate that OMiSA powerfully discovers microbial taxa whether their underlying associated lineages are rare or abundant and phylogenetically related or not. OMiSA is a semi-parametric method based on a variance-component score test and a re-sampling method; hence, it is free from any distributional assumption on the effect of microbial composition and advantageous to robustly control type I error rates. Our extensive simulations demonstrate the highly robust performance of OMiSA. We also present the use of OMiSA with real data applications. CONCLUSIONS: OMiSA is attractive in practice as the true association pattern is unpredictable in advance and, for survival outcomes, no adaptive microbiome-based association test is currently available.
Affiliation(s)
- Hyunwook Koh
  - Department of Population Health, New York University School of Medicine, 650 First Avenue, Room 547, New York, NY, 10016, USA
- Alexandra E Livanos
  - Department of Medicine, Columbia University Medical Center, New York, NY, 10032, USA
- Martin J Blaser
  - Departments of Medicine and Microbiology, New York University School of Medicine, New York, NY, 10016, USA
  - Medical Service, New York Harbor Department of Veterans Affairs Medical Center, New York, NY, 10010, USA
- Huilin Li
  - Department of Population Health, New York University School of Medicine, 650 First Avenue, Room 547, New York, NY, 10016, USA
|
24
|
Jacobs DI, Liu Y, Gabrusiewicz K, Tsavachidis S, Armstrong GN, Zhou R, Wei J, Ivan C, Calin G, Molinaro AM, Rice T, Bracci PM, Hansen HM, Wiencke JK, Wrensch MR, Heimberger AB, Bondy ML. Germline polymorphisms in myeloid-associated genes are not associated with survival in glioma patients. J Neurooncol 2018; 136:33-39. [PMID: 28965162 PMCID: PMC5756111 DOI: 10.1007/s11060-017-2622-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/06/2017] [Accepted: 09/08/2017] [Indexed: 01/07/2023]
Abstract
Immune cells of myeloid origin, including microglia, macrophages, and myeloid-derived suppressor cells, adopt immunosuppressive phenotypes that support gliomagenesis. Here, we tested an a priori hypothesis that single nucleotide polymorphisms (SNPs) in genes related to glioma-associated myeloid cell regulation and function are also associated with patient survival after glioma diagnosis. Subjects for this study were 992 glioma patients treated at The University of Texas MD Anderson Cancer Center in Houston, Texas between 1992 and 2008. Haplotype-tagging SNPs in 91 myeloid-associated genes were analyzed for association with survival by Cox regression. Individual SNP- and gene-based tests were performed separately in glioblastoma (WHO grade IV, n = 511) and lower-grade glioma (WHO grade II-III, n = 481) groups. After adjustment for multiple testing, no myeloid-associated gene variants were significantly associated with survival in glioblastoma. Two SNPs, rs147960238 in CD163 (p = 2.2 × 10⁻⁵) and rs17138945 in MET (p = 5.6 × 10⁻⁵), were significantly associated with survival of patients with lower-grade glioma. However, these associations were not confirmed in an independent analysis of 563 lower-grade glioma cases from the University of California at San Francisco Adult Glioma Study (p = 0.65 and p = 0.41, respectively). The results of this study do not support a role for inherited polymorphisms in myeloid-associated genes in affecting survival of patients diagnosed with glioblastoma or lower-grade glioma.
Affiliation(s)
- Daniel I Jacobs
  - Department of Medicine, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM305, Houston, TX, 77030, USA
- Yanhong Liu
  - Department of Medicine, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM305, Houston, TX, 77030, USA
- Konrad Gabrusiewicz
  - Department of Neurosurgery, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Spiridon Tsavachidis
  - Department of Medicine, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM305, Houston, TX, 77030, USA
- Georgina N Armstrong
  - Department of Medicine, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM305, Houston, TX, 77030, USA
- Renke Zhou
  - Department of Medicine, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM305, Houston, TX, 77030, USA
- Jun Wei
  - Department of Neurosurgery, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Cristina Ivan
  - Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- George Calin
  - Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Annette M Molinaro
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Terri Rice
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Paige M Bracci
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Helen M Hansen
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- John K Wiencke
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Margaret R Wrensch
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Amy B Heimberger
  - Department of Neurosurgery, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Melissa L Bondy
  - Department of Medicine, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, One Baylor Plaza, Mailstop BCM305, Houston, TX, 77030, USA
|
25
|
Abstract
While genome-wide association studies have been very successful in identifying associations of common genetic variants with many different traits, the rarer frequency spectrum of the genome has not yet been comprehensively explored. Technological developments increasingly lift restrictions to access rare genetic variation. Dense reference panels enable improved genotype imputation for rarer variants in studies using DNA microarrays. Moreover, the decreasing cost of next generation sequencing makes whole exome and genome sequencing increasingly affordable for large samples. Large-scale efforts based on sequencing, such as ExAC, 100,000 Genomes, and TopMed, are likely to significantly advance this field. The main challenge in evaluating complex trait associations of rare variants is statistical power. The choice of population should be considered carefully because allele frequencies and linkage disequilibrium structure differ between populations. Genetically isolated populations can have favorable genomic characteristics for the study of rare variants. One strategy to increase power is to assess the combined effect of multiple rare variants within a region, known as aggregate testing. A range of methods have been developed for this. Model performance depends on the genetic architecture of the region of interest.
Affiliation(s)
- Karoline Kuchenbaecker
- Wellcome Trust Sanger Institute, Cambridge, UK; University College London, London, UK.
| | - Emil Vincent Rosenbaum Appel
- Novo Nordisk Foundation Center for Basic Metabolic Research, Section for Metabolic Genetics, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark
| |
|
26
|
Zhao N, Zhan X, Huang YT, Almli LM, Smith A, Epstein MP, Conneely K, Wu MC. Kernel machine methods for integrative analysis of genome-wide methylation and genotyping studies. Genet Epidemiol 2017; 42:156-167. [DOI: 10.1002/gepi.22100] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/29/2017] [Revised: 09/26/2017] [Accepted: 10/27/2017] [Indexed: 12/22/2022]
Affiliation(s)
- Ni Zhao
- Department of Biostatistics; Johns Hopkins University; Baltimore Maryland 21205 United States of America
| | - Xiang Zhan
- Department of Public Health Sciences; Pennsylvania State University; Hershey Pennsylvania 17033 United States of America
| | - Yen-Tsung Huang
- Institute of Statistical Science; Academia Sinica; Taipei 11529 Taiwan
| | - Lynn M Almli
- Department of Psychiatry and Behavioral Sciences; Emory University; Atlanta Georgia 30322 United States of America
| | - Alicia Smith
- Department of Gynecology and Obstetrics; Emory University; Atlanta Georgia 30322 United States of America
| | - Michael P. Epstein
- Department of Human Genetics; Emory University; Atlanta Georgia 30322 United States of America
| | - Karen Conneely
- Department of Human Genetics; Emory University; Atlanta Georgia 30322 United States of America
| | - Michael C. Wu
- Public Health Sciences; Fred Hutchinson Cancer Research Center; Seattle Washington 98109 United States of America
| |
|
27
|
Dimitrakopoulou VI, Travis RC, Shui IM, Mondul A, Albanes D, Virtamo J, Agudo A, Boeing H, Bueno-de-Mesquita HB, Gunter MJ, Johansson M, Khaw KT, Overvad K, Palli D, Trichopoulou A, Giovannucci E, Hunter DJ, Lindström S, Willett W, Gaziano JM, Stampfer M, Berg C, Berndt SI, Black A, Hoover RN, Kraft P, Key TJ, Tsilidis KK. Interactions Between Genome-Wide Significant Genetic Variants and Circulating Concentrations of 25-Hydroxyvitamin D in Relation to Prostate Cancer Risk in the National Cancer Institute BPC3. Am J Epidemiol 2017; 185:452-464. [PMID: 28399564 PMCID: PMC5856084 DOI: 10.1093/aje/kww143] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/12/2015] [Accepted: 03/01/2016] [Indexed: 02/06/2023] Open
Abstract
Genome-wide association studies (GWAS) have identified over 100 single nucleotide polymorphisms (SNPs) associated with prostate cancer. However, information on the mechanistic basis for some associations is limited. Recent research has been directed towards the potential association of vitamin D concentrations and prostate cancer, but little is known about whether the aforementioned genetic associations are modified by vitamin D. We investigated the associations of 46 GWAS-identified SNPs, circulating concentrations of 25-hydroxyvitamin D (25(OH)D), and prostate cancer (3,811 cases, 511 of whom died from the disease, compared with 2,980 controls, from 5 cohort studies that recruited participants over several periods beginning in the 1980s). We used logistic regression models with data from the National Cancer Institute Breast and Prostate Cancer Cohort Consortium (BPC3) to evaluate interactions on the multiplicative and additive scales. After allowing for multiple testing, none of the SNPs examined was significantly associated with 25(OH)D concentration, and the SNP-prostate cancer associations did not differ by these concentrations. A statistically significant interaction was observed for each of 2 SNPs in the 8q24 region (rs620861 and rs16902094), 25(OH)D concentration, and fatal prostate cancer on both multiplicative and additive scales (P ≤ 0.001). We did not find strong evidence that associations between GWAS-identified SNPs and prostate cancer are modified by circulating concentrations of 25(OH)D. The intriguing interactions between rs620861 and rs16902094, 25(OH)D concentration, and fatal prostate cancer warrant replication.
Affiliation(s)
- Vasiliki I. Dimitrakopoulou
- Correspondence to Dr. Vasiliki I. Dimitrakopoulou, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Stavros Niarchos Avenue, University Campus, Ioannina, Greece
| |
|
28
|
Plantinga A, Zhan X, Zhao N, Chen J, Jenq RR, Wu MC. MiRKAT-S: a community-level test of association between the microbiota and survival times. Microbiome 2017; 5:17. [PMID: 28179014 PMCID: PMC5299808 DOI: 10.1186/s40168-017-0239-9] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 09/06/2016] [Accepted: 01/31/2017] [Indexed: 05/17/2023]
Abstract
BACKGROUND Community-level analysis of the human microbiota has culminated in the discovery of relationships between overall shifts in the microbiota and a wide range of diseases and conditions. However, existing work has primarily focused on analysis of relatively simple dichotomous or quantitative outcomes, for example, disease status or biomarker levels. Recently, there is also considerable interest in the relationship between the microbiota and censored survival outcomes, such as in clinical trials. How to conduct community-level analysis with censored survival outcomes is unclear, since standard dissimilarity-based tests cannot accommodate censored survival times and no alternative methods exist. METHODS We develop a new approach, MiRKAT-S, for community-level analysis of microbiome data with censored survival times. MiRKAT-S uses ecologically informative distance metrics, such as the UniFrac distances, to generate matrices of pairwise distances between individuals' taxonomic profiles. The distance matrices are transformed into kernel (similarity) matrices, which are used to compare similarity in the microbiota to similarity in survival times between individuals. RESULTS Simulation studies using synthetic microbial communities demonstrate correct control of type I error and adequate power. We also apply MiRKAT-S to examine the relationship between the gut microbiota and survival after allogeneic blood or bone marrow transplant. CONCLUSIONS We present MiRKAT-S, a method that facilitates community-level analysis of the association between the microbiota and survival outcomes and therefore provides a new approach to analysis of microbiome data arising from clinical trials.
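The distance-to-kernel step described in this abstract can be illustrated with Gower double centering, K = -1/2 · J · D² · J, where J is the centering matrix. The following is a minimal sketch under that assumption, not the authors' implementation; the function name `distance_to_kernel` and the toy data are hypothetical. Published pipelines typically also truncate negative eigenvalues so that K is positive semidefinite, a step omitted here to keep the example dependency-free.

```python
def distance_to_kernel(dist):
    """Gower-center a symmetric distance matrix: K = -1/2 * J * D^2 * J.

    `dist` is a list of lists holding pairwise distances between samples.
    No eigenvalue correction is applied (assumption: illustration only).
    """
    n = len(dist)
    # Element-wise squared distances.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    # Row means; because sq is symmetric, these equal the column means.
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]


# Toy example: three samples placed on a line at positions 0, 1 and 3.
points = [0.0, 1.0, 3.0]
toy_dist = [[abs(a - b) for b in points] for a in points]
toy_kernel = distance_to_kernel(toy_dist)
```

For Euclidean distances such as this toy case, the result equals the Gram matrix of the mean-centered coordinates, which gives a quick sanity check. Ecological distances such as UniFrac are not guaranteed to be Euclidean, which is why an eigenvalue correction is usually needed in practice.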
Affiliation(s)
- Anna Plantinga
- Department of Biostatistics, University of Washington, 1705 NE Pacific Street, Seattle, Washington USA
| | - Xiang Zhan
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, Washington USA
| | - Ni Zhao
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, 615 N Wolfe St, Baltimore, Maryland USA
| | - Jun Chen
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 First Street SW, Rochester, Minnesota USA
| | - Robert R. Jenq
- Departments of Genomic Medicine and Stem Cell Transplantation, Division of Cancer Medicine, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Unit 1954, Houston, TX USA
| | - Michael C. Wu
- Department of Biostatistics, University of Washington, 1705 NE Pacific Street, Seattle, Washington USA
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, Washington USA
| |
|
29
|
Zhu M, Geng L, Shen W, Wang Y, Liu J, Cheng Y, Wang C, Dai J, Jin G, Hu Z, Ma H, Shen H. Exome-Wide Association Study Identifies Low-Frequency Coding Variants in 2p23.2 and 7p11.2 Associated with Survival of Non-Small Cell Lung Cancer Patients. J Thorac Oncol 2017; 12:644-656. [PMID: 28104536 DOI: 10.1016/j.jtho.2016.12.025] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 05/30/2016] [Revised: 11/23/2016] [Accepted: 12/15/2016] [Indexed: 01/10/2023]
Abstract
INTRODUCTION A growing body of evidence has suggested that low-frequency or rare coding variants might have strong effects on the development and prognosis of cancer. Here, we aim to assess the role of low-frequency and rare coding variants in the survival of NSCLC in Chinese populations. METHODS We performed an exome-wide scan of 247,870 variants in 1008 patients with NSCLC and replicated the promising variants by using imputed genotype data of The Cancer Genome Atlas (TCGA) with a Cox regression model. Gene-based and pathway-based analyses were also performed for nonsynonymous or splice site variants. Additionally, analysis of gene expression data in the TCGA was used to increase the reliability of candidate loci and genes. RESULTS A low-frequency missense variant in the chaperonin containing TCP1 subunit 6A gene (CCT6A) (rs33922584: adjusted hazard ratio [HRadjusted] = 1.75, p = 6.06 × 10⁻⁴) was significantly related to the survival of patients with NSCLC, which was further replicated by the TCGA samples (HRadjusted = 4.19, p = 0.015). Interestingly, the G allele of rs33922584 was significantly associated with high expression of CCT6A (p = 0.019), which might lead to worse survival in the TCGA samples (HRadjusted = 1.15, p = 0.047). In addition, rs117512489 in the phospholipase B1 gene (PLB1) (HR = 2.02, p = 7.28 × 10⁻⁴) was also associated with survival of the patients with NSCLC in our samples, but it was supported only by gene expression analysis in the TCGA (HRadjusted = 1.15, p = 0.023). Gene-based and pathway-based analyses revealed a total of 32 genes, including CCT6A, and 34 potential pathways, respectively, that might account for the survival of patients with NSCLC. CONCLUSION These results provided more evidence for the important role of low-frequency or rare variants in the survival of patients with NSCLC.
Affiliation(s)
- Meng Zhu
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China
| | - Liguo Geng
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China
| | - Wei Shen
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China
| | - Yuzhuo Wang
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China
| | - Jia Liu
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China
| | - Yang Cheng
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China
| | - Cheng Wang
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China
| | - Juncheng Dai
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China; Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center of Cancer Medicine, Nanjing Medical University, Nanjing, People's Republic of China
| | - Guangfu Jin
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China; Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center of Cancer Medicine, Nanjing Medical University, Nanjing, People's Republic of China
| | - Zhibin Hu
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China; Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center of Cancer Medicine, Nanjing Medical University, Nanjing, People's Republic of China
| | - Hongxia Ma
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China; Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center of Cancer Medicine, Nanjing Medical University, Nanjing, People's Republic of China.
| | - Hongbing Shen
- Department of Epidemiology and Biostatistics, Collaborative Innovation Center For Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, People's Republic of China; Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center of Cancer Medicine, Nanjing Medical University, Nanjing, People's Republic of China
| |
|
30
|
He Q, Cai T, Liu Y, Zhao N, Harmon QE, Almli LM, Binder EB, Engel SM, Ressler KJ, Conneely KN, Lin X, Wu MC. Prioritizing individual genetic variants after kernel machine testing using variable selection. Genet Epidemiol 2016; 40:722-731. [PMID: 27488097 DOI: 10.1002/gepi.21993] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/07/2015] [Revised: 05/28/2016] [Accepted: 06/20/2016] [Indexed: 01/06/2023]
Abstract
Kernel machine learning methods, such as the SNP-set kernel association test (SKAT), have been widely used to test associations between traits and genetic polymorphisms. In contrast to traditional single-SNP analysis methods, these methods are designed to examine the joint effect of a set of related SNPs (such as a group of SNPs within a gene or a pathway) and are able to identify sets of SNPs that are associated with the trait of interest. However, as with many multi-SNP testing approaches, kernel machine testing can draw conclusion only at the SNP-set level, and does not directly inform on which one(s) of the identified SNP set is actually driving the associations. A recently proposed procedure, KerNel Iterative Feature Extraction (KNIFE), provides a general framework for incorporating variable selection into kernel machine methods. In this article, we focus on quantitative traits and relatively common SNPs, and adapt the KNIFE procedure to genetic association studies and propose an approach to identify driver SNPs after the application of SKAT to gene set analysis. Our approach accommodates several kernels that are widely used in SNP analysis, such as the linear kernel and the Identity by State (IBS) kernel. The proposed approach provides practically useful utilities to prioritize SNPs, and fills the gap between SNP set analysis and biological functional studies. Both simulation studies and real data application are used to demonstrate the proposed approach.
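The two kernels named in this abstract have simple standard definitions for genotypes coded as minor-allele counts (0, 1 or 2). The sketch below assumes the usual unweighted forms: the linear kernel as the dot product of two genotype vectors, and the IBS kernel as the average proportion of alleles shared identical by state across SNPs. The function names and the toy genotype matrix are hypothetical and are not taken from the paper or from any package.

```python
def linear_kernel(genotypes):
    """K[i][j] = sum over SNPs of g_i * g_j (unweighted linear kernel)."""
    n = len(genotypes)
    return [
        [sum(a * b for a, b in zip(genotypes[i], genotypes[j]))
         for j in range(n)]
        for i in range(n)
    ]


def ibs_kernel(genotypes):
    """K[i][j] = sum over SNPs of (2 - |g_i - g_j|) / (2 * number of SNPs)."""
    n = len(genotypes)
    p = len(genotypes[0])
    return [
        [sum(2 - abs(a - b) for a, b in zip(genotypes[i], genotypes[j])) / (2 * p)
         for j in range(n)]
        for i in range(n)
    ]


# Toy data: 3 individuals (rows) by 3 SNPs (columns), minor-allele counts.
toy_genotypes = [
    [0, 1, 2],
    [0, 1, 2],
    [2, 1, 0],
]
k_linear = linear_kernel(toy_genotypes)
k_ibs = ibs_kernel(toy_genotypes)
```

In this toy example the first two individuals carry identical genotypes, so their IBS similarity is 1, while the first and third share alleles only at the middle SNP, giving 1/3. The IBS kernel is bounded between 0 and 1, whereas the linear kernel scales with allele counts.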
Affiliation(s)
- Qianchuan He
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Tianxi Cai
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
| | - Yang Liu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Ni Zhao
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Quaker E Harmon
- Epidemiology Branch, NIEHS, Research Triangle Park, North Carolina, United States of America
| | - Lynn M Almli
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, Georgia, United States of America
| | - Elisabeth B Binder
- Department of Translational Research in Psychiatry, Max-Planck Institute of Psychiatry, Munich, Germany
| | - Stephanie M Engel
- Department of Epidemiology, University of North Carolina, Chapel Hill, North Carolina, United States of America
| | - Kerry J Ressler
- Division of Depression & Anxiety Disorders, McLean Hospital, Belmont, Massachusetts, United States of America
| | - Karen N Conneely
- Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia, United States of America
| | - Xihong Lin
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
| | - Michael C Wu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| |
|
31
|
Wang K. Boosting the Power of the Sequence Kernel Association Test by Properly Estimating Its Null Distribution. Am J Hum Genet 2016; 99:104-14. [PMID: 27292111 DOI: 10.1016/j.ajhg.2016.05.011] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 01/21/2016] [Accepted: 05/02/2016] [Indexed: 01/22/2023] Open
Abstract
The sequence kernel association test (SKAT) is probably the most popular statistical test used in rare-variant association studies. Its null distribution involves unknown parameters that need to be estimated. The current estimation method has a valid type I error rate, but the power is compromised given that all subjects are used for estimation. I have developed an estimation method that uses only control subjects. Named SKAT+, this method uses the same test statistic as SKAT but differs in the way the null distribution is estimated. Extensive simulation studies and applications to data from the Genetic Analysis Workshop 17 and the Ocular Hypertension Treatment Study demonstrated that SKAT+ has superior power over SKAT while maintaining control over the type I error rate. This method is applicable to extensions of SKAT in the literature.
Affiliation(s)
- Kai Wang
- Department of Biostatistics, College of Public Health, University of Iowa, Iowa City, IA 52242, USA.
| |
|
32
|
Kong D, Giovanello KS, Wang Y, Lin W, Lee E, Fan Y, Murali Doraiswamy P, Zhu H. Predicting Alzheimer's Disease Using Combined Imaging-Whole Genome SNP Data. J Alzheimers Dis 2016; 46:695-702. [PMID: 25869783 DOI: 10.3233/jad-150164] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 02/07/2023]
Abstract
The growing public threat of Alzheimer's disease (AD) has raised the urgency to discover and validate prognostic biomarkers in order to predict time to onset of AD. It is anticipated that both whole genome single nucleotide polymorphism (SNP) data and high dimensional whole brain imaging data offer predictive values to identify subjects at risk for progressing to AD. The aim of this paper is to test whether both whole genome SNP data and whole brain imaging data offer predictive values to identify subjects at risk for progressing to AD. In 343 subjects with mild cognitive impairment (MCI) enrolled in the Alzheimer's Disease Neuroimaging Initiative (ADNI-1), we extracted high dimensional MR imaging (volumetric data on 93 brain regions plus a surface fluid registration based hippocampal subregion and surface data), and whole genome data (504,095 SNPs from GWAS), as well as routine neurocognitive and clinical data at baseline. MCI patients were then followed over 48 months, with 150 participants progressing to AD. Combining information from whole brain MR imaging and whole genome data was substantially superior to the standard model for predicting time to onset of AD in a 48-month national study of subjects at risk. Our findings demonstrate the promise of combined imaging-whole genome prognostic markers in people with mild memory impairment.
Affiliation(s)
- Dehan Kong
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
| | - Kelly S Giovanello
- Department of Psychology, University of North Carolina, Chapel Hill, NC, USA; Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC, USA
| | - Yalin Wang
- School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
| | - Weili Lin
- Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC, USA; Department of Radiology, University of North Carolina, Chapel Hill, NC, USA
| | - Eunjee Lee
- Department of Statistics, University of North Carolina, Chapel Hill, NC, USA
| | - Yong Fan
- Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - P Murali Doraiswamy
- Departments of Psychiatry and Duke Institute for Brain Sciences, Duke University, Durham, NC, USA
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA; Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC, USA; Department of Radiology, University of North Carolina, Chapel Hill, NC, USA
| |
|
33
|
Neykov M, Hejblum BP, Sinnott JA. Kernel machine score test for pathway analysis in the presence of semi-competing risks. Stat Methods Med Res 2016; 27:1099-1114. [PMID: 27255336 DOI: 10.1177/0962280216653427] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022]
Abstract
In cancer studies, patients often experience two different types of events: a non-terminal event such as recurrence or metastasis, and a terminal event such as cancer-specific death. Identifying pathways and networks of genes associated with one or both of these events is an important step in understanding disease development and targeting new biological processes for potential intervention. These correlated outcomes are commonly dealt with by modeling progression-free survival, where the event time is the minimum between the times of recurrence and death. However, identifying pathways only associated with progression-free survival may miss out on pathways that affect time to recurrence but not death, or vice versa. We propose a combined testing procedure for a pathway's association with both the cause-specific hazard of recurrence and the marginal hazard of death. The dependency between the two outcomes is accounted for through perturbation resampling to approximate the test's null distribution, without any further assumption on the nature of the dependency. Even complex non-linear relationships between pathways and disease progression or death can be uncovered thanks to a flexible kernel machine framework. The superior statistical power of our approach is demonstrated in numerical studies and in a gene expression study of breast cancer.
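The progression-free survival outcome mentioned in this abstract is the earlier of recurrence and death, censored at the end of follow-up. A minimal sketch of that definition follows; the function name, argument names and the convention of `None` for an event that was never observed are assumptions made for illustration and do not come from the paper.

```python
def progression_free_survival(recurrence_time, death_time, censor_time):
    """Return (time, event) for progression-free survival.

    recurrence_time / death_time: time of each event, or None if not observed.
    censor_time: end of follow-up for this patient.
    event is 1 if recurrence or death occurred by censor_time, else 0.
    """
    observed = [t for t in (recurrence_time, death_time) if t is not None]
    if observed and min(observed) <= censor_time:
        # The composite event time is the minimum of the two event times.
        return min(observed), 1
    return censor_time, 0


# Hypothetical patients, times in months.
pfs_recurred_then_died = progression_free_survival(12, 30, 60)
pfs_died_only = progression_free_survival(None, 30, 60)
pfs_event_free = progression_free_survival(None, None, 60)
```

The first two patients show why the composite outcome can hide differences: one recurred at 12 months and the other never recurred, yet both simply count as events. Testing the two hazards separately, as the abstract proposes, keeps that distinction.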
Affiliation(s)
- Matey Neykov
- Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ, USA
| | - Boris P Hejblum
- Department of Biostatistics, Harvard University, Boston, MA, USA
| | | |
|
34
|
Wu B, Pankow JS. On Sample Size and Power Calculation for Variant Set-Based Association Tests. Ann Hum Genet 2016; 80:136-43. [PMID: 26831402 PMCID: PMC4761288 DOI: 10.1111/ahg.12147] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 08/19/2015] [Accepted: 12/07/2015] [Indexed: 01/03/2023]
Abstract
Sample size and power calculations are an important part of designing new sequence-based association studies. The recently developed SEQPower and SPS programs adopted computationally intensive Monte Carlo simulations to empirically estimate power for a series of variant set association (VSA) test methods including the sequence kernel association test (SKAT). It is desirable to develop methods that can quickly and accurately compute power without intensive Monte Carlo simulations. We will show that the computed power for SKAT based on the existing analytical approach could be inflated especially for small significance levels, which are often of primary interest for large-scale whole genome and exome sequencing projects. We propose a new χ²-approximation-based approach to accurately and efficiently compute sample size and power. In addition, we propose and implement a more accurate "exact" method to compute power, which is more efficient than the Monte Carlo approach though generally involves more computations than the χ² approximation method. The exact approach could produce very accurate results and be used to verify alternative approximation approaches. We implement the proposed methods in publicly available R programs that can be readily adapted when planning sequencing projects.
Affiliation(s)
- Baolin Wu
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - James S. Pankow
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| |
|
35
|
Fan R, Wang Y, Yan Q, Ding Y, Weeks DE, Lu Z, Ren H, Cook RJ, Xiong M, Swaroop A, Chew EY, Chen W. Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions. Genet Epidemiol 2016; 40:133-43. [PMID: 26782979 DOI: 10.1002/gepi.21947] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 07/13/2015] [Revised: 10/13/2015] [Accepted: 11/05/2015] [Indexed: 11/07/2022]
Abstract
Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than, or similar power to, the Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than the Cox BT LRT. The models and related test statistics can be useful in whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example.
Affiliation(s)
- Ruzong Fan
- Division of Intramural Population Health Research, Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
| | - Yifan Wang
- Division of Intramural Population Health Research, Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
| | - Qi Yan
- Division of Pulmonary Medicine, Allergy and Immunology, Children's Hospital of Pittsburgh at The University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Ying Ding
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Daniel E Weeks
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Zhaohui Lu
- Division of Intramural Population Health Research, Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health (NIH), Bethesda, Maryland, United States of America
| | - Haobo Ren
- Regeneron Pharmaceuticals, Inc, Basking Ridge, New Jersey, United States of America
| | - Richard J Cook
- Department of Statistics and Actuarial Science, Waterloo, ON, Canada
| | - Momiao Xiong
- Human Genetics Center, University of Texas, Houston, Texas, United States of America
| | - Anand Swaroop
- Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, NIH, Bethesda, Maryland, United States of America
| | - Emily Y Chew
- Division of Epidemiology and Clinical Applications, National Eye Institute, NIH, Bethesda, Maryland, United States of America
| | - Wei Chen
- Division of Pulmonary Medicine, Allergy and Immunology, Children's Hospital of Pittsburgh at The University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| |
|
36
|
Binder M, Shui IM, Wilson KM, Penney KL, Mucci LA, Kibel AS. Calcium intake, polymorphisms of the calcium-sensing receptor, and recurrent/aggressive prostate cancer. Cancer Causes Control 2015; 26:1751-9. [PMID: 26407952 PMCID: PMC4633306 DOI: 10.1007/s10552-015-0668-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/19/2015] [Accepted: 09/11/2015] [Indexed: 11/28/2022]
Abstract
PURPOSE To assess whether calcium intake and common genetic variants of the calcium-sensing receptor (CASR) are associated with either aggressive prostate cancer (PCa) or disease recurrence after prostatectomy. METHODS Calcium intake at diagnosis was assessed, and 65 common single-nucleotide polymorphisms (SNPs) in CASR were genotyped in 886 prostatectomy patients. We investigated the association between calcium intake and CASR variants with both PCa recurrence and aggressiveness (defined as Gleason score ≥4 + 3, stage ≥pT3, or nodal-positive disease). RESULTS A total of 285 men had aggressive disease and 91 experienced recurrence. A U-shaped relationship between calcium intake and both disease recurrence and aggressiveness was observed. Compared to the middle quintile, the HR for disease recurrence was 3.07 (95% CI 1.41-6.69) for the lowest quintile and 3.21 (95% CI 1.47-7.00) and 2.97 (95% CI 1.37-6.45) for the two upper quintiles, respectively. Compared to the middle quintile, the OR for aggressive disease was 1.80 (95% CI 1.11-2.91) for the lowest quintile and 1.75 (95% CI 1.08-2.85) for the highest quintile of calcium intake. The main effects of CASR variants were not associated with PCa recurrence or aggressiveness. In the subgroup of patients with moderate calcium intake, 31 SNPs in four distinct blocks of high linkage disequilibrium were associated with PCa recurrence. CONCLUSIONS We observed a protective effect of moderate calcium intake for PCa aggressiveness and recurrence. While CASR variants were not associated with these outcomes in the entire cohort, they may be associated with disease recurrence in men with moderate calcium intakes.
Affiliation(s)
- Moritz Binder
  - Master of Public Health Program, Harvard T. H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
- Irene M Shui
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
- Kathryn M Wilson
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
  - Channing Division of Network Medicine, Brigham & Women's Hospital and Harvard Medical School, 181 Longwood Avenue, Boston, MA, 02215, USA
- Kathryn L Penney
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
  - Channing Division of Network Medicine, Brigham & Women's Hospital and Harvard Medical School, 181 Longwood Avenue, Boston, MA, 02215, USA
- Lorelei A Mucci
  - Department of Epidemiology, Harvard T. H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
  - Channing Division of Network Medicine, Brigham & Women's Hospital and Harvard Medical School, 181 Longwood Avenue, Boston, MA, 02215, USA
- Adam S Kibel
  - Division of Urologic Surgery, Brigham and Women's Hospital, 45 Francis Street, Boston, MA, 02115, USA

37
Leclerc M, Simard J, Lakhal-Chaieb L. SNP Set Association Testing for Survival Outcomes in the Presence of Intrafamilial Correlation. Genet Epidemiol 2015; 39:406-14. [PMID: 26282997 DOI: 10.1002/gepi.21914] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/06/2015] [Revised: 06/04/2015] [Accepted: 06/17/2015] [Indexed: 11/06/2022]
Abstract
In this work, we propose a single nucleotide polymorphism (SNP) set association test for censored phenotypes in the presence of a family-based design. The proposed test is valid for both common and rare variants. A proportional hazards Cox model is specified for the marginal distribution of the trait and the familial dependence is modeled via a Gaussian copula. Censored values are treated as partially missing data and a multiple imputation procedure is proposed in order to compute the test statistics. The P-value is then deduced analytically. The finite-sample empirical properties of the proposed method are evaluated and compared to existing competitors by simulations and its use is illustrated using a breast cancer data set from the Consortium of Investigators of Modifiers of BRCA1 and BRCA2.
Affiliation(s)
- Martin Leclerc
  - Département de mathématiques et de statistique, Université Laval, Québec, Canada
- Jacques Simard
  - Department of Molecular Medicine, Canada Research Chair in Oncogenetics, Laval University & Genomics Centre, CHU de Québec Research Centre, Québec, Canada
- Lajmi Lakhal-Chaieb
  - Département de mathématiques et de statistique, Université Laval, Québec, Canada

38
Marceau R, Lu W, Holloway S, Sale MM, Worrall BB, Williams SR, Hsu FC, Tzeng JY. A Fast Multiple-Kernel Method With Applications to Detect Gene-Environment Interaction. Genet Epidemiol 2015; 39:456-68. [PMID: 26139508 DOI: 10.1002/gepi.21909] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 01/30/2015] [Revised: 05/10/2015] [Accepted: 05/20/2015] [Indexed: 01/27/2023]
Abstract
Kernel machine (KM) models are a powerful tool for exploring associations between sets of genetic variants and complex traits. Although most KM methods use a single kernel function to assess the marginal effect of a variable set, KM analyses involving multiple kernels have become increasingly popular. Multikernel analysis allows researchers to study more complex problems, such as assessing gene-gene or gene-environment interactions, incorporating variance-component based methods for population substructure into rare-variant association testing, and assessing the conditional effects of a variable set adjusting for other variable sets. The KM framework is robust, powerful, and provides efficient dimension reduction for multifactor analyses, but requires the estimation of high dimensional nuisance parameters. Traditional estimation techniques, including regularization and the "expectation-maximization (EM)" algorithm, have a large computational cost and are not scalable to large sample sizes needed for rare variant analysis. Therefore, under the context of gene-environment interaction, we propose a computationally efficient and statistically rigorous "fastKM" algorithm for multikernel analysis that is based on a low-rank approximation to the nuisance effect kernel matrices. Our algorithm is applicable to various trait types (e.g., continuous, binary, and survival traits) and can be implemented using any existing single-kernel analysis software. Through extensive simulation studies, we show that our algorithm has similar performance to an EM-based KM approach for quantitative traits while running much faster. We also apply our method to the Vitamin Intervention for Stroke Prevention (VISP) clinical trial, examining gene-by-vitamin effects on recurrent stroke risk and gene-by-age effects on change in homocysteine level.
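As a rough illustration of the single-kernel setting that this abstract contrasts with multikernel analysis, the sketch below computes a variance-component score statistic of the form Q = rᵀKr for one SNP set. It is not the fastKM algorithm: the linear kernel, the intercept-only null model, the function names and the toy data are all assumptions made for illustration, and the p-value step is omitted.

```python
def linear_kernel(genotypes):
    """K[i][j] = inner product of the genotype vectors of subjects i and j."""
    return [[sum(a * b for a, b in zip(gi, gj)) for gj in genotypes]
            for gi in genotypes]


def km_score_statistic(y, kernel):
    """Q = r^T K r, where r = y - mean(y) (residuals of an intercept-only null model)."""
    mean_y = sum(y) / len(y)
    r = [v - mean_y for v in y]
    n = len(y)
    return sum(r[i] * kernel[i][j] * r[j] for i in range(n) for j in range(n))


# Hypothetical toy data: 4 subjects, 2 SNPs coded as minor-allele counts.
genotypes = [[0, 1], [1, 1], [2, 0], [0, 0]]
trait = [1.0, 2.0, 3.0, 2.0]

K = linear_kernel(genotypes)
Q = km_score_statistic(trait, K)
print(Q)
```

A larger Q suggests that subjects with similar genotypes also have similar trait residuals; a real analysis would adjust for covariates in the null model and compare Q with its null distribution.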
Affiliation(s)
- Rachel Marceau
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Wenbin Lu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Shannon Holloway
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Michèle M Sale
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, United States of America; Department of Medicine, University of Virginia, Charlottesville, Virginia, United States of America; Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia, United States of America
- Bradford B Worrall
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, United States of America; Department of Neurology, University of Virginia, Charlottesville, Virginia, United States of America
- Stephen R Williams
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, United States of America; Cardiovascular Research Center, University of Virginia, Charlottesville, Virginia, United States of America
- Fang-Chi Hsu
  - Department of Biostatistical Sciences, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Jung-Ying Tzeng
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America; Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America; Department of Statistics, National Cheng-Kung University, Tainan, Taiwan

39
Shui IM, Mondul AM, Lindström S, Tsilidis KK, Travis RC, Gerke T, Albanes D, Mucci LA, Giovannucci E, Kraft P. Circulating vitamin D, vitamin D-related genetic variation, and risk of fatal prostate cancer in the National Cancer Institute Breast and Prostate Cancer Cohort Consortium. Cancer 2015; 121:1949-56. [PMID: 25731953 PMCID: PMC4457645 DOI: 10.1002/cncr.29320] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 10/30/2014] [Revised: 01/21/2015] [Accepted: 01/26/2015] [Indexed: 01/08/2023]
Abstract
BACKGROUND Evidence from experimental animal and cell line studies supports a beneficial role for vitamin D in prostate cancer (PCa). Although the results from human studies have been mainly null for overall PCa risk, there may be a benefit for survival. This study assessed the associations of circulating 25-hydroxyvitamin D (25(OH)D) and common variations in key vitamin D-related genes with fatal PCa. METHODS In a large cohort consortium, 518 fatal cases and 2986 controls with 25(OH)D data were identified. Genotyping information for 91 single-nucleotide polymorphisms (SNPs) in 7 vitamin D-related genes (vitamin D receptor, group-specific component, cytochrome P450 27A1 [CYP27A1], CYP27B1, CYP24A1, CYP2R1, and retinoid X receptor α) was available for 496 fatal cases and 3577 controls. Unconditional logistic regression was used to calculate odds ratios (ORs) and 95% confidence intervals (CIs) for the associations of 25(OH)D and SNPs with fatal PCa. The study also tested for 25(OH)D-SNP interactions among 264 fatal cases and 1169 controls. RESULTS No statistically significant relationship was observed between 25(OH)D and fatal PCa (OR for extreme quartiles, 0.86; 95% CI, 0.65-1.14; P for trend = .22) or the main effects of the SNPs and fatal PCa. There was evidence suggesting that associations of several SNPs, including 5 related to circulating 25(OH)D, with fatal PCa were modified by 25(OH)D. Individually, these associations did not remain significant after multiple testing; however, the P value for the set-based test for CYP2R1 was .002. CONCLUSIONS Statistically significant associations were not observed for either 25(OH)D or vitamin D-related SNPs with fatal PCa. The effect modification of 25(OH)D associations by biologically plausible genetic variation may deserve further exploration.
Affiliation(s)
- Irene M Shui
  - Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Alison M Mondul
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
  - Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, Michigan
- Sara Lindström
  - Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts
- Konstantinos K Tsilidis
  - Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom
  - Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
- Ruth C Travis
  - Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom
- Travis Gerke
  - Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts
  - Department of Epidemiology, University of Florida, Gainesville, Florida
- Demetrius Albanes
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
- Lorelei A Mucci
  - Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Edward Giovannucci
  - Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
  - Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts
- Peter Kraft
  - Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts

40
Markt SC, Valdimarsdottir UA, Shui IM, Sigurdardottir LG, Rider JR, Tamimi RM, Batista JL, Haneuse S, Flynn-Evans E, Lockley SW, Czeisler CA, Stampfer MJ, Launer L, Harris T, Smith AV, Gudnason V, Lindstrom S, Kraft P, Mucci LA. Circadian clock genes and risk of fatal prostate cancer. Cancer Causes Control 2015; 26:25-33. [PMID: 25388799 PMCID: PMC4282953 DOI: 10.1007/s10552-014-0478-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 06/12/2014] [Accepted: 10/09/2014] [Indexed: 01/20/2023]
Abstract
PURPOSE Circadian genes may be involved in regulating cancer-related pathways, including cell proliferation, DNA damage response, and apoptosis. We aimed to assess the association of genetic variation in core circadian rhythm genes with the risk of fatal prostate cancer and with first morning void urinary 6-sulfatoxymelatonin levels. METHODS We used unconditional logistic regression to evaluate the association of 96 single-nucleotide polymorphisms (SNPs) across 12 circadian-related genes with fatal prostate cancer in the AGES-Reykjavik cohort (n = 24 cases), the Health Professionals Follow-Up Study (HPFS) (n = 40 cases), and the Physicians' Health Study (PHS) (n = 105 cases). We used linear regression to evaluate the association between SNPs and first morning void urinary 6-sulfatoxymelatonin levels in AGES-Reykjavik. We used a kernel machine test to evaluate whether multimarker SNP sets in the pathway (gene based) were associated with our outcomes. RESULTS None of the individual SNPs were consistently associated with fatal prostate cancer across the three cohorts. In each cohort, gene-based analyses showed that variation in the CRY1 gene was nominally associated with fatal prostate cancer (p values = 0.01, 0.01, and 0.05 for AGES-Reykjavik, HPFS, and PHS, respectively). In AGES-Reykjavik, SNPs in TIMELESS (four SNPs), NPAS2 (six SNPs), PER3 (two SNPs) and CSNK1E (one SNP) were nominally associated with 6-sulfatoxymelatonin levels. CONCLUSION We did not find a strong and consistent association between variation in core circadian clock genes and fatal prostate cancer risk, but observed nominally significant gene-based associations with fatal prostate cancer and 6-sulfatoxymelatonin levels.
Affiliation(s)
- Sarah C Markt
  - Department of Epidemiology, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA, 02115-6018, USA

41
Zhao N, Bell DA, Maity A, Staicu AM, Joubert BR, London SJ, Wu MC. Global analysis of methylation profiles from high resolution CpG data. Genet Epidemiol 2014; 39:53-64. [PMID: 25537884 DOI: 10.1002/gepi.21874] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 08/18/2014] [Revised: 10/09/2014] [Accepted: 10/31/2014] [Indexed: 12/22/2022]
Abstract
New high throughput technologies are now enabling simultaneous epigenetic profiling of DNA methylation at hundreds of thousands of CpGs across the genome. A problem of considerable practical interest is identification of large scale, global changes in methylation that are associated with environmental variables, clinical outcomes, or other experimental conditions. However, there has been little statistical research on methods for global methylation analysis using technologies with individual CpG resolution. To address this critical gap in the literature, we develop a new strategy for global analysis of methylation profiles using a functional regression approach wherein we approximate either the density or the cumulative distribution function (CDF) of the methylation values for each individual using B-spline basis functions. The spline coefficients for each individual are allowed to summarize the individual's overall methylation profile. We then test for association between the overall distribution and a continuous or dichotomous outcome variable using a variance component score test that naturally accommodates the correlation between spline coefficients. Simulations indicate that our proposed approach has desirable power while protecting type I error. The method was applied to detect methylation differences, both genome wide and at LINE1 elements, between the blood samples from rheumatoid arthritis patients and healthy controls and to detect the epigenetic changes of human hepatocarcinogenesis in the context of alcohol abuse and hepatitis C virus infection. A free implementation of our methods in the R language is available in the Global Analysis of Methylation Profiles (GAMP) package at http://research.fhcrc.org/wu/en.html.
Affiliation(s)
- Ni Zhao
  - Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America

42
Li M, Gardiner JC, Breslau N, Anthony JC, Lu Q. A non-parametric approach for detecting gene-gene interactions associated with age-at-onset outcomes. BMC Genet 2014; 15:79. [PMID: 24986733 PMCID: PMC4087128 DOI: 10.1186/1471-2156-15-79] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/07/2014] [Accepted: 06/13/2014] [Indexed: 12/03/2022] Open
Abstract
Background Cox-regression-based methods have been commonly used for the analyses of survival outcomes, such as age-at-disease-onset. These methods generally assume the hazard functions are proportional among various risk groups. However, such an assumption may not be valid in genetic association studies, especially when complex interactions are involved. In addition, genetic association studies commonly adopt case-control designs. Direct use of Cox regression to case-control data may yield biased estimators and incorrect statistical inference. Results We propose a non-parametric approach, the weighted Nelson-Aalen (WNA) approach, for detecting genetic variants that are associated with age-dependent outcomes. The proposed approach can be directly applied to prospective cohort studies, and can be easily extended for population-based case-control studies. Moreover, it does not rely on any assumptions of the disease inheritance models, and is able to capture high-order gene-gene interactions. Through simulations, we show the proposed approach outperforms Cox-regression-based methods in various scenarios. We also conduct an empirical study of progression of nicotine dependence by applying the WNA approach to three independent datasets from the Study of Addiction: Genetics and Environment. In the initial dataset, two SNPs, rs6570989 and rs2930357, located in genes GRIK2 and CSMD1, are found to be significantly associated with the progression of nicotine dependence (ND). The joint association is further replicated in two independent datasets. Further analysis suggests that these two genes may interact and be associated with the progression of ND. Conclusions As demonstrated by the simulation studies and real data analysis, the proposed approach provides an efficient tool for detecting genetic interactions associated with age-at-onset outcomes.
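For orientation, the sketch below computes the standard (unweighted) Nelson-Aalen estimate of the cumulative hazard, which the WNA approach named in this abstract builds on. The weighting scheme and the interaction search of the paper are not reproduced here; the function name and the toy data are assumptions for illustration only.

```python
from collections import Counter


def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time.

    times  : observed time per subject (event or censoring time)
    events : 1 if the event was observed at that time, 0 if censored
    H(t) accumulates d_i / n_i, the number of events at t_i divided by
    the number of subjects still at risk just before t_i.
    """
    events_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)      # subjects leaving the risk set at each time
    at_risk = len(times)
    cumulative = 0.0
    curve = []
    for t in sorted(leaving_at):
        if events_at[t]:
            cumulative += events_at[t] / at_risk
            curve.append((t, cumulative))
        at_risk -= leaving_at[t]
    return curve


# Hypothetical ages at onset (event = 1) or at last follow-up (event = 0).
ages = [2, 3, 3, 5, 8]
observed = [1, 1, 0, 1, 0]
curve = nelson_aalen(ages, observed)
for t, h in curve:
    print(t, round(h, 3))
```

Because the estimator only counts events and subjects at risk, it makes no proportional-hazards assumption, which is the property the abstract emphasizes.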
Affiliation(s)
- Qing Lu
  - Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI 48824, USA

43
Tzeng JY, Lu W, Hsu FC. Gene-level pharmacogenetic analysis on survival outcomes using gene-trait similarity regression. Ann Appl Stat 2014; 8:1232-1255. [PMID: 25018788 DOI: 10.1214/14-aoas735] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/19/2022]
Abstract
Gene/pathway-based methods are drawing significant attention due to their usefulness in detecting rare and common variants that affect disease susceptibility. The biological mechanism of drug responses indicates that a gene-based analysis has even greater potential in pharmacogenetics. Motivated by a study from the Vitamin Intervention for Stroke Prevention (VISP) trial, we develop a gene-trait similarity regression for survival analysis to assess the effect of a gene or pathway on time-to-event outcomes. The similarity regression has a general framework that covers a range of survival models, such as the proportional hazards model and the proportional odds model. The inference procedure developed under the proportional hazards model is robust against model misspecification. We derive the equivalence between the similarity survival regression and a random effects model, which further unifies the current variance-component based methods. We demonstrate the effectiveness of the proposed method through simulation studies. In addition, we apply the method to the VISP trial data to identify the genes that exhibit an association with the risk of a recurrent stroke. TCN2 gene was found to be associated with the recurrent stroke risk in the low-dose arm. This gene may impact recurrent stroke risk in response to cofactor therapy.
Affiliation(s)
- Jung-Ying Tzeng
  - North Carolina State University; National Cheng-Kung University

44
Shui IM, Lindström S, Kibel AS, Berndt SI, Campa D, Gerke T, Penney KL, Albanes D, Berg C, Bueno-de-Mesquita HB, Chanock S, Crawford ED, Diver WR, Gapstur SM, Gaziano JM, Giles GG, Henderson B, Hoover R, Johansson M, Le Marchand L, Ma J, Navarro C, Overvad K, Schumacher FR, Severi G, Siddiq A, Stampfer M, Stevens VL, Travis RC, Trichopoulos D, Vineis P, Mucci LA, Yeager M, Giovannucci E, Kraft P. Prostate cancer (PCa) risk variants and risk of fatal PCa in the National Cancer Institute Breast and Prostate Cancer Cohort Consortium. Eur Urol 2014; 65:1069-75. [PMID: 24411283 PMCID: PMC4006298 DOI: 10.1016/j.eururo.2013.12.058] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.7] [Received: 11/08/2013] [Accepted: 12/23/2013] [Indexed: 12/21/2022]
Abstract
BACKGROUND Screening and diagnosis of prostate cancer (PCa) is hampered by an inability to predict who has the potential to develop fatal disease and who has indolent cancer. Studies have identified multiple genetic risk loci for PCa incidence, but it is unknown whether they could be used as biomarkers for PCa-specific mortality (PCSM). OBJECTIVE To examine the association of 47 established PCa risk single-nucleotide polymorphisms (SNPs) with PCSM. DESIGN, SETTING, AND PARTICIPANTS We included 10 487 men who had PCa and 11 024 controls, with a median follow-up of 8.3 yr, during which 1053 PCa deaths occurred. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS The main outcome was PCSM. The risk allele was defined as the allele associated with an increased risk for PCa in the literature. We used Cox proportional hazards regression to calculate the hazard ratios of each SNP with time to progression to PCSM after diagnosis. We also used logistic regression to calculate odds ratios for each risk SNP, comparing fatal PCa cases to controls. RESULTS AND LIMITATIONS Among the cases, we found that 8 of the 47 SNPs were significantly associated (p<0.05) with time to PCSM. The risk allele of rs11672691 (intergenic) was associated with an increased risk for PCSM, while 7 SNPs had risk alleles inversely associated (rs13385191 [C2orf43], rs17021918 [PDLIM5], rs10486567 [JAZF1], rs6465657 [LMTK2], rs7127900 [intergenic], rs2735839 [KLK3], rs10993994 [MSMB]). In the case-control analysis, 22 SNPs were associated (p<0.05) with the risk of fatal PCa, but most did not differentiate between fatal and nonfatal PCa. Rs11672691 and rs10993994 were associated with both fatal and nonfatal PCa, while rs6465657, rs7127900, rs2735839, and rs13385191 were associated with nonfatal PCa only. CONCLUSIONS Eight established risk loci were associated with progression to PCSM after diagnosis. Twenty-two SNPs were associated with fatal PCa incidence, but most did not differentiate between fatal and nonfatal PCa. The relatively small magnitudes of the associations do not translate well into risk prediction, but these findings merit further follow-up, because they may yield important clues about the complex biology of fatal PCa. PATIENT SUMMARY In this report, we assessed whether established PCa risk variants could predict PCSM. We found eight risk variants associated with PCSM: One predicted an increased risk of PCSM, while seven were associated with decreased risk. Larger studies that focus on fatal PCa are needed to identify more markers that could aid prediction.
Affiliation(s)
- Irene M Shui
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
- Sara Lindström
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
- Adam S Kibel
  - Department of Surgery, Division of Urology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Sonja I Berndt
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Daniele Campa
  - Genomic Epidemiology Group, German Cancer Research Center (Deutsches Krebsforschungszentrum), Heidelberg, Germany
- Travis Gerke
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
- Kathryn L Penney
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA; Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Demetrius Albanes
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Christine Berg
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins Medicine, Baltimore, MD, USA
- H Bas Bueno-de-Mesquita
  - National Institute for Public Health and the Environment, Bilthoven, The Netherlands; Department of Gastroenterology and Hepatology, University Medical Centre, Utrecht, The Netherlands; School of Public Health, Imperial College London, London, United Kingdom
- Stephen Chanock
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Gaithersburg, MD, USA
- W Ryan Diver
  - Epidemiology Research Program, American Cancer Society, Atlanta, GA, USA
- Susan M Gapstur
  - Epidemiology Research Program, American Cancer Society, Atlanta, GA, USA
- J Michael Gaziano
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA; Division of Aging, Brigham and Women's Hospital, Boston, MA, USA
- Graham G Giles
  - Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Australia; Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Australia
- Brian Henderson
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Robert Hoover
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Mattias Johansson
  - International Agency for Research on Cancer, Lyon, France; Department of Biobank Research, Umeå University, Umeå, Sweden
- Loic Le Marchand
  - Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
- Jing Ma
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA; Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Carmen Navarro
  - Department of Epidemiology, Murcia Regional Health Authority, Murcia, Spain; Department of Health and Social Sciences, Universidad de Murcia, Murcia, Spain
- Kim Overvad
  - Department of Public Health, Aarhus University, Aarhus, Denmark
- Fredrick R Schumacher
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Gianluca Severi
  - Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Australia; HuGeF Foundation, Torino, Italy
- Afshan Siddiq
  - Department of Genomics of Common Disease, Imperial College London, London, United Kingdom
- Meir Stampfer
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA; Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Victoria L Stevens
  - Epidemiology Research Program, American Cancer Society, Atlanta, GA, USA
- Ruth C Travis
  - Cancer Epidemiology Unit, University of Oxford, Oxford, United Kingdom
- Dimitrios Trichopoulos
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA; Bureau of Epidemiologic Research, Academy of Athens, Athens, Greece; Hellenic Health Foundation, Athens, Greece
- Paolo Vineis
  - HuGeF Foundation, Torino, Italy; School of Public Health, Imperial College London, London, United Kingdom
- Lorelei A Mucci
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
- Meredith Yeager
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Gaithersburg, MD, USA
- Edward Giovannucci
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
- Peter Kraft
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA

45
Cao Y, Lindström S, Schumacher F, Stevens VL, Albanes D, Berndt S, Boeing H, Bueno-de-Mesquita HB, Canzian F, Chamosa S, Chanock SJ, Diver WR, Gapstur SM, Gaziano JM, Giovannucci EL, Haiman CA, Henderson B, Johansson M, Le Marchand L, Palli D, Rosner B, Siddiq A, Stampfer M, Stram DO, Tamimi R, Travis RC, Trichopoulos D, Willett WC, Yeager M, Kraft P, Hsing AW, Pollak M, Lin X, Ma J. Insulin-like growth factor pathway genetic polymorphisms, circulating IGF1 and IGFBP3, and prostate cancer survival. J Natl Cancer Inst 2014; 106:dju085. [PMID: 24824313 PMCID: PMC4081624 DOI: 10.1093/jnci/dju085] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 11/18/2013] [Revised: 03/03/2014] [Accepted: 03/04/2014] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND The insulin-like growth factor (IGF) signaling pathway has been implicated in prostate cancer (PCa) initiation, but its role in progression remains unknown. METHODS Among 5887 PCa patients (704 PCa deaths) of European ancestry from seven cohorts in the National Cancer Institute Breast and Prostate Cancer Cohort Consortium, we conducted Cox kernel machine pathway analysis to evaluate whether 530 tagging single nucleotide polymorphisms (SNPs) in 26 IGF pathway-related genes were collectively associated with PCa mortality. We also conducted SNP-specific analysis using stratified Cox models adjusting for multiple testing. In 2424 patients (313 PCa deaths), we evaluated the association of prediagnostic circulating IGF1 and IGFBP3 levels and PCa mortality. All statistical tests were two-sided. RESULTS The IGF signaling pathway was associated with PCa mortality (P = .03), and IGF2-AS and SSTR2 were the main contributors (both P = .04). In SNP-specific analysis, 36 SNPs were associated with PCa mortality with P for trend less than .05, but only three SNPs in the IGF2-AS remained statistically significant after gene-based corrections. Two were in linkage disequilibrium (r² = 1 for rs1004446 and rs3741211), whereas the third, rs4366464, was independent (r² = 0.03). The hazard ratios (HRs) per each additional risk allele were 1.19 (95% confidence interval [CI] = 1.06 to 1.34; P for trend = .003) for rs3741211 and 1.44 (95% CI = 1.20 to 1.73; P for trend < .001) for rs4366464. rs4366464 remained statistically significant after correction for all SNPs (corrected P for trend = .04). Prediagnostic IGF1 (HR for highest vs lowest quartile = 0.71; 95% CI = 0.48 to 1.04) and IGFBP3 (HR = 0.93; 95% CI = 0.65 to 1.34) levels were not associated with PCa mortality. CONCLUSIONS The IGF signaling pathway, primarily IGF2-AS and SSTR2 genes, may be important in PCa survival.
|
46
|
Insulin-like Growth Factor Pathway Genetic Polymorphisms, Circulating IGF1 and IGFBP3, and Prostate Cancer Survival. J Natl Cancer Inst 2014; 106:dju218. [PMCID: PMC4111284 DOI: 10.1093/jnci/dju218] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 11/18/2013] [Revised: 03/03/2014] [Accepted: 03/04/2014] [Indexed: 04/11/2024] Open
Abstract
Background: The insulin-like growth factor (IGF) signaling pathway has been implicated in prostate cancer (PCa) initiation, but its role in progression remains unknown. Methods: Among 5887 PCa patients (704 PCa deaths) of European ancestry from seven cohorts in the National Cancer Institute Breast and Prostate Cancer Cohort Consortium, we conducted Cox kernel machine pathway analysis to evaluate whether 530 tagging single nucleotide polymorphisms (SNPs) in 26 IGF pathway-related genes were collectively associated with PCa mortality. We also conducted SNP-specific analysis using stratified Cox models adjusting for multiple testing. In 2424 patients (313 PCa deaths), we evaluated the association of prediagnostic circulating IGF1 and IGFBP3 levels and PCa mortality. All statistical tests were two-sided. Results: The IGF signaling pathway was associated with PCa mortality (P = .03), and IGF2-AS and SSTR2 were the main contributors (both P = .04). In SNP-specific analysis, 36 SNPs were associated with PCa mortality with P_trend less than .05, but only three SNPs in IGF2-AS remained statistically significant after gene-based corrections. Two were in linkage disequilibrium (r² = 1 for rs1004446 and rs3741211), whereas the third, rs4366464, was independent (r² = 0.03). The hazard ratios (HRs) per additional risk allele were 1.19 (95% confidence interval [CI] = 1.06 to 1.34; P_trend = .003) for rs3741211 and 1.44 (95% CI = 1.20 to 1.73; P_trend < .001) for rs4366464. rs4366464 remained statistically significant after correction for all SNPs (P_trend.corr = .04). Prediagnostic IGF1 (HR for highest vs lowest quartile = 0.71; 95% CI = 0.48 to 1.04) and IGFBP3 (HR = 0.93; 95% CI = 0.65 to 1.34) levels were not associated with PCa mortality. Conclusions: The IGF signaling pathway, primarily the IGF2-AS and SSTR2 genes, may be important in PCa survival.
|
47
|
Chen H, Lumley T, Brody J, Heard-Costa NL, Fox CS, Cupples LA, Dupuis J. Sequence kernel association test for survival traits. Genet Epidemiol 2014; 38:191-7. [PMID: 24464521 DOI: 10.1002/gepi.21791] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 07/31/2013] [Revised: 12/20/2013] [Accepted: 12/21/2013] [Indexed: 11/11/2022]
Abstract
Rare variant tests have been of great interest in testing genetic associations with diseases and disease-related quantitative traits in recent years. Among these tests, the sequence kernel association test (SKAT) is an omnibus test for effects of rare genetic variants, in a linear or logistic regression framework. It is often described as a variance component test treating the genotypic effects as random. When the linear kernel is used, its test statistic can be expressed as a weighted sum of single-marker score test statistics. In this paper, we extend the test to survival phenotypes in a Cox regression framework. Because of the anticonservative small-sample performance of the score test in a Cox model, we substitute signed square-root likelihood ratio statistics for the score statistics, and confirm that the small-sample control of type I error is greatly improved. This test can also be applied in meta-analysis. We show in our simulation studies that this test has superior statistical power except in a few specific scenarios, as compared to burden tests in a Cox model. We also present results in an application to time-to-obesity using genotypes from Framingham Heart Study SNP Health Association Resource.
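The linear-kernel identity mentioned in this abstract can be checked numerically. The sketch below is only an illustration under simplifying assumptions: it uses a continuous trait with an intercept-only null model (not the Cox model the paper targets, and not its signed square-root likelihood ratio substitution), simulated genotypes, and hypothetical function names. It shows that the kernel form of the statistic equals a weighted sum of squared per-variant score contributions.

```python
import numpy as np

def linear_kernel_statistic(G, y, weights):
    """Kernel form: Q = r' (G W G') r, with r the null-model residuals."""
    resid = y - y.mean()              # intercept-only null model (assumption)
    K = (G * weights) @ G.T           # weighted linear kernel G W G'
    return float(resid @ K @ resid)

def weighted_score_sum(G, y, weights):
    """Score form: Q = sum_j w_j * S_j^2, with S_j = g_j' r."""
    resid = y - y.mean()
    scores = G.T @ resid              # one unnormalized score per variant
    return float(np.sum(weights * scores ** 2))

# Simulated data, purely illustrative
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.05, size=(200, 10)).astype(float)  # rare-variant genotypes
y = rng.normal(size=200)                                  # continuous trait
weights = rng.uniform(0.5, 2.0, size=10)                  # arbitrary positive weights

q_kernel = linear_kernel_statistic(G, y, weights)
q_scores = weighted_score_sum(G, y, weights)
print(q_kernel, q_scores)
```

The two values agree up to floating-point error, which is the algebraic reason a linear-kernel test can be assembled from single-marker statistics.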
Affiliation(s)
- Han Chen
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America; Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
|
48
|
Kaklamani VG, Hoffmann TJ, Thornton TA, Hayes G, Chlebowski R, Van Horn L, Mantzoros C. Adiponectin pathway polymorphisms and risk of breast cancer in African Americans and Hispanics in the Women's Health Initiative. Breast Cancer Res Treat 2013; 139:461-8. [PMID: 23624817 PMCID: PMC3773607 DOI: 10.1007/s10549-013-2546-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/19/2013] [Accepted: 04/22/2013] [Indexed: 12/29/2022]
Abstract
Adiponectin, a protein secreted by the adipose tissue, is an endogenous insulin sensitizer with circulating levels that are decreased in obese and diabetic subjects. Recently, circulating levels of adiponectin have been correlated with breast cancer risk. Our previous work showed that polymorphisms of the adiponectin pathway are associated with breast cancer risk. We conducted the first study of adiponectin pathways in African Americans and Hispanics in the Women's Health Initiative SNP Health Association Resource cohort of 3,642 self-identified Hispanic women and 8,515 self-identified African American women who provided consent for DNA analysis. Single nucleotide polymorphisms (SNPs) from three genes were included in this analysis: ADIPOQ, ADIPOR1, and ADIPOR2. The genome-wide human SNP array 6.0 (909,622 SNPs) (www.affymetrix.com) was used. We found that rs1501299, a functional SNP of ADIPOQ that we previously reported was associated with breast cancer risk in a mostly Caucasian population, was also significantly associated with breast cancer incidence (HR for the GG/TG genotype: 1.23; 95% CI 1.059-1.43) in African American women. We did not find any other SNPs in these genes to be associated with breast cancer incidence. This is the first study assessing the role of adiponectin pathway SNPs in breast cancer risk in African Americans and Hispanics. rs1501299 is significantly associated with breast cancer risk in African American women. As the rates of obesity and diabetes increase in African Americans and Hispanics, adiponectin and its functional SNPs may aid in breast cancer risk assessment.
Affiliation(s)
- Virginia G Kaklamani
- Division Hematology/Oncology, Department of Medicine, Northwestern University, Chicago, IL 60611, USA.
|
49
|
Wu MC, Maity A, Lee S, Simmons EM, Harmon QE, Lin X, Engel SM, Molldrem JJ, Armistead PM. Kernel machine SNP-set testing under multiple candidate kernels. Genet Epidemiol 2013; 37:267-75. [PMID: 23471868 PMCID: PMC3769109 DOI: 10.1002/gepi.21715] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 10/04/2012] [Revised: 01/15/2013] [Accepted: 02/05/2013] [Indexed: 11/10/2022]
Abstract
Joint testing for the cumulative effect of multiple single-nucleotide polymorphisms grouped on the basis of prior biological knowledge has become a popular and powerful strategy for the analysis of large-scale genetic association studies. The kernel machine (KM)-testing framework is a useful approach that has been proposed for testing associations between multiple genetic variants and many different types of complex traits by comparing pairwise similarity in phenotype between subjects to pairwise similarity in genotype, with similarity in genotype defined via a kernel function. An advantage of the KM framework is its flexibility: choosing different kernel functions allows for different assumptions concerning the underlying model and can allow for improved power. In practice, it is difficult to know which kernel to use a priori because this depends on the unknown underlying trait architecture and selecting the kernel which gives the lowest P-value can lead to inflated type I error. Therefore, we propose practical strategies for KM testing when multiple candidate kernels are present based on constructing composite kernels and based on efficient perturbation procedures. We demonstrate through simulations and real data applications that the procedures protect the type I error rate and can lead to substantially improved power over poor choices of kernels and only modest differences in power vs. using the best candidate kernel.
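The composite-kernel idea described in this abstract can be sketched as follows. This is a rough illustration, not the authors' procedure: the two candidate kernels (linear and identity-by-state), the trace scaling, the equal-weight average, and all function names are assumptions made for the example, and a plain permutation test on an intercept-only null model stands in for the paper's perturbation procedures.

```python
import numpy as np

def linear_kernel(G):
    """Linear kernel on genotypes coded 0/1/2."""
    return G @ G.T

def ibs_kernel(G):
    """Identity-by-state kernel: average allele sharing across variants."""
    p = G.shape[1]
    diff = np.abs(G[:, None, :] - G[None, :, :]).sum(axis=2)
    return (2 * p - diff) / (2 * p)

def composite_kernel(kernels):
    """Equal-weight average of candidate kernels, each scaled to unit trace."""
    scaled = [K / np.trace(K) for K in kernels]
    return sum(scaled) / len(scaled)

def km_statistic(K, y):
    """Q = r' K r: phenotype similarity compared against genotype similarity."""
    resid = y - y.mean()   # intercept-only null model (assumption)
    return float(resid @ K @ resid)

def permutation_pvalue(K, y, n_perm=500, seed=0):
    """Permutation p-value, used here in place of a perturbation procedure."""
    rng = np.random.default_rng(seed)
    observed = km_statistic(K, y)
    null = np.array([km_statistic(K, rng.permutation(y)) for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

# Simulated data, purely illustrative
rng = np.random.default_rng(1)
G = rng.binomial(2, 0.3, size=(60, 8)).astype(float)
y = rng.normal(size=60)

K = composite_kernel([linear_kernel(G), ibs_kernel(G)])
p_value = permutation_pvalue(K, y)
print(p_value)
```

Testing once with a single pre-specified composite kernel avoids the selection step (picking the kernel with the smallest P-value) that the abstract identifies as a source of inflated type I error.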
Affiliation(s)
- Michael C Wu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7420, USA.
|
50
|
Marshall AL, Christiani DC. Genetic susceptibility to lung cancer--light at the end of the tunnel? Carcinogenesis 2013; 34:487-502. [PMID: 23349013 DOI: 10.1093/carcin/bgt016] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Indexed: 01/10/2023] Open
Abstract
Lung cancer is one of the most common and deadliest cancers in the world. The major socio-environmental risk factor involved in the development of lung cancer is cigarette smoking. Additionally, there are multiple genetic factors, which may also play a role in lung cancer risk. Early work focused on the presence of relatively prevalent but low-penetrance alterations in candidate genes leading to increased risk of lung cancer. Development of new technologies such as genomic profiling and genome-wide association studies has been helpful in the detection of new genetic variants likely involved in lung cancer risk. In this review, we discuss the role of multiple genetic variants and review their putative role in the risk of lung cancer. Identifying genetic biomarkers and patterns of genetic risk may be useful in the earlier detection and treatment of lung cancer patients.
Affiliation(s)
- Ariela L Marshall
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
|