1
Liu L, Ren D, Li K, Ji L, Feng M, Li Z, Meng L, He G, Shi Y. Unraveling schizophrenia's genetic complexity through advanced causal inference and chromatin 3D conformation. Schizophr Res 2024; 270:476-485. [PMID: 38996525 DOI: 10.1016/j.schres.2024.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 07/01/2024] [Accepted: 07/03/2024] [Indexed: 07/14/2024]
Abstract
Schizophrenia is a polygenic complex disease with a heritability as high as 80%, yet the mechanism of polygenic interaction in its pathogenesis remains unclear. Studying the interaction and regulation of schizophrenia susceptibility genes is crucial for unraveling the pathogenesis of schizophrenia and developing antipsychotic drugs. Therefore, we developed a bioinformatics method named GRACI (Gene Regulation Analysis based on Causal Inference) based on the principles of information theory, a causal inference model, and high-order chromatin 3D conformation. GRACI captures the interaction and regulatory relationships between schizophrenia susceptibility genes by analyzing genotyping data. Two datasets, comprising 1459 and 2065 samples respectively, were analyzed, and the gene networks from both datasets were constructed. GRACI showed superior accuracy compared with widely adopted methods for detecting gene-gene interactions and intergenic regulation. This alignment was further substantiated by its correlation with chromatin high-order conformation patterns. Using GRACI, we identified three potential genes (KCNN3, KCNH1, and KCND3) that are directly associated with schizophrenia pathogenesis. Furthermore, the results of GRACI on the standalone dataset illustrated the method's applicability to other complex diseases. GRACI download: https://github.com/liuliangjie19/GRACI.
Affiliation(s)
- Liangjie Liu
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Decheng Ren
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Keyi Li
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Lei Ji
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Mofan Feng
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Zhuoheng Li
  - Department of Electrical Engineering and Computer Science, University of Michigan, 1301 Beal Avenue, Ann Arbor, MI 48109, USA
- Luming Meng
  - Key Laboratory for Biobased Materials and Energy of Ministry of Education, College of Materials and Energy, South China Agricultural University, Guangzhou 510630, China
- Guang He
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
- Yi Shi
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Shanghai Key Laboratory of Psychotic Disorders, and Brain Science and Technology Research Center, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China; Research Institute for Doping Control, Shanghai University of Sport, Shanghai 200438, China
2
Wang D, Perera D, He J, Cao C, Kossinna P, Li Q, Zhang W, Guo X, Platt A, Wu J, Zhang Q. cLD: Rare-variant linkage disequilibrium between genomic regions identifies novel genomic interactions. PLoS Genet 2023; 19:e1011074. [PMID: 38109434 PMCID: PMC10758262 DOI: 10.1371/journal.pgen.1011074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 01/01/2024] [Accepted: 11/20/2023] [Indexed: 12/20/2023] [Open Access]
Abstract
Linkage disequilibrium (LD) is a fundamental concept in genetics, critical for studying genetic associations and molecular evolution. However, LD measurements are only reliable for common genetic variants, leaving low-frequency variants unanalyzed. In this work, we introduce cumulative LD (cLD), a stable statistic that captures the rare-variant LD between genetic regions, which reflects more biological interactions between variants, in addition to lack of recombination. We derived the theoretical variance of cLD using the delta method to demonstrate its higher stability than LD for rare variants. This property is also verified by bootstrapped simulations using real data. In application, we find cLD reveals an increased genetic association between genes in 3D chromatin interactions, a phenomenon recently reported negatively by calculating standard LD between common variants. Additionally, we show that cLD is higher between gene pairs reported in interaction databases, identifies unreported protein-protein interactions, and reveals interacting genes distinguishing case/control samples in association studies.
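As background for the comparison this abstract draws, the sketch below computes the standard pairwise LD statistics (D and r²) between two biallelic sites from phased haplotypes. It illustrates conventional LD only, not the authors' cLD statistic, whose definition is not given in this listing. The function name `pairwise_ld` and the 0/1 haplotype encoding are assumptions made for illustration.

```python
import math


def pairwise_ld(hap_a, hap_b):
    """Return (D, r2) for two biallelic sites.

    hap_a and hap_b are equal-length sequences of 0/1 alleles, one entry
    per phased haplotype (hypothetical encoding: 1 = alternate allele).
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at site A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # coefficient of disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either site is monomorphic in the sample.
    r2 = d * d / denom if denom > 0 else math.nan
    return d, r2


if __name__ == "__main__":
    print(pairwise_ld([1, 1, 0, 0], [1, 1, 0, 0]))  # perfectly linked sites
    print(pairwise_ld([1, 1, 0, 0], [1, 0, 1, 0]))  # independent sites
```

When the alternate allele is carried by only one or two haplotypes, moving a single carrier changes r² substantially, which is consistent with the instability at low frequencies that the abstract describes.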
Affiliation(s)
- Dinghao Wang
  - Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
- Deshan Perera
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, Alberta, Canada
- Jingni He
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, Alberta, Canada
- Chen Cao
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, Alberta, Canada
- Pathum Kossinna
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, Alberta, Canada
- Qing Li
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, Alberta, Canada
- William Zhang
  - The Harker School, San Jose, California, United States of America
- Xingyi Guo
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Alexander Platt
  - Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Jingjing Wu
  - Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
- Qingrun Zhang
  - Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, Alberta, Canada
  - Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
3
Hébert F, Causeur D, Emily M. Omnibus testing approach for gene-based gene-gene interaction. Stat Med 2022; 41:2854-2878. [PMID: 35338506 DOI: 10.1002/sim.9389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2020] [Revised: 03/03/2022] [Accepted: 03/04/2022] [Indexed: 11/07/2022]
Abstract
Genetic interaction is considered one of the main heritable components of complex traits. With the emergence of genome-wide association studies (GWAS), a collection of statistical methods dedicated to the identification of interaction at the SNP level has been proposed. More recently, gene-based gene-gene interaction testing has emerged as an attractive alternative, as it confers advantages in both statistical power and biological interpretation. Most of the gene-based interaction methods rely on a multidimensional modeling of the interaction, thus facing a lack of robustness against the huge space of interaction patterns. In this paper, we study global testing approaches to address the issue of gene-based gene-gene interaction. Based on a logistic regression modeling framework, all SNP-SNP interaction tests are combined to produce a gene-level test for interaction. We propose an omnibus test that takes advantage of (1) the heterogeneity between existing global tests and (2) the complementarity between allele-based and genotype-based coding of SNPs. Through an extensive simulation study, it is demonstrated that the proposed omnibus test has the ability to detect with high power the most common interaction genetic models with one causal pair as well as more complex genetic models where more than one causal pair is involved. On the other hand, the proposed approach is shown to be robust and to improve power compared to single global tests in replication studies. Furthermore, the application of our procedure to real datasets confirms the adaptability of our approach to replicate various gene-gene interactions.
Affiliation(s)
- Florian Hébert
  - Department of Statistics and Computer Science, Institut Agro, CNRS, IRMAR, Univ Rennes, F-35000, Rennes, France
- David Causeur
  - Department of Statistics and Computer Science, Institut Agro, CNRS, IRMAR, Univ Rennes, F-35000, Rennes, France
- Mathieu Emily
  - Department of Statistics and Computer Science, Institut Agro, CNRS, IRMAR, Univ Rennes, F-35000, Rennes, France
4
Guo Y, Cheng H, Yuan Z, Liang Z, Wang Y, Du D. Testing Gene-Gene Interactions Based on a Neighborhood Perspective in Genome-wide Association Studies. Front Genet 2021; 12:801261. [PMID: 34956337 PMCID: PMC8693929 DOI: 10.3389/fgene.2021.801261] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/25/2021] [Accepted: 11/15/2021] [Indexed: 12/21/2022] [Open Access]
Abstract
Unexplained genetic variation that causes complex diseases is often induced by gene-gene interactions (GGIs). Gene-based methods are among the current statistical methodologies for discovering GGIs in case-control genome-wide association studies, and they are not only statistically powerful but also biologically interpretable. However, most approaches include assumptions about the form of GGIs, which results in poor statistical performance. We therefore propose a gene-based test built on the maximal neighborhood coefficient (MNC), called gene-based gene-gene interaction through a maximal neighborhood coefficient (GBMNC). MNC is a metric for capturing a wide range of relationships between two random vectors with arbitrary, but not necessarily equal, dimensions. We established a statistic that leverages the difference in MNC in case and in control samples as an indication of the existence of GGIs, based on the assumption that the joint distribution of two genes in cases and controls should not be substantially different if there is no interaction between them. We then used a permutation-based statistical test to evaluate this statistic and calculate a p-value to represent the significance of the interaction. Experimental results using both simulated and real data showed that our approach outperformed earlier methods for detecting GGIs.
Affiliation(s)
- Yingjie Guo
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Honghong Cheng
  - School of Information, Shanxi University of Finance and Economics, Taiyuan, China
- Zhian Yuan
  - Research Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China
- Zhen Liang
  - School of Life Science, Shanxi University, Taiyuan, China
- Yang Wang
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Debing Du
  - Beidahuang Industry Group General Hospital, Harbin, China
5
Guo Y, Wu C, Yuan Z, Wang Y, Liang Z, Wang Y, Zhang Y, Xu L. Gene-Based Testing of Interactions Using XGBoost in Genome-Wide Association Studies. Front Cell Dev Biol 2021; 9:801113. [PMID: 34977040 PMCID: PMC8716787 DOI: 10.3389/fcell.2021.801113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2021] [Accepted: 11/23/2021] [Indexed: 11/30/2022] [Open Access]
Abstract
Among the myriad statistical methods that identify gene–gene interactions in qualitative genome-wide association studies, gene-based methods are not only statistically powerful but also biologically interpretable. However, their statistical power is limited by the assumptions they make about the association between traits and single nucleotide polymorphisms. Thus, a gene-based method derived from XGBoost (GGInt-XGBoost) is proposed in this article. Assuming that the log odds ratio of disease traits satisfies an additive relationship if a pair of genes has no interaction, the difference in error between the XGBoost models with and without the additive constraint can indicate gene–gene interaction; we then used a permutation-based statistical test to assess this difference and to provide a p-value to represent the significance of the interaction. Experimental results on both simulated and real data showed that our approach outperformed previous methods in detecting gene–gene interactions.
Affiliation(s)
- Yingjie Guo
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Chenxi Wu
  - Department of Mathematics, University of Wisconsin-Madison, Madison, WI, United States
- Zhian Yuan
  - Research Institute of Big Data Science and Industry, Shanxi University, Taiyuan, China
- Yansu Wang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Zhen Liang
  - School of Life Science, Shanxi University, Taiyuan, China
- Yang Wang
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- Yi Zhang
  - Beidahuang Industry Group General Hospital, Harbin, China
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
- *Correspondence: Yi Zhang; Lei Xu
6
Won JH, Zhou H, Lange K. Orthogonal Trace-Sum Maximization: Applications, Local Algorithms, and Global Optimality. SIAM J Matrix Anal Appl 2021; 42:859-882. [PMID: 34776610 PMCID: PMC8589322 DOI: 10.1137/20m1363388] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/29/2023]
Abstract
This paper studies the problem of maximizing the sum of traces of matrix quadratic forms on a product of Stiefel manifolds. This orthogonal trace-sum maximization (OTSM) problem generalizes many interesting problems such as generalized canonical correlation analysis (CCA), Procrustes analysis, and cryo-electron microscopy of the Nobel prize fame. For these applications finding global solutions is highly desirable, but it has been unclear how to find even a stationary point, let alone test its global optimality. Through a close inspection of Ky Fan's classical result [Proc. Natl. Acad. Sci. USA, 35 (1949), pp. 652-655] on the variational formulation of the sum of largest eigenvalues of a symmetric matrix, and a semidefinite programming (SDP) relaxation of the latter, we first provide a simple method to certify global optimality of a given stationary point of OTSM. This method only requires testing whether a symmetric matrix is positive semidefinite. A by-product of this analysis is an unexpected strong duality between Shapiro and Botha [SIAM J. Matrix Anal. Appl., 9 (1988), pp. 378-383] and Zhang and Singer [Linear Algebra Appl., 524 (2017), pp. 159-181]. After showing that a popular algorithm for generalized CCA and Procrustes analysis may generate oscillating iterates, we propose a simple fix that provably guarantees convergence to a stationary point. The combination of our algorithm and certificate reveals novel global optima of various instances of OTSM.
Affiliation(s)
- Joong-Ho Won
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
- Hua Zhou
  - Department of Biostatistics, University of California, Los Angeles, CA 90095-1766 USA
- Kenneth Lange
  - Departments of Computational Medicine, Human Genetics, and Statistics, University of California, Los Angeles, CA 90095 USA
7
Ashad Alam M, Komori O, Deng HW, Calhoun VD, Wang YP. Robust kernel canonical correlation analysis to detect gene-gene co-associations: A case study in genetics. J Bioinform Comput Biol 2020; 17:1950028. [PMID: 31617462 DOI: 10.1142/s0219720019500288] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
The kernel canonical correlation analysis based U-statistic (KCCU) is used to detect nonlinear gene-gene co-associations. Estimating the variance of the KCCU, however, is computationally intensive. In addition, kernel canonical correlation analysis (kernel CCA) is not robust to contaminated data. Using a robust kernel mean element and a robust kernel (cross)-covariance operator potentially enables the use of a robust kernel CCA, which is studied in this paper. We first propose an influence function-based estimator for the variance of the KCCU. We then present a non-parametric robust KCCU, which is designed for dealing with contaminated data. The robust KCCU is less sensitive to noise than KCCU. We investigate the proposed method using both synthesized and real data from the Mind Clinical Imaging Consortium (MCIC). We show through simulation studies that the power of the proposed methods is a monotonically increasing function of sample size, and the robust test statistics bring incremental gains in power. To demonstrate the advantage of the robust kernel CCA, we study MCIC data among 22,442 candidate Schizophrenia genes for gene-gene co-associations. We select 768 genes with strong evidence for shedding light on gene-gene interaction networks for Schizophrenia. By performing gene ontology enrichment analysis, pathway analysis, gene-gene network and other studies, the proposed robust methods can find undiscovered genes in addition to significant gene pairs, and demonstrate superior performance over several current approaches.
Affiliation(s)
- Md Ashad Alam
  - Tulane Center of Bioinformatics and Genomics, Department of Global Biostatistics and Data Science, Tulane University, New Orleans, LA 70118, USA
- Osamu Komori
  - Department of Computer and Information Science, Seikei University, 3-3-1 Kichijojikitamachi, Musashino-shi, Tokyo 180-8633, Japan
- Hong-Wen Deng
  - Tulane Center of Bioinformatics and Genomics, Department of Global Biostatistics and Data Science, Tulane University, New Orleans, LA 70118, USA
- Vince D Calhoun
  - Tri-Institutional Center for Translational Research in Neuroimaging and Data Science, Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, GA 30302, USA
- Yu-Ping Wang
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118, USA
8
Kogan V, Millstein J, London SJ, Ober C, White SR, Naureckas ET, Gauderman WJ, Jackson DJ, Barraza-Villarreal A, Romieu I, Raby BA, Breton CV. Genetic-Epigenetic Interactions in Asthma Revealed by a Genome-Wide Gene-Centric Search. Hum Hered 2019; 83:130-152. [PMID: 30669148 PMCID: PMC7365350 DOI: 10.1159/000489765] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/25/2022] [Open Access]
Abstract
OBJECTIVES There is evidence to suggest that asthma pathogenesis is affected by both genetic and epigenetic variation independently, and there is some evidence to suggest that genetic-epigenetic interactions affect risk of asthma. However, little research has been done to identify such interactions on a genome-wide scale. The aim of this study was to identify genes with genetic-epigenetic interactions associated with asthma. METHODS Using asthma case-control data, we applied a novel nonparametric gene-centric approach to test for interactions between multiple SNPs and CpG sites simultaneously in the vicinities of 18,178 genes across the genome. RESULTS Twelve genes, PF4, ATF3, TPRA1, HOPX, SCARNA18, STC1, OR10K1, UPK1B, LOC101928523, LHX6, CHMP4B, and LANCL1, exhibited statistically significant SNP-CpG interactions (false discovery rate = 0.05). Of these, three have previously been implicated in asthma risk (PF4, ATF3, and TPRA1). Follow-up analysis revealed statistically significant pairwise SNP-CpG interactions for several of these genes, including SCARNA18, LHX6, and LOC101928523 (p = 1.33E-04, 8.21E-04, 1.11E-03, respectively). CONCLUSIONS Joint effects of genetic and epigenetic variation may play an important role in asthma pathogenesis. Statistical methods that simultaneously account for multiple variations across chromosomal regions may be needed to detect these types of effects on a genome-wide scale.
Affiliation(s)
- Vladimir Kogan
  - Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, California, USA
- Joshua Millstein
  - Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, California, USA
- Stephanie J London
  - Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, RTP, Research Triangle Park, North Carolina, USA
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, USA
- Steven R White
  - Department of Medicine, University of Chicago, Chicago, Illinois, USA
- W James Gauderman
  - Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, California, USA
- Daniel J Jackson
  - University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Albino Barraza-Villarreal
  - Department of Environmental Health, Population Health Center, National Institute of Public Health of Mexico, Cuernavaca, Mexico
- Isabelle Romieu
  - International Agency for Research on Cancer, Section of Nutrition and Metabolism, Lyon, France
- Benjamin A Raby
  - Department of Medicine, Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Carrie V Breton
  - Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, California, USA
9
Gene-Based Nonparametric Testing of Interactions Using Distance Correlation Coefficient in Case-Control Association Studies. Genes (Basel) 2018; 9:genes9120608. [PMID: 30563156 PMCID: PMC6316506 DOI: 10.3390/genes9120608] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/06/2018] [Revised: 11/24/2018] [Accepted: 11/27/2018] [Indexed: 12/12/2022] [Open Access]
Abstract
Among the various statistical methods for identifying gene-gene interactions in qualitative genome-wide association studies (GWAS), gene-based methods have recently grown in popularity because they confer advantages in both statistical power and biological interpretability. However, most of these methods make strong assumptions about the form of the relationship between traits and single-nucleotide polymorphisms, which result in limited statistical power. In this paper, we propose a gene-based method based on the distance correlation coefficient called gene-based gene-gene interaction via distance correlation coefficient (GBDcor). The distance correlation (dCor) is a measurement of the dependency between two random vectors with arbitrary, and not necessarily equal, dimensions. We used the difference in dCor in case and control datasets as an indicator of gene-gene interaction, which was based on the assumption that the joint distribution of two genes in case subjects and in control subjects should not be significantly different if the two genes do not interact. We designed a permutation-based statistical test to evaluate the difference between dCor in cases and controls for a pair of genes, and we provided the p-value for the statistic to represent the significance of the interaction between the two genes. In experiments with both simulated and real-world data, our method outperformed previous approaches in detecting interactions accurately.
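The idea described in this abstract can be sketched in a few lines: compute the sample distance correlation between two genes' SNP vectors separately in cases and in controls, take the absolute difference, and compare it with the same statistic under permuted case/control labels. This is a minimal illustration and not the authors' GBDcor implementation. The function names, the label-permutation scheme, and the default of 200 permutations are all assumptions.

```python
import math
import random


def _centered_distances(rows):
    """Double-centered Euclidean distance matrix of a list of vectors."""
    n = len(rows)
    d = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]        # equals the column means (symmetry)
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between paired vectors x[i] and y[i]."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for r in a for v in r) / n ** 2
    dvar_y = sum(v * v for r in b for v in r) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:                            # one side is constant
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / denom)


def dcor_difference_test(gene_a, gene_b, is_case, n_perm=200, seed=0):
    """Return (observed |dCor_case - dCor_control|, permutation p-value).

    Assumption: the null distribution is built by shuffling case/control labels.
    """
    def stat(labels):
        case = [i for i, c in enumerate(labels) if c]
        ctrl = [i for i, c in enumerate(labels) if not c]
        d_case = distance_correlation([gene_a[i] for i in case],
                                      [gene_b[i] for i in case])
        d_ctrl = distance_correlation([gene_a[i] for i in ctrl],
                                      [gene_b[i] for i in ctrl])
        return abs(d_case - d_ctrl)

    observed = stat(is_case)
    rng = random.Random(seed)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if stat(labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Here `gene_a[i]` and `gene_b[i]` would be the genotype vectors (for example, 0/1/2 allele counts per SNP) of subject `i` for the two genes, and the two genes may contain different numbers of SNPs. The pairwise distance matrices make this sketch quadratic in the number of subjects, so it is suited to small illustrations rather than genome-wide scans.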
10
Alam MA, Lin HY, Deng HW, Calhoun VD, Wang YP. A kernel machine method for detecting higher order interactions in multimodal datasets: Application to schizophrenia. J Neurosci Methods 2018; 309:161-174. [PMID: 30184473 DOI: 10.1016/j.jneumeth.2018.08.027] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/23/2018] [Revised: 08/12/2018] [Accepted: 08/30/2018] [Indexed: 12/20/2022]
Abstract
BACKGROUND Technological advances are enabling us to collect multimodal datasets at increasing depth and resolution and with decreasing labor. Understanding complex interactions among multimodal datasets, however, is challenging. NEW METHOD In this study, we tested the interaction effect of multimodal datasets using a novel method called the kernel machine for detecting higher order interactions among biologically relevant multimodal data. Using a semiparametric method on a reproducing kernel Hilbert space, we formulated the proposed method as a standard mixed-effects linear model and derived a score-based variance component statistic to test higher order interactions between multimodal datasets. RESULTS The method was evaluated using extensive numerical simulation and real data from the Mind Clinical Imaging Consortium with both schizophrenia patients and healthy controls. Our method identified 13 triplets that included 6 gene-derived SNPs, 10 ROIs, and 6 gene-specific DNA methylations that are correlated with the changes in hippocampal volume, suggesting that these triplets may be important for explaining schizophrenia-related neurodegeneration. COMPARISON WITH EXISTING METHOD(S) The performance of the proposed method is compared with the following methods: tests based on only the first and the first few principal components followed by multiple regression, full principal component analysis regression, and the sequence kernel association test. CONCLUSIONS With strong evidence (p-value ≤0.000001), the triplet (MAGI2, CRBLCrus1.L, FBXO28) is a significant biomarker for schizophrenia patients. This novel method is applicable to the study of other disease processes, where multimodal data analysis is a common task.
Affiliation(s)
- Md Ashad Alam
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118, USA
- Hui-Yi Lin
  - Biostatistics Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, LA 70112, USA
- Hong-Wen Deng
  - Center for Bioinformatics and Genomics, Department of Global Biostatistics and Data Science, Tulane University, New Orleans, LA 70112, USA
- Vince D Calhoun
  - Department of Electrical and Computer Engineering, The University of New Mexico, Albuquerque, NM 87131, USA
- Yu-Ping Wang
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118, USA
11
Fang YH, Wang JH, Hsiung CA. TSGSIS: a high-dimensional grouped variable selection approach for detection of whole-genome SNP-SNP interactions. Bioinformatics 2018. [PMID: 28651334 DOI: 10.1093/bioinformatics/btx409] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] [Open Access]
Abstract
Motivation Identification of single nucleotide polymorphism (SNP) interactions is an important and challenging topic in genome-wide association studies (GWAS). Many approaches have been applied to detecting whole-genome interactions. However, these approaches to interaction analysis tend to miss causal interaction effects when the individual marginal effects are uncorrelated with the trait while their interaction effects are highly associated with the trait. Results A grouped variable selection technique, called two-stage grouped sure independence screening (TS-GSIS), is developed to study interactions that may not have marginal effects. The proposed TS-GSIS is shown to be very helpful in identifying not only causal SNP effects that are uncorrelated with the trait but also their corresponding SNP-SNP interaction effects. The benefits of TS-GSIS are improved detection of interaction effects, obtained by using the joint information among the SNPs, and determination of the size of candidate sets in the model. Simulation studies under various scenarios are performed to compare the performance of TS-GSIS and current approaches. We also apply our approach to a real rheumatoid arthritis (RA) dataset. Both the simulation and real data studies show that TS-GSIS performs very well in detecting SNP-SNP interactions. Availability and implementation The R package is delivered through CRAN and is available at: https://cran.r-project.org/web/packages/TSGSIS/index.html. Contact hsiung@nhri.org.tw. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yao-Hwei Fang
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
| | - Jie-Huei Wang
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
| | - Chao A Hsiung
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 35053, Taiwan
| |
|
12
|
Abstract
BACKGROUND A large amount of research has been devoted to the detection and investigation of epistatic interactions in genome-wide association studies (GWASs). Most of the literature focuses on low-order interactions between single-nucleotide polymorphisms (SNPs) with significant main effects. RESULTS In this paper we propose an original approach for detecting epistasis at the gene level, without systematically filtering on significant genes. We first compute interaction variables for each gene pair by finding its Eigen-Epistasis component, defined as the linear combination of Gene SNPs having the highest correlation with the phenotype. The selection of significant effects is done using a penalized regression method based on Group Lasso controlling the False Discovery Rate. CONCLUSION The method is tested against two recent alternative proposals from the literature using synthetic data, and shows good performances in different settings. We demonstrate the power of our approach by detecting new gene-gene interactions on three genome-wide association studies.
|
13
|
Xu J, Yuan Z, Ji J, Zhang X, Li H, Wu X, Xue F, Liu Y. A powerful score-based test statistic for detecting gene-gene co-association. BMC Genet 2016; 17:31. [PMID: 26822525 PMCID: PMC4731962 DOI: 10.1186/s12863-016-0331-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/18/2015] [Accepted: 01/13/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The genetic variants identified by genome-wide association studies (GWAS) can account for only a small proportion of the total heritability of complex diseases. The existence of gene-gene joint effects, which contain the main effects and their co-association, is one possible explanation for the "missing heritability" problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, owing not only to the traditional interaction under a nearly independent condition but also to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and specific disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. RESULTS Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including the single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. CONCLUSIONS SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.
Affiliation(s)
- Jing Xu
- Department of Biostatistics, School of Public Health, Shandong University, 44 Wen Hua Xi Road, PO Box 100, Jinan, 250012, China.
| | - Zhongshang Yuan
- Department of Biostatistics, School of Public Health, Shandong University, 44 Wen Hua Xi Road, PO Box 100, Jinan, 250012, China.
| | - Jiadong Ji
- Department of Biostatistics, School of Public Health, Shandong University, 44 Wen Hua Xi Road, PO Box 100, Jinan, 250012, China.
| | - Xiaoshuai Zhang
- Department of Biostatistics, School of Public Health, Shandong University, 44 Wen Hua Xi Road, PO Box 100, Jinan, 250012, China.
| | - Hongkai Li
- Department of Biostatistics, School of Public Health, Shandong University, 44 Wen Hua Xi Road, PO Box 100, Jinan, 250012, China.
| | - Xuesen Wu
- Department of Epidemiology and Statistics, Bengbu Medical College at Bengbu, Anhui, 233030, China.
| | - Fuzhong Xue
- Department of Biostatistics, School of Public Health, Shandong University, 44 Wen Hua Xi Road, PO Box 100, Jinan, 250012, China.
| | - Yanxun Liu
- Department of Biostatistics, School of Public Health, Shandong University, 44 Wen Hua Xi Road, PO Box 100, Jinan, 250012, China.
| |
|
14
|
Emily M. AGGrEGATOr: A Gene-based GEne-Gene interActTiOn test for case-control association studies. Stat Appl Genet Mol Biol 2016; 15:151-71. [DOI: 10.1515/sagmb-2015-0074] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/15/2022]
Abstract
Among the large number of statistical methods that have been proposed to identify gene-gene interactions in case-control genome-wide association studies (GWAS), gene-based methods have recently grown in popularity as they confer advantages in both statistical power and biological interpretation. All of the gene-based methods jointly model the distribution of single nucleotide polymorphism (SNP) sets prior to the statistical test, leading to limited power to detect sums of SNP-SNP signals. In this paper, we instead propose a gene-based method that first performs SNP-SNP interaction tests before aggregating the obtained
|
15
|
A gene-based information gain method for detecting gene-gene interactions in case-control studies. Eur J Hum Genet 2015; 23:1566-72. [PMID: 25758991 DOI: 10.1038/ejhg.2015.16] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 09/28/2014] [Revised: 11/30/2014] [Accepted: 01/14/2015] [Indexed: 12/31/2022] Open
Abstract
Currently, most methods for detecting gene-gene interactions (GGIs) in genome-wide association studies are divided into SNP-based methods and gene-based methods. Generally, the gene-based methods can be more powerful than SNP-based methods. Some gene-based entropy methods can only capture the linear relationship between genes. We therefore proposed a nonparametric gene-based information gain method (GBIGM) that can capture both linear relationship and nonlinear correlation between genes. Through simulation with different odds ratio, sample size and prevalence rate, GBIGM was shown to be valid and more powerful than classic KCCU method and SNP-based entropy method. In the analysis of data from 17 genes on rheumatoid arthritis, GBIGM was more effective than the other two methods as it obtains fewer significant results, which was important for biological verification. Therefore, GBIGM is a suitable and powerful tool for detecting GGIs in case-control studies.
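The idea of an entropy-based gain between two genes can be sketched as follows. This is a minimal illustration, not the published GBIGM statistic: the function names and the choice of "best single-gene conditional entropy minus joint conditional entropy" as the gain are assumptions made here for illustration, and each gene's genotype is treated as one categorical label per subject.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(labels, groups):
    """H(labels | groups): entropy within each group, weighted by group size."""
    n = len(labels)
    by_group = {}
    for y, g in zip(labels, groups):
        by_group.setdefault(g, []).append(y)
    return sum(len(ys) / n * entropy(ys) for ys in by_group.values())


def interaction_gain(phenotype, gene_a, gene_b):
    """Hypothetical gain: how much more the two genes jointly reduce
    phenotype entropy than the better of the two genes does alone."""
    h_a = conditional_entropy(phenotype, gene_a)
    h_b = conditional_entropy(phenotype, gene_b)
    h_ab = conditional_entropy(phenotype, list(zip(gene_a, gene_b)))
    return min(h_a, h_b) - h_ab
```

Because the gain is computed from genotype categories rather than from a fitted linear model, a purely nonlinear pattern (for example, an XOR-like dependence of case status on two genes) still produces a positive value. In practice, significance would have to be assessed separately, for instance by permuting the phenotype labels.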
|
16
|
Hu JK, Wang X, Wang P. Testing gene-gene interactions in genome wide association studies. Genet Epidemiol 2014; 38:123-34. [PMID: 24431225 DOI: 10.1002/gepi.21786] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 06/30/2013] [Revised: 10/11/2013] [Accepted: 12/02/2013] [Indexed: 11/07/2022]
Abstract
Detection of gene-gene interaction has become increasingly popular over the past decade in genome wide association studies (GWAS). Besides traditional logistic regression analysis for detecting interactions between two markers, new methods have been developed in recent years such as comparing linkage disequilibrium (LD) in case and control groups. All these methods form the building blocks of most screening strategies for disease susceptibility loci in GWAS. In this paper, we are interested in comparing the competing methods and providing practical guidelines for selecting appropriate testing methods for interaction in GWAS. We first review a series of existing statistical methods to detect interactions, and then examine different definitions of interactions to gain insight into the theoretical relationship between the existing testing methods. Lastly, we perform extensive simulations to compare powers of various methods to detect either interaction between two markers at two unlinked loci or the overall association allowing for both interaction and main effects. This investigation reveals informative characteristics of various methods that are helpful to GWAS investigators.
Affiliation(s)
- Jie Kate Hu
- Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
| | | | | |
|
17
|
Larson NB, Jenkins GD, Larson MC, Vierkant RA, Sellers TA, Phelan CM, Schildkraut JM, Sutphen R, Pharoah PPD, Gayther SA, Wentzensen N, Goode EL, Fridley BL. Kernel canonical correlation analysis for assessing gene-gene interactions and application to ovarian cancer. Eur J Hum Genet 2014; 22:126-31. [PMID: 23591404 PMCID: PMC3865403 DOI: 10.1038/ejhg.2013.69] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 08/21/2012] [Revised: 01/11/2013] [Accepted: 01/16/2013] [Indexed: 01/24/2023] Open
Abstract
Although single-locus approaches have been widely applied to identify disease-associated single-nucleotide polymorphisms (SNPs), complex diseases are thought to be the product of multiple interactions between loci. This has led to the recent development of statistical methods for detecting statistical interactions between two loci. Canonical correlation analysis (CCA) has previously been proposed to detect gene-gene coassociation. However, this approach is limited to detecting linear relations and can only be applied when the number of observations exceeds the number of SNPs in a gene. This limitation is particularly important for next-generation sequencing, which could yield a large number of novel variants on a limited number of subjects. To overcome these limitations, we propose an approach to detect gene-gene interactions on the basis of a kernelized version of CCA (KCCA). Our simulation studies showed that KCCA controls the Type-I error, and is more powerful than leading gene-based approaches under a disease model with negligible marginal effects. To demonstrate the utility of our approach, we also applied KCCA to assess interactions between 200 genes in the NF-κB pathway in relation to ovarian cancer risk in 3869 cases and 3276 controls. We identified 13 significant gene pairs relevant to ovarian cancer risk (local false discovery rate <0.05). Finally, we discuss the advantages of KCCA in gene-gene interaction analysis and its future role in genetic association studies.
Affiliation(s)
- Nicholas B Larson
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - Gregory D Jenkins
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - Melissa C Larson
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - Robert A Vierkant
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | | | | | | | - Rebecca Sutphen
- Department of Pediatrics, University of South Florida College of Medicine, Tampa, FL, USA
| | | | - Simon A Gayther
- Department of Preventative Medicine, University of Southern California, Los Angeles, CA, USA
| | - Nicolas Wentzensen
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Ovarian Cancer Association Consortium
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Cancer Epidemiology, Moffitt Cancer Center, Tampa, FL, USA
- Duke Comprehensive Cancer Center, Duke University, Durham, NC, USA
- Department of Pediatrics, University of South Florida College of Medicine, Tampa, FL, USA
- Department of Oncology, University of Cambridge, Cambridge, UK
- Department of Preventative Medicine, University of Southern California, Los Angeles, CA, USA
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, KS, USA
| | - Ellen L Goode
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - Brooke L Fridley
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, KS, USA
| |
|
18
|
Winham SJ, Biernacka JM. Gene-environment interactions in genome-wide association studies: current approaches and new directions. J Child Psychol Psychiatry 2013; 54:1120-34. [PMID: 23808649 PMCID: PMC3829379 DOI: 10.1111/jcpp.12114] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Accepted: 05/03/2013] [Indexed: 01/20/2023]
Abstract
BACKGROUND Complex psychiatric traits have long been thought to be the result of a combination of genetic and environmental factors, and gene-environment interactions are thought to play a crucial role in behavioral phenotypes and the susceptibility and progression of psychiatric disorders. Candidate gene studies to investigate hypothesized gene-environment interactions are now fairly common in human genetic research, and with the shift toward genome-wide association studies, genome-wide gene-environment interaction studies are beginning to emerge. METHODS We summarize the basic ideas behind gene-environment interaction, and provide an overview of possible study designs and traditional analysis methods in the context of genome-wide analysis. We then discuss novel approaches beyond the traditional strategy of analyzing the interaction between the environmental factor and each polymorphism individually. RESULTS Two-step filtering approaches that reduce the number of polymorphisms tested for interactions can substantially increase the power of genome-wide gene-environment studies. New analytical methods including data-mining approaches, and gene-level and pathway-level analyses, also have the capacity to improve our understanding of how complex genetic and environmental factors interact to influence psychologic and psychiatric traits. Such methods, however, have not yet been utilized much in behavioral and mental health research. CONCLUSIONS Although methods to investigate gene-environment interactions are available, there is a need for further development and extension of these methods to identify gene-environment interactions in the context of genome-wide association studies. These novel approaches need to be applied in studies of psychology and psychiatry.
Affiliation(s)
- Stacey J Winham
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester MN 55905
| | - Joanna M. Biernacka
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester MN 55905,Department of Psychiatry and Psychology, Mayo Clinic, Rochester MN 55905
| |
|
19
|
From interaction to co-association --a Fisher r-to-z transformation-based simple statistic for real world genome-wide association study. PLoS One 2013; 8:e70774. [PMID: 23923021 PMCID: PMC3726765 DOI: 10.1371/journal.pone.0070774] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/18/2013] [Accepted: 06/21/2013] [Indexed: 12/21/2022] Open
Abstract
Currently, the genetic variants identified by genome-wide association studies (GWAS) generally account for only a small proportion of the total heritability of complex diseases. One crucial reason is the underutilization of the gene-gene joint effects commonly encountered in GWAS, which include the main effects and their co-association. However, gene-gene co-association is often loosely placed within the framework of gene-gene interaction. From the causal graph perspective, we elucidate in detail the concept and rationale of gene-gene co-association as well as its relationship with traditional gene-gene interaction, and propose two simple statistics based on the Fisher r-to-z transformation to detect it. Three series of simulations further highlight that gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, owing not only to the traditional interaction under the nearly independent condition but also to the correlation between the two genes. The proposed statistics are more powerful than logistic regression under various situations, are not affected by linkage disequilibrium, and have an acceptable false positive rate as long as a reasonable GWAS data analysis roadmap is strictly followed. Furthermore, an application to gene pathway analysis associated with leprosy confirms in practice that our proposed gene-gene co-association concept and the correspondingly proposed statistics are strongly in line with reality.
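The Fisher r-to-z transformation mentioned above can be illustrated with the standard two-sample comparison of correlation coefficients. This is a minimal sketch under stated assumptions: it compares the correlation between two genotype-derived scores in cases against the same correlation in controls, and it is the textbook test rather than either of the statistics proposed in the paper. The function names are hypothetical.

```python
from math import atanh, erfc, sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fisher_z_difference(r_case, n_case, r_ctrl, n_ctrl):
    """Two-sample test for a difference between correlations.

    Each r is mapped to z = atanh(r), whose sampling variance is
    approximately 1 / (n - 3). Returns the standardized difference and
    a two-sided p-value from the standard normal distribution.
    """
    se = sqrt(1 / (n_case - 3) + 1 / (n_ctrl - 3))
    z = (atanh(r_case) - atanh(r_ctrl)) / se
    p = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p
```

A typical use would compute `pearson_r` separately within cases and within controls and pass the two coefficients, together with the group sizes, to `fisher_z_difference`. Both group sizes must exceed 3 for the variance approximation to be defined.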
|
20
|
Zhang X, Yang X, Yuan Z, Liu Y, Li F, Peng B, Zhu D, Zhao J, Xue F. A PLSPM-based test statistic for detecting gene-gene co-association in genome-wide association study with case-control design. PLoS One 2013; 8:e62129. [PMID: 23620809 PMCID: PMC3631168 DOI: 10.1371/journal.pone.0062129] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/22/2012] [Accepted: 03/19/2013] [Indexed: 12/22/2022] Open
Abstract
For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to the effects not only due to the traditional interaction under nearly independent condition but the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic has better performance than single SNP-based logistic model, PCA-based logistic model, and other gene-based methods.
Affiliation(s)
- Xiaoshuai Zhang
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Xiaowei Yang
- Hunter College - School of Public Health, City University of New York, New York City, New York, United States of America
- Bayessoft, Inc., Davis, California, United States of America
| | - Zhongshang Yuan
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Yanxun Liu
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Fangyu Li
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Bin Peng
- School of Public Health, Chongqing Medical University, Chongqing, China
| | - Dianwen Zhu
- Hunter College - School of Public Health, City University of New York, New York City, New York, United States of America
| | - Jinghua Zhao
- MRC Epidemiology Unit and Institute of Metabolic Science, Cambridge, United Kingdom
| | - Fuzhong Xue
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
- * E-mail:
| |
|
21
|
Yuan Z, Gao Q, He Y, Zhang X, Li F, Zhao J, Xue F. Detection for gene-gene co-association via kernel canonical correlation analysis. BMC Genet 2012; 13:83. [PMID: 23039928 PMCID: PMC3506484 DOI: 10.1186/1471-2156-13-83] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 03/18/2012] [Accepted: 09/28/2012] [Indexed: 11/15/2022] Open
Abstract
Background Currently, most methods for detecting gene-gene interaction (GGI) in genomewide association studies (GWASs) are limited in their use of single nucleotide polymorphism (SNP) as the unit of association. One way to address this drawback is to consider higher level units such as genes or regions in the analysis. Earlier we proposed a statistic based on canonical correlations (CCU) as a gene-based method for detecting gene-gene co-association. However, it can only capture linear relationship and not nonlinear correlation between genes. We therefore proposed a counterpart (KCCU) based on kernel canonical correlation analysis (KCCA). Results Through simulation the KCCU statistic was shown to be a valid test and more powerful than CCU statistic with respect to sample size and interaction odds ratio. Analysis of data from regions involving three genes on rheumatoid arthritis (RA) from Genetic Analysis Workshop 16 (GAW16) indicated that only KCCU statistic was able to identify interactions reported earlier. Conclusions KCCU statistic is a valid and powerful gene-based method for detecting gene-gene co-association.
Affiliation(s)
- Zhongshang Yuan
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, 250012, China
| | | | | | | | | | | | | |
|
22
|
Kumar D, Chakraborty J, Das S. Epistatic effects between variants of kappa-opioid receptor gene and A118G of mu-opioid receptor gene increase susceptibility to addiction in Indian population. Prog Neuropsychopharmacol Biol Psychiatry 2012; 36:225-30. [PMID: 22138325 DOI: 10.1016/j.pnpbp.2011.10.018] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 07/05/2011] [Revised: 10/01/2011] [Accepted: 10/31/2011] [Indexed: 11/30/2022]
Abstract
OBJECTIVE Unequivocal evidence suggests a contribution of the κ-opioid receptor (KOR) to addiction to drugs of abuse. A study was undertaken to identify the single nucleotide polymorphisms (SNPs) in selected regions of the kappa opioid receptor 1 (OPRK1) gene in heroin and alcohol addicts and to compare them with those in a control population. The potential interaction of the identified KOR SNPs with A118G of the μ-opioid receptor was also investigated. METHODS Two hundred control subjects, one hundred thirty heroin addicts and one hundred ten alcohol addicts, all male and residing in Kolkata, a city in eastern India, volunteered for the study. Exons 3 and 4 of OPRK1 and the SNP A118G of mu opioid receptor 1 (OPRM1) in the DNA samples were genotyped by sequencing and restriction fragment length polymorphism, respectively. The SNPs identified in the population were analyzed by odds ratio, and the corresponding 95% confidence interval was estimated using logistic regression models. SNP-SNP interactions were also investigated. RESULTS Three SNPs of OPRK1, rs16918875, rs702764 and rs963549, were identified in the population, none of which showed a significant association with addiction. On the other hand, a significant association was observed for A118G with heroin addiction (χ²=7.268, P=0.0264) as well as with alcohol addiction (χ²=6.626, P=0.0364). A potential SNP-SNP interaction showed that the odds of being addicted were 2.51-fold higher in heroin subjects [CI (95%)=1.1524 to 5.4947, P=0.0206] and 2.31-fold higher in alcoholics [CI (95%)=1.025 to 5.24, P=0.0433] with the OPRK1 (rs16918875) and A118G risk alleles than without either. A significant interaction was also identified between GG/AG of A118G and GG of rs702764 [O.R (95%)=2.04 (1.279 to 3.287), P=0.0029] in the opioid population. CONCLUSION Our study suggests that set associations of polymorphisms may be important in determining the risk profile for complex diseases such as addiction.
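For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the following sketch computes both from a 2×2 table using the log-odds (Woolf) approximation. This is a simpler stand-in for the logistic regression models used in the study, and the function name and cell layout are assumptions made here for illustration. No counts from the study are used.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate confidence interval (Woolf method).

    a: cases carrying the risk genotype     b: cases without it
    c: controls carrying the risk genotype  d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval whose lower bound is above 1, as in the intervals reported in the abstract, indicates that the odds of the outcome are significantly higher in carriers at the chosen level. For a single binary predictor, an unadjusted logistic regression gives the same point estimate as the 2×2 table.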
Affiliation(s)
- Deepak Kumar
- Neurobiology Division, Indian Institute of Chemical Biology, 4 Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, India
| | | | | |
|
23
|
Xue F, Li S, Luan J, Yuan Z, Luben RN, Khaw KT, Wareham NJ, Loos RJF, Zhao JH. A latent variable partial least squares path modeling approach to regional association and polygenic effect with applications to a human obesity study. PLoS One 2012; 7:e31927. [PMID: 22384102 PMCID: PMC3288051 DOI: 10.1371/journal.pone.0031927] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 08/23/2011] [Accepted: 01/18/2012] [Indexed: 01/10/2023] Open
Abstract
Genetic association studies are now routinely used to identify single nucleotide polymorphisms (SNPs) linked with human diseases or traits through single SNP-single trait tests. Here we introduced partial least squares path modeling (PLSPM) for association between single or multiple SNPs and a latent trait that can involve single or multiple correlated measurement(s). Furthermore, the framework naturally provides estimators of polygenic effect by appropriately weighting trait-attributing alleles. We conducted computer simulations to assess the performance via multiple SNPs and human obesity-related traits as measured by body mass index (BMI) and waist and hip circumferences. Our results showed that the association statistics had type I error rates close to the nominal level and were powerful for a range of effect and sample sizes. When applied to 12 candidate regions in data (N = 2,417) from the European Prospective Investigation of Cancer (EPIC)-Norfolk study, a region in FTO was found to have a stronger association (rs7204609~rs9939881 at the first intron, P = 4.29×10⁻⁷) than in single SNP analysis (all with P > 10⁻⁴), and a latent quantitative phenotype was obtained using a subset sample of EPIC-Norfolk (N = 12,559). We believe our method is appropriate for assessment of regional association and polygenic effect on single or multiple traits.
Affiliation(s)
- Fuzhong Xue
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
- MRC Epidemiology Unit and Institute of Metabolic Science, Cambridge, United Kingdom
| | - Shengxu Li
- Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana, United States of America
| | - Jian'an Luan
- MRC Epidemiology Unit and Institute of Metabolic Science, Cambridge, United Kingdom
| | - Zhongshang Yuan
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Robert N. Luben
- Strangeways Research Laboratory, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
| | - Kay-Tee Khaw
- Clinical Gerontology Unit, School of Clinical Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Nicholas J. Wareham
- MRC Epidemiology Unit and Institute of Metabolic Science, Cambridge, United Kingdom
| | - Ruth J. F. Loos
- MRC Epidemiology Unit and Institute of Metabolic Science, Cambridge, United Kingdom
| | - Jing Hua Zhao
- MRC Epidemiology Unit and Institute of Metabolic Science, Cambridge, United Kingdom
- * E-mail:
| |
|
24
|
Gumus E, Kursun O, Sertbas A, Ustek D. Application of canonical correlation analysis for identifying viral integration preferences. Bioinformatics 2012; 28:651-5. [DOI: 10.1093/bioinformatics/bts027] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022] Open
|
25
|
|
26
|
Abstract
Over the last few years, main effect genetic association analysis has proven to be a successful tool to unravel genetic risk components to a variety of complex diseases. In the quest for disease susceptibility factors and the search for the 'missing heritability', supplementary and complementary efforts have been undertaken. These include the inclusion of several genetic inheritance assumptions in model development, the consideration of different sources of information, and the acknowledgement of disease underlying pathways of networks. The search for epistasis or gene-gene interaction effects on traits of interest is marked by an exponential growth, not only in terms of methodological development, but also in terms of practical applications, translation of statistical epistasis to biological epistasis and integration of omics information sources. The current popularity of the field, as well as its attraction to interdisciplinary teams, each making valuable contributions with sometimes rather unique viewpoints, renders it impossible to give an exhaustive review of to-date available approaches for epistasis screening. The purpose of this work is to give a perspective view on a selection of currently active analysis strategies and concerns in the context of epistasis detection, and to provide an eye to the future of gene-gene interaction analysis.
Affiliation(s)
- Kristel Van Steen
- Department of Electrical Engineering and Computer Science (Montefiore Institute), Grande Traverse, Bioinformatique 4000 Liège 1, Belgium.
| |
|