1
Park M, Shin JE, Yee J, Ahn YM, Joo EJ. Gene-gene interaction analysis for age at onset of bipolar disorder in a Korean population. J Affect Disord 2024; 361:97-103. [PMID: 38834091 DOI: 10.1016/j.jad.2024.05.152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 05/24/2024] [Accepted: 05/28/2024] [Indexed: 06/06/2024]
Abstract
BACKGROUND Multiple genes might interact to determine the age at onset of bipolar disorder. We investigated gene-gene interactions related to age at onset of bipolar disorder in the Korean population, using genome-wide association study (GWAS) data. METHODS The study population consisted of 303 patients with bipolar disorder. First, the top 1000 single-nucleotide polymorphisms (SNPs) most significantly associated with age at onset of bipolar disorder were selected through single-SNP analysis by simple linear regression. Subsequently, the quantitative multifactor dimensionality reduction (QMDR) method was used to find gene-gene interactions. RESULTS The best 10 SNPs from simple regression were located on chromosomes 1, 2, 3, 10, 11, 14, 19, and 21. Only five of these SNPs were located within genes (FOXN3, KIAA1217, OPCML, CAMSAP2, and PTPRS). On QMDR analyses, five pairs of SNPs showed significant interactions with a CVC exceeding 1/5 in a two-locus model. The best interaction was found for the pair of rs60830549 and rs12952733 (CVC = 1/5, P < 1E-07). In three-locus models, four combinations of SNPs showed significant associations with age at onset, with a CVC of >1/5. The best three-locus combination was rs60830549, rs12952733, and rs12952733 (CVC = 2/5, P < 1E-6). The SNPs showing significant interactions were located in the KIAA1217, RBFOX3, SDK2, CYP19A1, NTM, SMYD3, and RBFOX1 genes. CONCLUSIONS Our analysis confirmed genetic interactions influencing the age at onset of bipolar disorder and identified several potential candidate genes. Further exploration of the functions of these promising genes, which may have multiple roles within the neuronal network, is necessary.
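The two-locus QMDR step named in this abstract can be illustrated with a minimal Python sketch. It assumes the commonly described QMDR recipe: pool individuals by two-locus genotype, label a genotype cell as high-level when its mean trait value exceeds the overall mean, and score the SNP pair with a two-sample t statistic between the pooled high and low groups. The function name `qmdr_t_statistic`, the Welch form of the t statistic, and the toy data are assumptions for illustration, not taken from the paper; cross-validation and permutation testing are omitted.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def qmdr_t_statistic(snp_a, snp_b, trait):
    """Score one SNP pair by pooling genotype cells into high/low groups."""
    overall_mean = mean(trait)

    # Collect trait values for every observed two-locus genotype cell.
    cells = defaultdict(list)
    for a, b, y in zip(snp_a, snp_b, trait):
        cells[(a, b)].append(y)

    # A cell is "high" when its mean trait exceeds the overall mean.
    high, low = [], []
    for values in cells.values():
        (high if mean(values) > overall_mean else low).extend(values)

    # A t statistic needs at least two observations in each pooled group.
    if len(high) < 2 or len(low) < 2:
        return 0.0

    # Welch-type two-sample t statistic (illustrative choice).
    standard_error = sqrt(variance(high) / len(high) + variance(low) / len(low))
    if standard_error == 0:
        return float("inf")
    return (mean(high) - mean(low)) / standard_error


# Hypothetical toy data: age at onset depends on the combination of two SNPs.
toy_snp_a = [0, 0, 1, 1, 0, 0, 1, 1]
toy_snp_b = [0, 1, 0, 1, 0, 1, 0, 1]
toy_age_at_onset = [20, 30, 31, 21, 22, 32, 33, 23]
toy_score = qmdr_t_statistic(toy_snp_a, toy_snp_b, toy_age_at_onset)
```

In a full analysis this score would be computed for every candidate pair among the pre-selected SNPs, and the best model would be chosen by cross-validation consistency.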
Affiliation(s)
- Mira Park
  - Department of Preventive Medicine, School of Medicine, Eulji University, Daejeon, Republic of Korea
- Ji-Eun Shin
  - Department of Biomedical Informatics, School of Medicine, Konyang University, Daejeon, Republic of Korea
- Jaeyong Yee
  - Department of Physiology and Biophysics, School of Medicine, Eulji University, Daejeon, Republic of Korea
- Yong Min Ahn
  - Department of Psychiatry, Seoul National University College of Medicine, Seoul, Republic of Korea; Department of Neuropsychiatry, Seoul National University Hospital, Seoul, Republic of Korea
- Eun-Jeong Joo
  - Department of Psychiatry, Uijeongbu Eulji Medical Center, Eulji University, Gyeonggi, Republic of Korea; Department of Neuropsychiatry, School of Medicine, Eulji University, Daejeon, Republic of Korea
2
Han L, Shen B, Wu X, Zhang J, Wen YJ. Compressed variance component mixed model reveals epistasis associated with flowering in Arabidopsis. FRONTIERS IN PLANT SCIENCE 2024; 14:1283642. [PMID: 38259933 PMCID: PMC10800901 DOI: 10.3389/fpls.2023.1283642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Accepted: 12/15/2023] [Indexed: 01/24/2024]
Abstract
Introduction Epistasis is currently a topic of great interest in molecular and quantitative genetics. Arabidopsis thaliana, as a model organism, plays a crucial role in studying the fundamental biology of diverse plant species. However, there have been limited reports about identification of epistasis related to flowering in genome-wide association studies (GWAS). Therefore, it is of utmost importance to conduct epistasis analysis in Arabidopsis. Method In this study, we employed Levene's test and a compressed variance component mixed model in GWAS to detect quantitative trait nucleotides (QTNs) and QTN-by-QTN interactions (QQIs) for 11 flowering-related traits of 199 Arabidopsis accessions with 216,130 markers. Results Our analysis detected 89 QTNs and 130 pairs of QQIs. Around these loci, 34 known genes previously reported in Arabidopsis were confirmed to be associated with flowering-related traits, such as SPA4, which is involved in regulating photoperiodic flowering and interacts with PAP1 and PAP2, affecting growth of Arabidopsis under light conditions. Then, among previously unreported genes, we observed significant and differential expression of 35 genes in response to variations in temperature, photoperiod, and vernalization treatments. Functional enrichment analysis revealed that 26 of these genes were associated with various biological processes. Finally, the haplotype and phenotypic difference analysis revealed 20 candidate genes exhibiting significant phenotypic variations across gene haplotypes, of which the candidate genes AT1G12990 and AT1G09950 around QQIs might have an interaction effect on flowering time regulation in Arabidopsis. Discussion These findings may offer valuable insights for the identification and exploration of genes and gene-by-gene interactions associated with flowering-related traits in Arabidopsis, and may also provide a valuable reference and guidance for research on epistasis in other species.
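Levene's test, named in the Method section of this abstract, checks whether trait variance differs between marker genotype groups. The sketch below computes the classical mean-centred Levene W statistic in plain Python. The function name `levene_w` and the toy values are hypothetical, and the way the authors combine this test with the compressed variance component mixed model is not reproduced here.

```python
from statistics import mean


def levene_w(groups):
    """Mean-centred Levene W statistic for a list of trait-value groups."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)

    # Absolute deviation of each observation from its own group mean.
    deviations = []
    for group in groups:
        centre = mean(group)
        deviations.append([abs(y - centre) for y in group])

    group_means = [mean(zs) for zs in deviations]
    grand_mean = mean([z for zs in deviations for z in zs])

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zs) * (m - grand_mean) ** 2
                  for zs, m in zip(deviations, group_means))
    within = sum((z - m) ** 2
                 for zs, m in zip(deviations, group_means) for z in zs)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical flowering-time values for two genotype groups at one marker:
# similar means, clearly different spread.
toy_w = levene_w([[10, 10.2, 9.8, 10.1, 9.9], [5, 15, 8, 12, 10]])
```

Under the null hypothesis of equal variances, W is usually compared against an F distribution with k - 1 and N - k degrees of freedom; that lookup is left out to keep the sketch dependency-free.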
Affiliation(s)
- Le Han
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Bolin Shen
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Xinyi Wu
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Jin Zhang
  - College of Science, Nanjing Agricultural University, Nanjing, China
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China
- Yang-Jun Wen
  - College of Science, Nanjing Agricultural University, Nanjing, China
  - State Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China
3
Prieto-Fernández A, Sánchez-Barroso G, González-Domínguez J, García-Sanz-Calcedo J. Interaction between maintenance variables of medical ultrasound scanners through multifactor dimensionality reduction. Expert Rev Med Devices 2023; 20:851-864. [PMID: 37522639 DOI: 10.1080/17434440.2023.2243208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 06/14/2023] [Accepted: 06/22/2023] [Indexed: 08/01/2023]
Abstract
BACKGROUND Proper maintenance of electro-medical devices is crucial for the quality of care to patients and the economic performance of healthcare organizations. This research aims to identify the interaction between ultrasound scanner (US) maintenance variables as a function of maintenance indicators: US in service or decommissioned, excessive number of failures, and failure rate. Knowing those interactions, specific maintenance measures will be developed to improve the reliability of the US. RESEARCH DESIGN AND METHODS The Multifactor Dimensionality Reduction (MDR) method was employed to analyze data from 222 US and their four-year maintenance history. Models were developed based on the variables with the greatest influence on maintenance indicators, where US were classified according to the associated risk. RESULTS US with more than one major failure or at least one major component replacement had up to 496.4% more failures than the average. Failure rate increased by up to 188.7% over the average for those US with more than three moderate failures, three replacements, or both. CONCLUSIONS This study identifies and quantifies the causes of risk to establish a specific maintenance plan for US. It helps to better understand the degradation of US to optimize their operation and maintenance.
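The core MDR step applied in studies like this one can be sketched as follows. The example assumes the standard binary-outcome formulation: each combination of two categorical variables is labelled high risk when its ratio of positive to negative outcomes exceeds the overall ratio, and the resulting one-dimensional classifier is scored by balanced accuracy. The variable names (for instance, coding a decommissioned scanner as outcome 1) and the toy data are hypothetical; the paper's actual variables, thresholds, and validation scheme are not reproduced.

```python
from collections import Counter


def mdr_balanced_accuracy(factor_a, factor_b, outcome):
    """Label two-factor cells as high/low risk and score the labelling."""
    positives, negatives = Counter(), Counter()
    for a, b, y in zip(factor_a, factor_b, outcome):
        (positives if y == 1 else negatives)[(a, b)] += 1

    n_pos = sum(positives.values())
    n_neg = sum(negatives.values())
    overall_ratio = n_pos / n_neg  # assumes at least one negative outcome

    # A cell is high risk when its positive:negative ratio beats the overall one.
    all_cells = set(positives) | set(negatives)
    high_risk = {cell for cell in all_cells
                 if positives[cell] > overall_ratio * negatives[cell]}

    # Balanced accuracy of "predict positive if the cell is high risk".
    true_pos = sum(positives[cell] for cell in high_risk)
    true_neg = sum(negatives[cell] for cell in all_cells - high_risk)
    return high_risk, 0.5 * (true_pos / n_pos + true_neg / n_neg)


# Hypothetical toy data: two binary maintenance variables and an outcome
# (1 = decommissioned, 0 = in service) that depends on their combination.
toy_major_failure = [0, 0, 1, 1, 0, 0, 1, 1]
toy_replacement = [0, 1, 0, 1, 0, 1, 0, 1]
toy_outcome = [0, 1, 1, 0, 0, 1, 1, 0]
toy_cells, toy_accuracy = mdr_balanced_accuracy(
    toy_major_failure, toy_replacement, toy_outcome)
```

Neither variable predicts the toy outcome on its own, but their combination does, which is the kind of interaction MDR is designed to surface.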
Affiliation(s)
- Gonzalo Sánchez-Barroso
  - Engineering Projects Area, School of Industrial Engineering, University of Extremadura, Badajoz, Spain
- Jaime González-Domínguez
  - Engineering Projects Area, School of Industrial Engineering, University of Extremadura, Badajoz, Spain
- Justo García-Sanz-Calcedo
  - Engineering Projects Area, School of Industrial Engineering, University of Extremadura, Badajoz, Spain
4
Yang CH, Huang HC, Hou MF, Chuang LY, Lin YD. Fuzzy-Based Multiobjective Multifactor Dimensionality Reduction for Epistasis Analysis. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:378-387. [PMID: 35061588 DOI: 10.1109/tcbb.2022.3144303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Epistasis detection is vital for understanding disease susceptibility in genetics. Multiobjective multifactor dimensionality reduction (MOMDR) was previously proposed to detect epistasis. MOMDR uses binary classification to distinguish the high-risk (H) and low-risk (L) groups and thereby reduce multifactor dimensionality. However, the binary classification does not reflect the uncertainty of the H and L classification. In this study, we proposed an empirical fuzzy MOMDR (EFMOMDR) to address the limitations of binary classification using the degree of membership through an empirical fuzzy approach. The EFMOMDR can simultaneously consider two incorporated fuzzy-based measures, correct classification rate and likelihood rate, and does not require parameter tuning. Simulation studies revealed that EFMOMDR achieved 7.14% higher detection success rates than MOMDR, indicating that the limitations of binary classification in MOMDR were successfully mitigated by the empirical fuzzy approach. Moreover, EFMOMDR was used to analyze coronary artery disease in the Wellcome Trust Case Control Consortium dataset.
5
Ott J, Park T. Overview of frequent pattern mining. Genomics Inform 2022; 20:e39. [PMID: 36617647 PMCID: PMC9847378 DOI: 10.5808/gi.22074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 12/22/2022] [Indexed: 12/31/2022] Open
Abstract
Various methods of frequent pattern mining have been applied to genetic problems, specifically, to the combined association of two genotypes (a genotype pattern, or diplotype) at different DNA variants with disease. These methods have the ability to come up with a selection of genotype patterns that are more common in affected than unaffected individuals, and the assessment of statistical significance for these selected patterns poses some unique problems, which are briefly outlined here.
Affiliation(s)
- Jurg Ott
  - Laboratory of Statistical Genetics, Rockefeller University, New York, NY 10065, USA
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
6
The Enigmatic Etiology of Oculo-Auriculo-Vertebral Spectrum (OAVS): An Exploratory Gene Variant Interaction Approach in Candidate Genes. Life (Basel) 2022; 12:life12111723. [PMID: 36362878 PMCID: PMC9693117 DOI: 10.3390/life12111723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 10/12/2022] [Accepted: 10/24/2022] [Indexed: 11/17/2022] Open
Abstract
The clinical diagnosis of oculo-auriculo-vertebral spectrum (OAVS) is established when microtia is present in association with hemifacial hypoplasia (HH) and/or ocular, vertebral, and/or renal malformations. Genetic and non-genetic factors have been associated with microtia/OAVS. Although the etiology remains unknown in most patients, some cases may have an autosomal dominant, autosomal recessive, or multifactorial inheritance. Among the possible genetic factors, gene-gene interactions may play important roles in the etiology of complex diseases, but the literature lacks related reports in OAVS patients. Therefore, we performed a gene-variant interaction analysis within five microtia/OAVS candidate genes (HOXA2, TCOF1, SALL1, EYA1 and TBX1) in 49 unrelated Mexican patients with OAVS (25 familial and 24 sporadic cases). A statistically significant intergenic interaction (p-value < 0.001) was identified between variants p.(Pro1099Arg) TCOF1 (rs1136103) and p.(Leu858=) SALL1 (rs1965024). This intergenic interaction may suggest that the products of these genes could participate in pathways related to craniofacial alterations, such as the retinoic acid (RA) pathway. The absence of clearly pathogenic variants in any of the analyzed genes does not support a monogenic etiology for microtia/OAVS involving these genes in our patients. Our findings suggest that, in addition to high-throughput genomic approaches, future gene-gene interaction analyses could contribute to improving our understanding of the etiology of microtia/OAVS.
7
Yee J, Park T, Park M. Identification of the associations between genes and quantitative traits using entropy-based kernel density estimation. Genomics Inform 2022; 20:e17. [PMID: 35794697 PMCID: PMC9299569 DOI: 10.5808/gi.22033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Accepted: 06/15/2022] [Indexed: 11/20/2022] Open
Abstract
Genetic associations have been quantified using a number of statistical measures. Entropy-based mutual information may be one of the more direct ways of estimating the association, in the sense that it does not depend on the parametrization. For this purpose, both the entropy and conditional entropy of the phenotype distribution should be obtained. Quantitative traits, however, do not usually allow an exact evaluation of entropy. The estimation of entropy needs a probability density function, which can be approximated by kernel density estimation. We have investigated the proper sequence of procedures for combining the kernel density estimation and entropy estimation with a probability density function in order to calculate mutual information. Genotypes and their interactions were constructed to set the conditions for conditional entropy. Extensive simulation data created using three types of generating functions were analyzed using two different kernels as well as two types of multifactor dimensionality reduction and another probability density approximation method called m-spacing. The statistical power in terms of correct detection rates was compared. Using kernels was found to be most useful when the trait distributions were more complex than simple normal or gamma distributions. A full-scale genomic dataset was explored to identify associations using the 2-h oral glucose tolerance test results and γ-glutamyl transpeptidase levels as phenotypes. Clearly distinguishable single-nucleotide polymorphisms (SNPs) and interacting SNP pairs associated with these phenotypes were found and listed with empirical p-values.
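The quantity described in this abstract, mutual information between a genotype and a quantitative trait, can be illustrated with a small Python sketch. It estimates the trait entropy and the genotype-conditional entropy with a Gaussian kernel density and a resubstitution average, then takes their difference. The function names, the normal-reference bandwidth shared across genotype groups, and the toy data are assumptions made for this example; the paper examines which ordering of these procedures is appropriate, and this sketch does not claim to match its choice.

```python
from math import exp, log, pi, sqrt
from statistics import stdev


def kde_entropy(values, bandwidth):
    """Resubstitution entropy estimate under a Gaussian kernel density."""
    n = len(values)
    scale = 1.0 / (n * bandwidth * sqrt(2.0 * pi))
    log_density_sum = 0.0
    for x in values:
        # Kernel density estimate evaluated at the sample point x.
        density = scale * sum(exp(-0.5 * ((x - v) / bandwidth) ** 2)
                              for v in values)
        log_density_sum += log(density)
    return -log_density_sum / n


def mutual_information(genotypes, trait):
    """I(G; Y) = H(Y) - sum over g of p(g) * H(Y | G = g)."""
    n = len(trait)
    # Normal-reference rule-of-thumb bandwidth, reused for every group
    # (an illustrative simplification).
    bandwidth = 1.06 * stdev(trait) * n ** (-0.2)

    groups = {}
    for g, y in zip(genotypes, trait):
        groups.setdefault(g, []).append(y)

    conditional = sum(len(ys) / n * kde_entropy(ys, bandwidth)
                      for ys in groups.values())
    return kde_entropy(trait, bandwidth) - conditional


# Hypothetical toy trait with two clusters.
toy_trait = [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8]
mi_informative = mutual_information([0] * 5 + [1] * 5, toy_trait)
mi_uninformative = mutual_information([0, 1] * 5, toy_trait)
```

A genotype that separates the two trait clusters yields a clearly larger estimate than one that is unrelated to the trait; for interacting SNP pairs, the conditioning variable would be the combined two-locus genotype.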
Affiliation(s)
- Jaeyong Yee
  - Department of Physiology and Biophysics, Eulji University, Daejeon 34824, Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
- Mira Park
  - Department of Preventive Medicine, Eulji University, Daejeon 34824, Korea
8
Fisch GS. Associating complex traits with genetic variants: polygenic risk scores, pleiotropy and endophenotypes. Genetica 2021; 150:183-197. [PMID: 34677750 DOI: 10.1007/s10709-021-00138-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2021] [Accepted: 10/07/2021] [Indexed: 11/29/2022]
Abstract
Genotype-phenotype causal modeling has evolved significantly since Johannsen's and Wright's original designs were published. The development of genomewide assays to interrogate and detect possible causal variants associated with complex traits has expanded the scope of genotype-phenotype research considerably. Clusters of causal variants discovered by genomewide assays and associated with complex traits have been used to develop polygenic risk scores to predict clinical diagnoses of multidimensional human disorders. However, genomewide investigations have met with many challenges to their research designs and statistical complexities which have hindered the reliability and validity of their predictions. Findings linked to differences in heritability estimates between causal clusters and complex traits among unrelated individuals remain a research area of some controversy. Causal models developed from case-control studies as opposed to experiments, as well as other issues concerning the genotype-phenotype causal model and the extent to which various forms of pleiotropy and the concept of the endophenotype add to its complexity, will be reviewed.
Affiliation(s)
- Gene S Fisch
  - Paul H. Chook Dept. of CIS & Statistics, CUNY/Baruch College, New York, NY, USA
9
Machine Learning to Identify Interaction of Single-Nucleotide Polymorphisms as a Risk Factor for Chronic Drug-Induced Liver Injury. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021; 18:ijerph182010603. [PMID: 34682349 PMCID: PMC8535865 DOI: 10.3390/ijerph182010603] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/09/2021] [Revised: 09/28/2021] [Accepted: 10/05/2021] [Indexed: 12/28/2022]
Abstract
Drug-induced liver injury (DILI) is a major cause of drug development failure and drug withdrawal from the market after approval. The identification of human risk factors associated with susceptibility to DILI is of paramount importance. Increasing evidence suggests that genetic variants may lead to inter-individual differences in drug response; however, individual single-nucleotide polymorphisms (SNPs) usually have limited power to predict human phenotypes such as DILI. In this study, we aim to identify appropriate statistical methods to investigate gene-gene and/or gene-environment interactions that impact DILI susceptibility. Three machine learning approaches, including Multivariate Adaptive Regression Splines (MARS), Multifactor Dimensionality Reduction (MDR), and logistic regression, were used. The simulation study suggested that all three methods were robust and could identify the known SNP-SNP interaction when up to 4% of genotypes were randomly permutated. When applied to a real-life DILI chronicity dataset, both MARS and MDR, but not logistic regression, identified combined genetic variants having better associations with DILI chronicity in comparison to the use of individual SNPs. Furthermore, a simple decision tree model using the SNPs identified by MARS and MDR was developed to predict DILI chronicity, with fair performance. Our study suggests that machine learning approaches may help identify gene-gene interactions as potential risk factors for better assessing complicated diseases such as DILI chronicity.
10
Park M, Jeong HB, Lee JH, Park T. Spatial rank-based multifactor dimensionality reduction to detect gene-gene interactions for multivariate phenotypes. BMC Bioinformatics 2021; 22:480. [PMID: 34607566 PMCID: PMC8489107 DOI: 10.1186/s12859-021-04395-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/19/2020] [Accepted: 09/17/2021] [Indexed: 01/11/2023] Open
Abstract
Background Identifying interaction effects between genes is one of the main tasks of genome-wide association studies aiming to shed light on the biological mechanisms underlying complex diseases. Multifactor dimensionality reduction (MDR) is a popular approach for detecting gene–gene interactions that has been extended in various forms to handle binary and continuous phenotypes. However, only a few multivariate MDR methods are available for multiple related phenotypes. Current approaches use Hotelling’s T² statistic to evaluate interaction models, but it is well known that Hotelling’s T² statistic is highly sensitive to heavily skewed distributions and outliers. Results We propose a robust approach based on nonparametric statistics such as spatial signs and ranks. The new multivariate rank-based MDR (MR-MDR) is mainly suitable for analyzing multiple continuous phenotypes and is less sensitive to skewed distributions and outliers. MR-MDR utilizes fuzzy k-means clustering and classifies multi-locus genotypes into two groups. Then, MR-MDR calculates a spatial rank-sum statistic as an evaluation measure and selects the best interaction model with the largest statistic. Our novel idea lies in adopting nonparametric statistics as an evaluation measure for robust inference. We adopt tenfold cross-validation to avoid overfitting. Intensive simulation studies were conducted to compare the performance of MR-MDR with current methods. Application of MR-MDR to a real dataset from a Korean genome-wide association study demonstrated that it successfully identified genetic interactions associated with four phenotypes related to kidney function. The R code for conducting MR-MDR is available at https://github.com/statpark/MR-MDR. Conclusions Intensive simulation studies comparing MR-MDR with several current methods showed that the performance of MR-MDR was outstanding for skewed distributions. Additionally, for symmetric distributions, MR-MDR showed comparable power. Therefore, we conclude that MR-MDR is a useful multivariate non-parametric approach that can be used regardless of the phenotype distribution, the correlations between phenotypes, and sample size.
Affiliation(s)
- Mira Park
  - Department of Preventive Medicine, Eulji University, Daejeon, 34824, Republic of Korea
- Hoe-Bin Jeong
  - Department of Statistics, Korea University, Seoul, 02841, Republic of Korea
- Jong-Hyun Lee
  - Department of Statistics, Korea University, Seoul, 02841, Republic of Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul, 08826, Republic of Korea
11
Rana S, Sultana A, Bhatti AA. Effect of interaction between obesity-promoting genetic variants and behavioral factors on the risk of obese phenotypes. Mol Genet Genomics 2021; 296:919-938. [PMID: 33966103 DOI: 10.1007/s00438-021-01793-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/30/2020] [Accepted: 04/22/2021] [Indexed: 01/28/2023]
Abstract
Studies investigating gene-gene and gene-environment (or gene-behavior) interactions provide valuable insight into the pathomechanisms underlying obese phenotypes. The Pakistani population, due to its unique characteristics, offers numerous advantages for conducting such studies. In this view, the current study was undertaken to examine the effects of gene-gene and gene-environment/behavior interactions on the risk of obesity in a sample of the Pakistani population. A total of 578 adult participants, including 290 overweight/obese cases and 288 normal-weight controls, were involved. Five key obesity-associated genetic variants, namely MC4R rs17782313, BDNF rs6265, FTO rs1421085, TMEM18 rs7561317, and NEGR1 rs2815752, were genotyped using TaqMan allelic discrimination assays. Data on behavioral factors, such as eating pattern, diet consciousness, the tendency toward fat-dense food (TFDF), sleep duration, sleep-wake cycle (SWC), shift work (SW), and physical activity levels, were collected via a questionnaire. Gene-gene and gene-behavior interactions were analyzed by multifactor dimensionality reduction and linear regression, respectively. In our study, only TMEM18 rs7561317 was found to be significantly associated with anthropometric traits, and no significant effect of gene-gene interactions on obesity-related phenotypes was observed. However, the genetic variants were found to interact with the behavioral factors to significantly influence various obesity-related anthropometric traits including BMI, waist circumference, hip circumference, waist-to-hip ratio, waist-to-height ratio, and percentage of body fat. In conclusion, the interaction between genetic architecture and behavior/environment determines the outcome of obesity-related anthropometric phenotypes. Thus, gene-environment/behavior interaction studies should be promoted to explore the risk of complex and multifactorial disorders, such as obesity.
Affiliation(s)
- Sobia Rana
  - Molecular Biology and Human Genetics Laboratory, Dr. Panjwani Center for Molecular Medicine and Drug Research (PCMD), International Center for Chemical and Biological Sciences (ICCBS), University of Karachi, Karachi, 75270, Pakistan
- Ayesha Sultana
  - Molecular Biology and Human Genetics Laboratory, Dr. Panjwani Center for Molecular Medicine and Drug Research (PCMD), International Center for Chemical and Biological Sciences (ICCBS), University of Karachi, Karachi, 75270, Pakistan
- Adil Anwar Bhatti
  - Molecular Biology and Human Genetics Laboratory, Dr. Panjwani Center for Molecular Medicine and Drug Research (PCMD), International Center for Chemical and Biological Sciences (ICCBS), University of Karachi, Karachi, 75270, Pakistan
12
Park M, Kim SA, Shin J, Joo EJ. Investigation of gene-gene interactions of clock genes for chronotype in a healthy Korean population. Genomics Inform 2021; 18:e38. [PMID: 33412754 PMCID: PMC7808872 DOI: 10.5808/gi.2020.18.4.e38] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2020] [Accepted: 11/07/2020] [Indexed: 11/20/2022] Open
Abstract
Chronotype is an important moderator of psychiatric illnesses, which seems to be controlled in some part by genetic factors. Clock genes are the most relevant genes for chronotype. In addition to the roles of individual genes, gene-gene interactions of clock genes substantially contribute to chronotype. We investigated genetic associations and gene-gene interactions of the clock genes BHLHB2, CLOCK, CSNK1E, NR1D1, PER1, PER2, PER3, and TIMELESS for chronotype in 1293 healthy Korean individuals. Regression analysis was conducted to find associations between single nucleotide polymorphisms (SNPs) and chronotype. For gene-gene interaction analyses, the quantitative multifactor dimensionality reduction (QMDR) method, a nonparametric model-free method for quantitative phenotypes, was used. No individual SNP or haplotype showed a significant association with chronotype in either regression analysis or the single-locus model of QMDR. QMDR analysis identified NR1D1 rs2314339 and TIMELESS rs4630333 as the best SNP pair among two-locus interaction models associated with chronotype (cross-validation consistency [CVC] = 8/10, p = 0.041). For the three-locus interaction model, the SNP combination of NR1D1 rs2314339, TIMELESS rs4630333, and PER3 rs228669 showed the best results (CVC = 4/10, p < 0.001). However, because the mean differences between genotype combinations were minor, the clinical roles of clock gene interactions are unlikely to be critical.
Affiliation(s)
- Mira Park
  - Department of Preventive Medicine, Eulji University School of Medicine, Daejeon 34824, Korea
- Soon Ae Kim
  - Department of Pharmacology, Eulji University School of Medicine, Daejeon 34824, Korea
- Jieun Shin
  - Department of Liberal Arts, Woosuk University, Wanju 55338, Korea
- Eun-Jeong Joo
  - Department of Neuropsychiatry, Eulji University School of Medicine, Daejeon 34824, Korea; Department of Psychiatry, Nowon Eulji Medical Center, Eulji University, Seoul 01830, Korea
13
Luyapan J, Ji X, Li S, Xiao X, Zhu D, Duell EJ, Christiani DC, Schabath MB, Arnold SM, Zienolddiny S, Brunnström H, Melander O, Thornquist MD, MacKenzie TA, Amos CI, Gui J. A new efficient method to detect genetic interactions for lung cancer GWAS. BMC Med Genomics 2020; 13:162. [PMID: 33126877 PMCID: PMC7596958 DOI: 10.1186/s12920-020-00807-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/16/2019] [Accepted: 10/11/2020] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Genome-wide association studies (GWAS) have proven successful in predicting genetic risk of disease using single-locus models; however, identifying single nucleotide polymorphism (SNP) interactions at the genome-wide scale is limited due to computational and statistical challenges. We addressed the computational burden encountered when detecting SNP interactions for survival analysis, such as age of disease-onset. To confront this problem, we developed a novel algorithm, called the Efficient Survival Multifactor Dimensionality Reduction (ES-MDR) method, which used Martingale Residuals as the outcome parameter to estimate survival outcomes, and implemented the Quantitative Multifactor Dimensionality Reduction method to identify significant interactions associated with age of disease-onset. METHODS To demonstrate efficacy, we evaluated this method on two simulation data sets to estimate the type I error rate and power. Simulations showed that ES-MDR identified interactions using less computational workload and allowed for adjustment of covariates. We applied ES-MDR on the OncoArray-TRICL Consortium data with 14,935 cases and 12,787 controls for lung cancer (SNPs = 108,254) to search over all two-way interactions to identify genetic interactions associated with lung cancer age-of-onset. We tested the best model in an independent data set from the OncoArray-TRICL data. RESULTS Our experiment on the OncoArray-TRICL data identified many one-way and two-way models with a single-base deletion in the noncoding region of BRCA1 (HR 1.24, P = 3.15 × 10⁻¹⁵), as the top marker to predict age of lung cancer onset. CONCLUSIONS From the results of our extensive simulations and analysis of a large GWAS study, we demonstrated that our method is an efficient algorithm that identified genetic interactions to include in our models to predict survival outcomes.
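To illustrate the outcome transformation described in this abstract, the sketch below computes martingale residuals from a null model with no covariates, using the Nelson-Aalen estimate of the cumulative hazard: each residual is the event indicator minus the estimated cumulative hazard at that individual's time. The published method allows covariate adjustment, which this simplified version omits; the function name `martingale_residuals` and the toy data are assumptions made for the example.

```python
def martingale_residuals(times, events):
    """Event indicator minus Nelson-Aalen cumulative hazard at each time."""
    # Hazard increment d_j / n_j at every distinct event time.
    increments = []
    for t in sorted({ti for ti, ei in zip(times, events) if ei == 1}):
        n_events = sum(1 for ti, ei in zip(times, events)
                       if ti == t and ei == 1)
        n_at_risk = sum(1 for ti in times if ti >= t)
        increments.append((t, n_events / n_at_risk))

    residuals = []
    for ti, ei in zip(times, events):
        cumulative_hazard = sum(step for t, step in increments if t <= ti)
        residuals.append(ei - cumulative_hazard)
    return residuals


# Hypothetical toy data: age at onset (event = 1) or age at censoring (0).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_residuals = martingale_residuals(toy_times, toy_events)
```

The residuals sum to zero and can then be treated as a quantitative trait, so that a QMDR-style search over SNP pairs avoids refitting a survival model for every candidate interaction.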
Affiliation(s)
- Jennifer Luyapan
  - Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
- Xuemei Ji
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
- Siting Li
  - Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
- Xiangjun Xiao
  - Institute for Clinical and Translational Research, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Dakai Zhu
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
  - Institute for Clinical and Translational Research, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Eric J Duell
  - Unit of Nutrition and Cancer, Catalan Institute of Oncology (ICO-IDIBELL), 08908, Barcelona, Spain
- David C Christiani
  - Department of Environmental Health, Harvard School of Public Health, Boston, MA, 02115, USA
  - Department of Medicine, Massachusetts General Hospital, Boston, MA, 02115, USA
- Matthew B Schabath
  - Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Susanne M Arnold
  - Markey Cancer Center, University of Kentucky, First Floor, 800 Rose Street, Lexington, KY, 40508, USA
- Shanbeh Zienolddiny
  - National Institute of Occupational Health, 0033 Gydas vei 8, 0033, Oslo, Norway
- Hans Brunnström
  - Laboratory Medicine Region Skåne, Department of Clinical Sciences Lund, Pathology, Lund University, Lund, Sweden
- Olle Melander
  - Department of Clinical Sciences, Lund University, Malmö, Sweden
- Mark D Thornquist
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Todd A MacKenzie
  - Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
- Christopher I Amos
  - Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA
  - Institute for Clinical and Translational Research, Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Jiang Gui
- Quantitative Biomedical Science Program, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA.
- Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, NH, 03756, USA.
| |
|
14
|
Wen J, Ford CT, Janies D, Shi X. A parallelized strategy for epistasis analysis based on Empirical Bayesian Elastic Net models. Bioinformatics 2020; 36:3803-3810. [PMID: 32227194 DOI: 10.1093/bioinformatics/btaa216] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/19/2019] [Revised: 03/05/2020] [Accepted: 03/26/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION Epistasis reflects the distortion on a particular trait or phenotype resulting from the combinatorial effect of two or more genes or genetic variants. Epistasis is an important genetic foundation underlying quantitative traits in many organisms as well as in complex human diseases. However, there are two major barriers in identifying epistasis using large genomic datasets. One is that epistasis analysis will induce over-fitting of an over-saturated model with the high-dimensionality of a genomic dataset. Therefore, the problem of identifying epistasis demands efficient statistical methods. The second barrier comes from the intensive computing time for epistasis analysis, even when the appropriate model and data are specified. RESULTS In this study, we combine statistical techniques and computational techniques to scale up epistasis analysis using Empirical Bayesian Elastic Net (EBEN) models. Specifically, we first apply a matrix manipulation strategy for pre-computing the correlation matrix and pre-filter to narrow down the search space for epistasis analysis. We then develop a parallelized approach to further accelerate the modeling process. Our experiments on synthetic and empirical genomic data demonstrate that our parallelized methods offer tens of fold speed up in comparison with the classical EBEN method which runs in a sequential manner. We applied our parallelized approach to a yeast dataset, and we were able to identify both main and epistatic effects of genetic variants associated with traits such as fitness. AVAILABILITY AND IMPLEMENTATION The software is available at github.com/shilab/parEBEN.
Affiliation(s)
- Jia Wen
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
| | - Colby T Ford
- Department of Bioinformatics and Genomics, College of Computing and Informatics; School of Data Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Daniel Janies
- Department of Bioinformatics and Genomics, College of Computing and Informatics
| | - Xinghua Shi
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA
| |
|
15
|
Gene-Gene Interaction Analysis for the Survival Phenotype Based on the Kaplan-Meier Median Estimate. BIOMED RESEARCH INTERNATIONAL 2020; 2020:5282345. [PMID: 32461998 PMCID: PMC7232685 DOI: 10.1155/2020/5282345] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/12/2019] [Revised: 03/02/2020] [Accepted: 04/07/2020] [Indexed: 12/25/2022]
Abstract
In this study, we propose a simple and computationally efficient method based on the multifactor dimensional reduction algorithm to identify gene-gene interactions associated with the survival phenotype. The proposed method, referred to as KM-MDR, uses the Kaplan-Meier median survival time as a classifier. The KM-MDR method classifies multilocus genotypes into a binary attribute for high- or low-risk groups using median survival time and replaces balanced accuracy with log-rank test statistics as a score to determine the best model. Through intensive simulation studies, we compared the power of KM-MDR with that of Surv-MDR, Cox-MDR, and AFT-MDR. It was found that KM-MDR has a similar power to that of Surv-MDR, with less computing time, and has comparable power to that of Cox-MDR and AFT-MDR, even when there is a covariate effect. Furthermore, we apply KM-MDR to a real dataset of ovarian cancer patients from The Cancer Genome Atlas (TCGA).
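The classification step described in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the published KM-MDR code. The helper names are hypothetical, the rule that a genotype cell is high-risk when its Kaplan-Meier median is shorter than the overall median is an assumed reading of the abstract, and the log-rank scoring step is left out.

```python
import numpy as np

def km_median(time, event):
    """Kaplan-Meier median survival time: the first event time at which the
    estimated survival drops to 0.5 or below; inf if that never happens."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    surv = 1.0
    for t in np.unique(time[event == 1]):         # sorted distinct event times
        n_at_risk = (time >= t).sum()
        n_events = ((time == t) & (event == 1)).sum()
        surv *= 1.0 - n_events / n_at_risk
        if surv <= 0.5 + 1e-12:                   # tolerance for rounding error
            return float(t)
    return float("inf")

def km_mdr_labels(cells, time, event):
    """Map each genotype cell to True (high risk) when its KM median is
    shorter than the overall KM median (assumed direction of the rule)."""
    cells = np.asarray(cells)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    overall = km_median(time, event)
    return {
        c: km_median(time[cells == c], event[cells == c]) < overall
        for c in np.unique(cells).tolist()
    }

# Toy data (invented): cell "A" fails early, cell "B" fails late.
cells = ["A", "A", "A", "B", "B", "B", "B"]
time = [2, 3, 4, 8, 9, 10, 11]
event = [1, 1, 1, 1, 0, 1, 1]
overall_median = km_median(time, event)
labels = km_mdr_labels(cells, time, event)
```

In the toy data the overall median is 8, cell "A" has median 3 and cell "B" has median 10, so only "A" is labelled high risk.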
|
16
|
Yang CH, Chuang LY, Lin YD. An improved fuzzy set-based multifactor dimensionality reduction for detecting epistasis. Artif Intell Med 2020; 102:101768. [PMID: 31980105 DOI: 10.1016/j.artmed.2019.101768] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/05/2018] [Revised: 10/18/2019] [Accepted: 11/19/2019] [Indexed: 01/07/2023]
Abstract
OBJECTIVE Epistasis identification is critical for determining susceptibility to human genetic diseases. The rapid development of technology has enabled scalability to make multifactor dimensionality reduction (MDR) measurements an effective calculation tool that achieves superior detection. However, the classification of high-risk (H) or low-risk (L) groups in MDR operations calls for extensive research. METHODS AND MATERIAL In this study, an improved fuzzy sigmoid (FS) method using the membership degree in MDR (FSMDR) was proposed for solving the limitations of binary classification. The FS method combined with MDR measurements yielded an improved ability to distinguish similar frequencies of potential multifactor genotypes. RESULTS We compared our results with other MDR-based methods and FSMDR achieved superior detection rates on simulated data sets. The results indicated that the fuzzy classifications can provide insight into the uncertainty of H/L classification in MDR operations. CONCLUSION FSMDR successfully detected significant epistasis of coronary artery disease in the Wellcome Trust Case Control Consortium data set.
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, No. 415, Jiangong Rd., Sanmin Dist., Kaohsiung City, 80778, Taiwan; Ph. D. Program in Biomedical Engineering, Kaohsiung Medical University, No. 100, Shih-Chuan 1st Rd., Kaohsiung, 80708, Taiwan.
| | - Li-Yeh Chuang
- Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, No.1, Sec. 1, Syuecheng Rd., Dashu District, Kaohsiung, 84001, Taiwan.
| | - Yu-Da Lin
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, No. 415, Jiangong Rd., Sanmin Dist., Kaohsiung City, 80778, Taiwan.
| |
|
17
|
Shen L, Thompson PM. Brain Imaging Genomics: Integrated Analysis and Machine Learning. PROCEEDINGS OF THE IEEE. INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS 2020; 108:125-162. [PMID: 31902950 PMCID: PMC6941751 DOI: 10.1109/jproc.2019.2947272] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Indexed: 06/01/2023]
Abstract
Brain imaging genomics is an emerging data science field, where integrated analysis of brain imaging and genomics data, often combined with other biomarker, clinical and environmental data, is performed to gain new insights into the phenotypic, genetic and molecular characteristics of the brain as well as their impact on normal and disordered brain function and behavior. It has enormous potential to contribute significantly to biomedical discoveries in brain science. Given the increasingly important role of statistical and machine learning in biomedicine and rapidly growing literature in brain imaging genomics, we provide an up-to-date and comprehensive review of statistical and machine learning methods for brain imaging genomics, as well as a practical discussion on method selection for various biomedical applications.
Affiliation(s)
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
| | - Paul M Thompson
- Imaging Genetics Center, Mark & Mary Stevens Institute for Neuroimaging & Informatics, Keck School of Medicine, University of Southern California, Los Angeles, CA 90232, USA
| |
|
18
|
Maitra S, Chatterjee M, Sinha S, Mukhopadhyay K. Dopaminergic gene analysis indicates influence of inattention but not IQ in executive dysfunction of Indian ADHD probands. J Neurogenet 2019; 33:209-217. [PMID: 31663399 DOI: 10.1080/01677063.2019.1672679] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
Abstract
Organizational inefficiency and inattention are speculated to be the reasons for executive deficit (ED) in ADHD probands. Even with average IQ, probands often perform poorly due to higher inattention. Pharmacotherapy, cognitive behavioural therapy, and counselling provide only symptomatic relief. Several candidate genes showed involvement with ADHD; the most consistent are dopamine receptor 4 (DRD4) and solute carrier family 6 member 3 (SLC6A3). We analyzed the association of rarely investigated DRD4 and SLC6A3 variants with ADHD core traits in Indo-Caucasoid probands. ED, inattention, organizational efficiency, and IQ were measured by the Barkley Deficit in Executive Functioning-Child & Adolescent scale, DSM-IV-TR, Conners' Parent Rating Scale-revised, and WISC, respectively. Target sites were analyzed by PCR, RFLP, and/or Sanger sequencing of genomic DNA. DRD4 variants mostly affected inattention, while SLC6A3 variants showed association with IQ. A few DRD4 and SLC6A3 variants showed dichotomous association with IQ and inattention. DRD4 Exon3 VNTR >4R showed a negative impact on all traits except IQ. Inattention showed correlation with attention span, organizational efficiency, and ED, while IQ failed to do so. We infer that IQ and attention could be differentially regulated by dopaminergic gene variants affecting functional efficiency in ADHD, and the two traits should be considered together for providing better rehabilitation.
Affiliation(s)
- Subhamita Maitra
- Manovikas Biomedical Research and Diagnostic Centre, Kolkata, India; Mahidol University, Institute of Molecular Biosciences, Thailand
| | | | - Swagata Sinha
- Manovikas Biomedical Research and Diagnostic Centre, Kolkata, India
| | | |
|
19
|
Liu Y, Huang J, Urbanowicz RJ, Chen K, Manduchi E, Greene CS, Moore JH, Scheet P, Chen Y. Embracing study heterogeneity for finding genetic interactions in large-scale research consortia. Genet Epidemiol 2019; 44:52-66. [PMID: 31583758 DOI: 10.1002/gepi.22262] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/13/2018] [Revised: 08/02/2019] [Accepted: 08/09/2019] [Indexed: 11/12/2022]
Abstract
Genetic interactions have been recognized as a potentially important contributor to the heritability of complex diseases. Nevertheless, due to small effect sizes and stringent multiple-testing correction, identifying genetic interactions in complex diseases is particularly challenging. To address the above challenges, many genomic research initiatives collaborate to form large-scale consortia and develop open access to enable sharing of genome-wide association study (GWAS) data. Despite the perceived benefits of data sharing from large consortia, a number of practical issues have arisen, such as privacy concerns on individual genomic information and heterogeneous data sources from distributed GWAS databases. In the context of large consortia, we demonstrate that the heterogeneously appearing marginal effects over distributed GWAS databases can offer new insights into genetic interactions for which conventional methods have had limited success. In this paper, we develop a novel two-stage testing procedure, named phylogenY-based effect-size tests for interactions using first 2 moments (YETI2), to detect genetic interactions through both pooled marginal effects, in terms of averaging site-specific marginal effects, and heterogeneity in marginal effects across sites, using a meta-analytic framework. YETI2 can not only be applied to large consortia without shared personal information but also can be used to leverage underlying heterogeneity in marginal effects to prioritize potential genetic interactions. We investigate the performance of YETI2 through simulation studies and apply YETI2 to bladder cancer data from dbGaP.
Affiliation(s)
- Yulun Liu
- Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, Texas
| | - Jing Huang
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Ryan J Urbanowicz
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Kun Chen
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - Elisabetta Manduchi
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Casey S Greene
- Department of Pharmacology, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Jason H Moore
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Paul Scheet
- Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Yong Chen
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
| |
|
20
|
Yang CH, Chuang LY, Lin YD. Multiobjective multifactor dimensionality reduction to detect SNP-SNP interactions. Bioinformatics 2019; 34:2228-2236. [PMID: 29471406 DOI: 10.1093/bioinformatics/bty076] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/04/2017] [Accepted: 02/16/2018] [Indexed: 11/12/2022]
Abstract
Motivation Single-nucleotide polymorphism (SNP)-SNP interactions (SSIs) are popular markers for understanding disease susceptibility. Multifactor dimensionality reduction (MDR) can successfully detect considerable SSIs. Currently, MDR-based methods mainly adopt a single-objective function (a single measure based on contingency tables) to detect SSIs. However, generally, a single-measure function might not yield favorable results due to potential model preferences and disease complexities. Approach This study proposes a multiobjective MDR (MOMDR) method that is based on a contingency table of MDR as an objective function. MOMDR considers the incorporated measures, including correct classification and likelihood rates, to detect SSIs and adopts set theory to predict the most favorable SSIs with cross-validation consistency. MOMDR enables simultaneously using multiple measures to determine potential SSIs. Results Three simulation studies were conducted to compare the detection success rates of MOMDR and single-objective MDR (SOMDR), revealing that MOMDR had higher detection success rates than SOMDR. Furthermore, the Wellcome Trust Case Control Consortium dataset was analyzed by MOMDR to detect SSIs associated with coronary artery disease. Availability and implementation: MOMDR is freely available at https://goo.gl/M8dpDg. Supplementary information Supplementary data are available at Bioinformatics online.
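The contingency-table basis of MDR mentioned in this abstract can be illustrated with a short Python sketch. It shows the classic MDR pooling step followed by two measures computed from the same 2x2 table. It is not the MOMDR software. The function names are hypothetical, and writing the likelihood measure as the G statistic of the table is an assumption about its exact form.

```python
import numpy as np

def mdr_table(g1, g2, case):
    """Pool two-locus genotype cells into high/low risk by comparing each
    cell's case:control ratio with the overall ratio, then return the 2x2
    table [[TP, FN], [FP, TN]] of predicted versus observed status."""
    g1, g2 = np.asarray(g1), np.asarray(g2)
    case = np.asarray(case).astype(bool)
    cell = g1 * 3 + g2                            # genotypes coded 0/1/2
    threshold = case.sum() / (~case).sum()        # overall case:control ratio
    high = np.zeros(case.shape, dtype=bool)
    for c in np.unique(cell):
        members = cell == c
        n_case = case[members].sum()
        n_ctrl = (~case[members]).sum()
        # Cross-multiplied form of n_case / n_ctrl > threshold, so a cell
        # with no controls does not divide by zero.
        high[members] = n_case > threshold * n_ctrl
    return np.array([
        [(high & case).sum(), (~high & case).sum()],
        [(high & ~case).sum(), (~high & ~case).sum()],
    ])

def correct_classification_rate(table):
    """Fraction of samples whose pooled risk label matches their status."""
    return (table[0, 0] + table[1, 1]) / table.sum()

def g_statistic(table):
    """Likelihood-ratio (G) statistic, 2 * sum(O * ln(O / E)), over non-empty cells."""
    observed = table.astype(float)
    expected = (observed.sum(axis=1, keepdims=True)
                * observed.sum(axis=0, keepdims=True) / observed.sum())
    nonzero = observed > 0
    return 2.0 * (observed[nonzero] * np.log(observed[nonzero] / expected[nonzero])).sum()

# Toy data (invented): a pure interaction with no single-locus effect.
g1 = [0, 0, 0, 0, 1, 1, 1, 1]
g2 = [0, 0, 1, 1, 0, 0, 1, 1]
case = [1, 1, 0, 0, 0, 0, 1, 1]
table = mdr_table(g1, g2, case)
ccr = correct_classification_rate(table)
g_stat = g_statistic(table)
```

Scoring the same table with more than one measure is the idea the abstract describes. How MOMDR combines the measures through set theory and cross-validation consistency is not reproduced here.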
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan; Graduate Institute of Clinical Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
| | - Li-Yeh Chuang
- Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
| | - Yu-Da Lin
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
| |
|
21
|
Lulińska-Kuklik E, Maculewicz E, Moska W, Ficek K, Kaczmarczyk M, Michałowska-Sawczyn M, Humińska-Lisowska K, Buryta M, Chycki J, Cięszczyk P, Żmijewski P, Rzeszutko A, Sawczuk M, Stastny P, Petr M, Maciejewska-Skrendo A. Are IL1B, IL6 and IL6R Gene Variants Associated with Anterior Cruciate Ligament Rupture Susceptibility? J Sports Sci Med 2019; 18:137-145. [PMID: 30787661 PMCID: PMC6370956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2018] [Accepted: 01/16/2019] [Indexed: 06/09/2023]
Abstract
Cytokines, such as interleukins, are crucial in regulating critical cell signaling pathways as well as being major contributors to inflammatory response and are upregulated during ligament and tendon injuries. The genes encoding key interleukins, such as IL1B and IL6 as well as interleukin receptor IL6R, were chosen as candidate genes for association with soft tissue injuries. The aim of the case-control study was to verify the hypothesis that sequence variants rs1143627, rs16944, rs1800795, and rs2228145 in the IL1B, IL6 and IL6R genes are associated with ACL rupture susceptibility in a Polish population. Among four analyzed SNPs, the rs1800795 IL6 gene polymorphism was found to be the only one significantly associated with ACL rupture (p = 0.010, p = 0.022, p = 0.004 for codominant, recessive and overdominant models, respectively; odds ratio = 1.74, 95% CI 1.08-2.81, sex adjusted p = 0.032 for recessive model). With reference to the other analyzed polymorphisms, we failed to show significant differences in the genotype and allele frequencies for IL6R rs2228145 as well as IL1B rs16944 and rs1143627 (analyzed alone or in haplotype combination) between the ACL rupture group and the healthy control group among Polish participants. Due to the nature of case-control studies, the results of this study need to be confirmed in independent studies with larger sample sizes.
Affiliation(s)
- Ewelina Lulińska-Kuklik
- Faculty of Tourism and Recreation, Gdansk University of Physical Education and Sport, Gdansk, Poland
| | - Ewelina Maculewicz
- Department of Applied Physiology, Military Institute of Hygiene and Epidemiology, Warsaw, Poland
| | - Waldemar Moska
- Faculty of Tourism and Recreation, Gdansk University of Physical Education and Sport, Gdansk, Poland
| | - Krzysztof Ficek
- Faculty of Physiotherapy, The Jerzy Kukuczka Academy of Physical Education in Katowice, Katowice, Poland
| | - Mariusz Kaczmarczyk
- Faculty of Tourism and Recreation, Gdansk University of Physical Education and Sport, Gdansk, Poland
| | | | - Kinga Humińska-Lisowska
- Faculty of Physical Education, Gdansk University of Physical Education and Sport, Gdansk, Poland
| | - Maciej Buryta
- Faculty of Physical Education, Gdansk University of Physical Education and Sport, Gdansk, Poland
| | - Jakub Chycki
- Faculty of Physical Education, The Jerzy Kukuczka Academy of Physical Education in Katowice, Katowice, Poland
| | - Pawel Cięszczyk
- Faculty of Physical Education, The Jerzy Kukuczka Academy of Physical Education in Katowice, Katowice, Poland
| | - Piotr Żmijewski
- Faculty of Medicine, University of Information Technology and Management in Rzeszow, Poland
| | - Agata Rzeszutko
- Faculty of Physical Education, University of Rzeszow, Rzeszow, Poland
| | - Marek Sawczuk
- Faculty of Tourism and Recreation, Gdansk University of Physical Education and Sport, Gdansk, Poland
| | - Petr Stastny
- Department of Sport Games, Charles University in Prague, Prague, Czech Republic
| | - Miroslav Petr
- Department of Sport Games, Charles University in Prague, Prague, Czech Republic
| | | |
|
22
|
Abstract
Identifying gene-gene and gene-environment interactions may help us to better describe the genetic architecture for complex traits. While advances have been made in identifying genetic variants associated with complex traits through more dense panels of genetic variants and larger sample sizes, genome-wide interaction analyses are still limited in power to detect interactions with small effect sizes, rare frequencies, and higher order interactions. This chapter outlines methods for detecting both gene-gene and gene-environment interactions through explicit tests for interactions (i.e., ones in which the interaction is tested directly) and non-explicit tests (i.e., ones in which an interaction is allowed for in the test, but the test does not address the interaction directly), as well as approaches for increasing power by reducing the search space. Issues relating to multiple test correction, replication, and the reporting of interaction results in publications are also discussed.
Affiliation(s)
- Andrew T DeWan
- Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, CT, USA.
| |
|
23
|
Lulińska-Kuklik E, Rahim M, Moska W, Maculewicz E, Kaczmarczyk M, Maciejewska-Skrendo A, Ficek K, Cieszczyk P, September AV, Sawczuk M. Are MMP3, MMP8 and TIMP2 gene variants associated with anterior cruciate ligament rupture susceptibility? J Sci Med Sport 2019; 22:753-757. [PMID: 30755371 DOI: 10.1016/j.jsams.2019.01.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/03/2018] [Revised: 01/18/2019] [Accepted: 01/22/2019] [Indexed: 10/27/2022]
Abstract
OBJECTIVES Anterior cruciate ligament rupture (ACLR) is a common and severe knee injury which typically occurs as a result of sports participation, primarily via a non-contact mechanism. A number of extrinsic and intrinsic risk factors, including genetics, have been identified thus far. Matrix metalloproteinases (MMPs) and tissue inhibitors of metalloproteases (TIMPs) play a crucial role in extracellular matrix remodeling of ligaments and therefore the genes encoding MMPs and TIMPs are plausible candidates for investigation with ACL rupture risk. DESIGN A case-control genetic association study was conducted on 229 (158 male) individuals with surgically diagnosed primary ACLR, ruptured through non-contact mechanisms and 192 (107 male) apparently healthy participants (CON) without any history of ACLR. All participants were physically active, unrelated, self-reported Caucasians. METHODS All participants were genotyped for four single nucleotide polymorphisms (SNPs): MMP3 (rs591058 C/T, rs679620 G/A), MMP8 (rs11225395 C/T), and TIMP2 (rs4789932 G/A) using standard PCR assays. Gene-gene interactions were inferred. Single-locus association analysis was conducted using the Chi-square test. SNP-SNP interaction effects were analysed using the multifactor dimensionality reduction (MDR) method. RESULTS Genotype frequencies did not significantly differ between cases and controls; however, the MMP3 rs679620 G and rs591058 C alleles were significantly overrepresented in cases compared to controls (p=0.021, OR=1.38, 95% CI: 1.05-1.81). CONCLUSIONS These results support the hypothesis that genetic variation within MMP3 contributes to inter-individual susceptibility to non-contact ACLR. However, these results need to be explored further in larger, independent sample sets.
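For readers unfamiliar with how an allelic odds ratio and its 95% confidence interval of the kind reported in this abstract are typically obtained, the following Python sketch applies the standard Woolf (log-based) interval to a 2x2 allele-count table. The function name is hypothetical and the counts are invented for illustration; they are not the study's data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 allele table.

    a: risk-allele count in cases      b: other-allele count in cases
    c: risk-allele count in controls   d: other-allele count in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Invented allele counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(a=60, b=40, c=45, d=55)
```

An interval whose lower bound lies above 1 corresponds to an allele that is overrepresented in cases at the chosen significance level. All four counts must be non-zero for the formula to apply.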
Affiliation(s)
- Ewelina Lulińska-Kuklik
- Faculty of Tourism and Recreation, Gdansk University of Physical Education and Sport, Poland
| | - Masouda Rahim
- Division of Exercise Science and Sports Medicine, Department of Human Biology, Faculty of Health Sciences, University of Cape Town, South Africa
| | - Waldemar Moska
- Faculty of Tourism and Recreation, Gdansk University of Physical Education and Sport, Poland
| | - Ewelina Maculewicz
- Applied Physiology Unit, Military Institute of Hygiene and Epidemiology, Poland
| | - Mariusz Kaczmarczyk
- Faculty of Tourism and Recreation, Gdansk University of Physical Education and Sport, Poland
| | | | - Krzysztof Ficek
- Faculty of Physiotherapy, The Jerzy Kukuczka Academy of Physical Education in Katowice, Poland
| | - Pawel Cieszczyk
- Applied Physiology Unit, Military Institute of Hygiene and Epidemiology, Poland; Faculty of Physical Education, Gdansk University of Physical Education and Sport, Poland
| | - Alison V September
- Division of Exercise Science and Sports Medicine, Department of Human Biology, Faculty of Health Sciences, University of Cape Town, South Africa.
| | - Marek Sawczuk
- Faculty of Tourism and Recreation, Gdansk University of Physical Education and Sport, Poland
| |
|
24
|
Amosco MD, Tavera GR, Villar VAM, Naniong JMA, David-Bustamante LMG, Williams SM, Jose PA, Palmes-Saloma CP. Non-additive effects of ACVR2A in preeclampsia in a Philippine population. BMC Pregnancy Childbirth 2019; 19:11. [PMID: 30621627 PMCID: PMC6323705 DOI: 10.1186/s12884-018-2152-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/15/2018] [Accepted: 12/17/2018] [Indexed: 12/12/2022]
Abstract
BACKGROUND Multiple interrelated pathways contribute to the pathogenesis of preeclampsia, and variants in susceptibility genes may play a role among Filipinos, an ethnically distinct group with high prevalence of the disease. The objective of this study was to examine the association between variants in maternal candidate genes and the development of preeclampsia in a Philippine population. METHODS A case-control study involving 29 single nucleotide polymorphisms (SNPs) in 21 candidate genes was conducted in 150 patients with preeclampsia (cases) and 175 women with uncomplicated normal pregnancies (controls). Genotyping for the GRK4 and DRD1 gene variants was carried out using the TaqMan Assay, and all other variants were assayed using the Sequenom MassARRAY Iplex Platform. PLINK was used for SNP association testing. Multilocus association analysis was performed using multifactor dimensionality reduction (MDR) analysis. RESULTS Among the clinical factors, older age (P < 1 × 10^-4), higher BMI (P < 1 × 10^-4), having a new partner (P = 0.006), and increased time interval from previous pregnancy (P = 0.018) were associated with preeclampsia. The MDR algorithm identified the genetic variant ACVR2A rs1014064 as interacting with age and BMI in association with preeclampsia among Filipino women. CONCLUSIONS The MDR algorithm identified an interaction between age, BMI and ACVR2A rs1014064, indicating that context among genetic variants and demographic/clinical factors may be crucial to understanding the pathogenesis of preeclampsia among Filipino women.
Affiliation(s)
- Melissa D. Amosco
- National Institute of Molecular Biology and Biotechnology, National Science Complex, University of the Philippines, Diliman, 1101 Quezon City, Philippines
- Department of Obstetrics and Gynecology, Philippine General Hospital - University of the Philippines, Taft Avenue, 1000 Manila, Philippines
| | - Gloria R. Tavera
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, School of Medicine, Cleveland, OH 44106 USA
| | - Van Anthony M. Villar
- Division of Renal Diseases & Hypertension, Department of Medicine, The George Washington University of School of Medicine & Health Sciences, Washington, DC, 20037 USA
| | - Justin Michael A. Naniong
- National Institute of Molecular Biology and Biotechnology, National Science Complex, University of the Philippines, Diliman, 1101 Quezon City, Philippines
| | - Lara Marie G. David-Bustamante
- Department of Obstetrics and Gynecology, Philippine General Hospital - University of the Philippines, Taft Avenue, 1000 Manila, Philippines
| | - Scott M. Williams
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, School of Medicine, Cleveland, OH 44106 USA
| | - Pedro A. Jose
- Division of Renal Diseases & Hypertension, Department of Medicine, The George Washington University of School of Medicine & Health Sciences, Washington, DC, 20037 USA
- Department of Pharmacology and Physiology, The George Washington University of School of Medicine & Health Sciences, Washington, DC, 20037 USA
| | - Cynthia P. Palmes-Saloma
- National Institute of Molecular Biology and Biotechnology, National Science Complex, University of the Philippines, Diliman, 1101 Quezon City, Philippines
- Philippine Genome Center, National Science Complex, University of the Philippines, Diliman, 1101 Quezon City, Philippines
| |
|
25
|
Lee S, Son D, Kim Y, Yu W, Park T. Unified Cox model based multifactor dimensionality reduction method for gene-gene interaction analysis of the survival phenotype. BioData Min 2018; 11:27. [PMID: 30564286 PMCID: PMC6295107 DOI: 10.1186/s13040-018-0189-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/05/2018] [Accepted: 11/26/2018] [Indexed: 12/04/2022]
Abstract
Background One strategy for addressing missing heritability in genome-wide association studies is gene-gene interaction analysis, which, unlike a single gene approach, involves high dimensionality. The multifactor dimensionality reduction method (MDR) has been widely applied to reduce multi-levels of genotypes into high or low risk groups. The Cox-MDR method has been proposed to detect gene-gene interactions associated with the survival phenotype by using the martingale residuals from a Cox model. However, this method requires a cross-validation procedure to find the best SNP pair among all possible pairs, and a permutation procedure must follow to assess the significance of gene-gene interactions. Recently, the unified model based multifactor dimensionality reduction method (UM-MDR) has been proposed to unify the significance testing with the MDR algorithm within the regression model framework, in which neither cross-validation nor permutation testing is needed. In this paper, we proposed a simple approach, called Cox UM-MDR, which combines Cox-MDR with the key procedure of UM-MDR to identify gene-gene interactions associated with the survival phenotype. Results The simulation study was performed to compare Cox UM-MDR with Cox-MDR with and without the marginal effects of SNPs. We found that Cox UM-MDR has similar power to Cox-MDR without marginal effects, whereas it outperforms Cox-MDR with marginal effects and is more robust to heavy censoring. We also applied Cox UM-MDR to a dataset of leukemia patients and detected gene-gene interactions with regard to the survival time. Conclusion Cox UM-MDR is easily implemented by combining Cox-MDR with UM-MDR to detect the significant gene-gene interactions associated with the survival time without cross-validation and permutation testing. The simulation results demonstrate the utility of the proposed method, which achieves at least the same power as Cox-MDR in most scenarios, and outperforms Cox-MDR when some SNPs having only marginal effects might mask the detection of the causal epistasis.
Affiliation(s)
- Seungyeoun Lee
- 1Department of Mathematics and Statistics, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul, 05006 South Korea
| | - Donghee Son
- 1Department of Mathematics and Statistics, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul, 05006 South Korea
| | - Yongkang Kim
- 2Department of Statistics, Seoul National University, Shilim-dong, Kwanak-gu, Seoul, 151-742 South Korea
| | - Wenbao Yu
- 3Division of Oncology and Centre for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA 19104 USA
| | - Taesung Park
- 2Department of Statistics, Seoul National University, Shilim-dong, Kwanak-gu, Seoul, 151-742 South Korea
| |
|
26
|
Zhou X, Chan KCC. Detecting gene-gene interactions for complex quantitative traits using generalized fuzzy classification. BMC Bioinformatics 2018; 19:329. [PMID: 30227829 PMCID: PMC6145205 DOI: 10.1186/s12859-018-2361-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/27/2017] [Accepted: 09/09/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Quantitative traits or continuous outcomes related to complex diseases can provide more information and therefore more accurate analysis for identifying gene-gene and gene-environment interactions associated with complex diseases. Multifactor Dimensionality Reduction (MDR) was originally proposed to identify gene-gene and gene-environment interactions associated with the binary status of complex diseases. Some efforts have been made to extend it to quantitative traits (QTs) and ordinal traits. However, these and other methods are still not computationally efficient or effective. RESULTS Generalized Fuzzy Quantitative trait MDR (GFQMDR) is proposed in this paper to strengthen identification of gene-gene interactions associated with a quantitative trait by first transforming it to an ordinal trait and then selecting the best sets of genetic markers, mainly single nucleotide polymorphisms (SNPs) or simple sequence length polymorphic markers (SSLPs), as having strong association with the trait through generalized fuzzy classification using extended membership functions. Experimental results on simulated and real datasets show that our algorithm achieves a better success rate, classification accuracy, and consistency in identifying gene-gene interactions associated with QTs. CONCLUSION The proposed algorithm provides a more effective way to identify gene-gene interactions associated with quantitative traits.
Affiliation(s)
- Xiangdong Zhou
- College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian China
| | - Keith C. C. Chan
- Department of Computing, the Hong Kong Polytechnic University, Kowloon, Hong Kong China
| |
|
27
|
Gronek P, Gronek J, Lulińska-Kuklik E, Spieszny M, Niewczas M, Kaczmarczyk M, Petr M, Fischerova P, Ahmetov II, Żmijewski P. Polygenic Study of Endurance-Associated Genetic Markers NOS3 (Glu298Asp), BDKRB2 (-9/+9), UCP2 (Ala55Val), AMPD1 (Gln45Ter) and ACE (I/D) in Polish Male Half Marathoners. J Hum Kinet 2018; 64:87-98. [PMID: 30429902 PMCID: PMC6231335 DOI: 10.1515/hukin-2017-0204] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022] Open
Abstract
The purpose of this study was to investigate individually and in combination the association between the ACE (I/D), NOS3 (Glu298Asp), BDKRB2 (-9/+9), UCP2 (Ala55Val) and AMPD1 (Gln45Ter) variants with endurance performance in a large, performance-homogenous cohort of elite Polish half marathoners. The study group consisted of 180 elite half marathoners: 76 with time < 100 minutes and 104 with time > 100 minutes. DNA of the subjects was extracted from buccal cells donated by the runners and genotyping was carried out using an allelic discrimination assay with a C1000 Touch Thermal Cycler (Bio-Rad, Germany) instrument with TaqMan® probes (NOS3, UCP2, and AMPD1) and a T100™ Thermal Cycler (Bio-Rad, Germany) instrument (ACE and BDKRB2). We found that the UCP2 Ala55Val polymorphism was associated with running performance, with the subjects carrying the Val allele being overrepresented in the group of most successful runners (<100 min) compared to the >100 min group (84.2 vs. 55.8%; OR = 4.23, p < 0.0001). Next, to assess the combined impact of 4 gene polymorphisms, all athletes were classified according to the number of 'endurance' alleles (ACE I, NOS3 Glu, BDKRB2 -9, UCP2 Val) they possessed. The proportion of subjects with a high (4-7) number of 'endurance' alleles was greater in the better half marathoners group compared with the >100 min group (73.7 vs. 51.9%; OR = 2.6, p = 0.0034). These data suggest that the likelihood of becoming an elite half marathoner partly depends on the carriage of a high number of endurance-related alleles.
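The two odds ratios reported above can be reproduced from the group sizes and percentages given in the abstract. The sketch below is only an illustration: the carrier counts are not stated in the abstract and are back-calculated here by rounding the reported percentages, which is an assumption, and the function name `odds_ratio` is hypothetical.

```python
def odds_ratio(exposed_a, unexposed_a, exposed_b, unexposed_b):
    """Odds ratio of a 2x2 table: odds in group A divided by odds in group B."""
    return (exposed_a * unexposed_b) / (unexposed_a * exposed_b)


# Group sizes as reported: 76 runners under 100 min, 104 runners over 100 min.
fast_n, slow_n = 76, 104

# ASSUMPTION: counts inferred by rounding the reported percentages.
val_fast = round(0.842 * fast_n)   # UCP2 Val carriers, fast group
val_slow = round(0.558 * slow_n)   # UCP2 Val carriers, slow group
or_ucp2 = odds_ratio(val_fast, fast_n - val_fast, val_slow, slow_n - val_slow)

high_fast = round(0.737 * fast_n)  # carriers of 4-7 'endurance' alleles, fast group
high_slow = round(0.519 * slow_n)  # carriers of 4-7 'endurance' alleles, slow group
or_alleles = odds_ratio(high_fast, fast_n - high_fast,
                        high_slow, slow_n - high_slow)

print(round(or_ucp2, 2), round(or_alleles, 1))
```

Under these inferred counts, the computed values agree with the reported OR = 4.23 and OR = 2.6 after rounding.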
Affiliation(s)
- Piotr Gronek
- Laboratory of Genetics, Department of Gymnastics and Dance, University School of Physical Education in Poznań, Poznań, Poland
| | - Joanna Gronek
- Laboratory of Genetics, Department of Gymnastics and Dance, University School of Physical Education in Poznań, Poznań, Poland
| | - Ewelina Lulińska-Kuklik
- Department of Tourism and Recreation, University of Physical Education and Sport, Gdańsk, Poland
| | - Michał Spieszny
- Institute of Sports, Faculty of Physical Education and Sports, University of Physical Education, Krakow, Poland
| | - Marta Niewczas
- Faculty of Physical Education, University of Rzeszów, Rzeszów, Poland
| | - Mariusz Kaczmarczyk
- Department of Tourism and Recreation, University of Physical Education and Sport, Gdańsk, Poland
| | - Miroslav Petr
- Department of Sport Games, Charles University in Prague, Prague, Czech Republic
| | - Patricia Fischerova
- Department of Methodology, Statistics and Informatics, J. Kukuczka Academy of Physical Education in Katowice, Katowice, Poland
| | - Ildus I. Ahmetov
- Laboratory of Molecular Genetics, Kazan State Medical University, Kazan, Russia
| | - Piotr Żmijewski
- Faculty of Medicine, University of Information Technology and Management in Rzeszow, Rzeszow, Poland
| |
|
28
|
Leońska-Duniec A, Jastrzębski Z, Jażdżewska A, Krzysztof F, Cięszczyk P. Leptin and Leptin Receptor Genes Are Associated With Obesity-Related Traits Changes in Response to Aerobic Training Program. J Strength Cond Res 2018; 32:1036-1044. [PMID: 29373433 DOI: 10.1519/jsc.0000000000002447] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022]
Abstract
Leońska-Duniec, A, Jastrzębski, Z, Jażdżewska, A, Krzysztof, F, and Cięszczyk, P. Leptin and leptin receptor genes are associated with obesity-related traits changes in response to aerobic training program. J Strength Cond Res 32(4): 1036-1044, 2018. Leptin (LEP) and leptin receptor (LEPR) genes have been studied for their potential association with the development of human obesity and its related complications. We therefore examined whether selected body mass, body composition, and metabolic variables observed in physically active participants are modulated by the polymorphisms. The genotype distribution was examined in a group of 201 women measured for chosen traits before and after the completion of a 12-week aerobic training program. Our results revealed a significant interaction between training and LEP genotype for glucose level. A training-related decrease in plasma glucose concentration in the LEP AG heterozygotes differed significantly from the change in the homozygotes. The polymorphism was also associated with fat-free mass (FFM), total body water (TBW), total cholesterol, triglycerides, and low-density lipoprotein cholesterol (LDL-C) levels. Another finding was a significant interaction between training and LEPR for LDL-C level. As opposed to AG and GG, AA homozygotes demonstrated a training-related decrease in LDL-C level. Our findings also showed that the LEPR G allele is connected with obesity-related traits. The participants with the GG genotype had higher body mass, body mass index (BMI), FFM, and TBW during the entire study period. This study provides evidence that polymorphisms in the LEP and LEPR genes are associated with the magnitude of the effects of regular physical activity on glucose and LDL-C levels, respectively. In addition, we found an association of the G allele of the LEPR polymorphism with body mass and BMI.
Affiliation(s)
- Agata Leońska-Duniec
- Faculty of Physical Culture and Health Promotion, Department of Biological Basics of Physical Culture, University of Szczecin, Szczecin, Poland.,Faculty of Tourism and Recreation, Department of Health Promotion, Gdansk University of Physical Education and Sport, Gdańsk, Poland
| | - Zbigniew Jastrzębski
- Faculty of Tourism and Recreation, Department of Health Promotion, Gdansk University of Physical Education and Sport, Gdańsk, Poland
| | - Aleksandra Jażdżewska
- Faculty of Tourism and Recreation, Department of Health Promotion, Gdansk University of Physical Education and Sport, Gdańsk, Poland
| | - Ficek Krzysztof
- Faculty of Physiotherapy, Department of Physiotherapy Basics, The Jerzy Kukuczka Academy of Physical Education in Katowice, Katowice, Poland.,Galen-Orthopaedics, Bierun, Poland
| | - Paweł Cięszczyk
- Faculty of Physical Education, Department of Natural Sciences, Gdansk University of Physical Education and Sport, Gdańsk, Poland
| |
|
29
|
Yang CH, Lin YD, Chuang LY. Multiple-Criteria Decision Analysis-Based Multifactor Dimensionality Reduction for Detecting Gene-Gene Interactions. IEEE J Biomed Health Inform 2018; 23:416-426. [PMID: 29993963 DOI: 10.1109/jbhi.2018.2790951] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022]
Abstract
Gene-gene interactions (GGIs) are important markers for determining susceptibility to a disease. Multifactor dimensionality reduction (MDR) is a popular algorithm for detecting GGIs and primarily adopts the correct classification rate (CCR) to assess the quality of a GGI. However, CCR measurement alone may not successfully detect certain GGIs because of potential model preferences and disease complexities. In this study, multiple-criteria decision analysis (MCDA) based on MDR was named MCDA-MDR and proposed for detecting GGIs. MCDA facilitates MDR to simultaneously adopt multiple measures within the two-way contingency table of MDR to assess GGIs; the CCR and rule utility measure were employed. Cross-validation consistency was adopted to determine the most favorable GGIs among the Pareto sets. Simulation studies were conducted to compare the detection success rates of the MDR-only-based measure and MCDA-MDR, revealing that MCDA-MDR had superior detection success rates. The Wellcome Trust Case Control Consortium dataset was analyzed using MCDA-MDR to detect GGIs associated with coronary artery disease, and MCDA-MDR successfully detected numerous significant GGIs (p < 0.001). MCDA-MDR performance assessment revealed that the applied MCDA successfully enhanced the GGI detection success rate of the MDR-based method compared with MDR alone.
|
30
|
Jung HY, Leem S, Park T. Fuzzy set-based generalized multifactor dimensionality reduction analysis of gene-gene interactions. BMC Med Genomics 2018; 11:32. [PMID: 29697366 PMCID: PMC5918459 DOI: 10.1186/s12920-018-0343-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND Gene-gene interactions (GGIs) are a known cause of missing heritability. Multifactor dimensionality reduction (MDR) is one of the most commonly used methods for GGI detection. The generalized multifactor dimensionality reduction (GMDR) method is an extension of the MDR method that is applicable to various types of traits and allows covariate adjustments. Our previous Fuzzy MDR (FMDR) is another extension, designed to overcome the limits of simple binary classification. FMDR uses continuous membership values instead of the binary membership values 0 and 1, which improves power for detecting causal SNPs and allows more intuitive interpretation in real data analysis. Here, we propose the fuzzy generalized multifactor dimensionality reduction (FGMDR) method, which combines fuzzy set-based analysis with the GMDR method, to detect GGIs associated with diseases using fuzzy set theory. RESULTS In simulation studies for different types of traits, the proposed FGMDR showed a higher detection ratio of causal SNPs than GMDR. We then applied FGMDR to two real datasets: Crohn's disease (CD) data from the Wellcome Trust Case Control Consortium (WTCCC) with a binary phenotype, and Homeostasis Model Assessment of Insulin Resistance (HOMA-IR) data from a Korean population with a continuous phenotype. The interactions derived by our method include previously reported interactions associated with the phenotypes. CONCLUSIONS The proposed FGMDR performs well for GGI detection with covariate adjustments. The program written in R for FGMDR is available at http://statgen.snu.ac.kr/software/FGMDR .
Affiliation(s)
- Hye-Young Jung
- Faculty of Liberal Education, Seoul National University, Seoul, 08826 South Korea
| | - Sangseob Leem
- Department of Statistics, Seoul National University, Seoul, 08826 South Korea
| | - Taesung Park
- Department of Statistics, Seoul National University, Seoul, 08826 South Korea
| |
|
31
|
Yu W, Lee S, Park T. A unified model based multifactor dimensionality reduction framework for detecting gene-gene interactions. Bioinformatics 2017; 32:i605-i610. [PMID: 27587680 DOI: 10.1093/bioinformatics/btw424] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Gene-gene interaction (GGI) analysis is one of the most popular approaches for finding and explaining the missing heritability of common complex traits in genome-wide association studies. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGI effects. However, the existing MDR-based approaches have several disadvantages, such as the lack of an efficient way of evaluating the significance of multi-locus models and the high computational burden due to intensive permutation. Furthermore, the MDR method does not distinguish marginal effects from pure interaction effects. METHODS We propose a two-step unified model based MDR approach (UM-MDR), in which the significance of a multi-locus model, even a high-order model, can be easily obtained through a regression framework with a semi-parametric correction procedure for controlling Type I error rates. In comparison to the conventional permutation approach, the proposed semi-parametric correction procedure avoids heavy computation when assessing the significance of a multi-locus model. The proposed UM-MDR approach is flexible in the sense that it can incorporate different types of traits and evaluate the significance of the existing MDR extensions. RESULTS Simulation studies and the analysis of a real example are provided to demonstrate the utility of the proposed method. UM-MDR achieves at least the same power as MDR in most scenarios, and it outperforms MDR especially when some single nucleotide polymorphisms have only marginal effects, which mask the detection of causal epistasis for the existing MDR approaches. CONCLUSIONS UM-MDR is a very good supplement to the existing MDR method because of its efficiency in assessing significance for every multi-locus model, its power, and its flexibility in handling different types of traits.
AVAILABILITY AND IMPLEMENTATION An R package "umMDR" and other source code are freely available at http://statgen.snu.ac.kr/software/umMDR/ CONTACT tspark@stats.snu.ac.kr SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wenbao Yu
- Department of Statistics, Seoul National University, Shilim-Dong, Kwanak-Gu, Seoul 151-742, Korea
| | - Seungyeoun Lee
- Department of Mathematics and Statistics, Sejong University, Seoul 143-747, Korea
| | - Taesung Park
- Department of Statistics, Seoul National University, Shilim-Dong, Kwanak-Gu, Seoul 151-742, Korea
| |
|
32
|
Moore JH, Andrews PC, Olson RS, Carlson SE, Larock CR, Bulhoes MJ, O'Connor JP, Greytak EM, Armentrout SL. Grid-based stochastic search for hierarchical gene-gene interactions in population-based genetic studies of common human diseases. BioData Min 2017; 10:19. [PMID: 28572842 PMCID: PMC5450417 DOI: 10.1186/s13040-017-0139-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 12/15/2016] [Accepted: 05/18/2017] [Indexed: 11/18/2022] Open
Abstract
Background Large-scale genetic studies of common human diseases have focused almost exclusively on the independent main effects of single-nucleotide polymorphisms (SNPs) on disease susceptibility. These studies have had some success, but much of the genetic architecture of common disease remains unexplained. Attention is now turning to detecting SNPs that impact disease susceptibility in the context of other genetic factors and environmental exposures. These context-dependent genetic effects can manifest themselves as non-additive interactions, which are more challenging to model using parametric statistical approaches. The dimensionality that results from a multitude of genotype combinations, which results from considering many SNPs simultaneously, renders these approaches underpowered. We previously developed the multifactor dimensionality reduction (MDR) approach as a nonparametric and genetic model-free machine learning alternative. Approaches such as MDR can improve the power to detect gene-gene interactions but are limited in their ability to exhaustively consider SNP combinations in genome-wide association studies (GWAS), due to the combinatorial explosion of the search space. We introduce here a stochastic search algorithm called Crush for the application of MDR to modeling high-order gene-gene interactions in genome-wide data. The Crush-MDR approach uses expert knowledge to guide probabilistic searches within a framework that capitalizes on the use of biological knowledge to filter gene sets prior to analysis. Here we evaluated the ability of Crush-MDR to detect hierarchical sets of interacting SNPs using a biology-based simulation strategy that assumes non-additive interactions within genes and additivity in genetic effects between sets of genes within a biochemical pathway. 
Results We show that Crush-MDR is able to identify genetic effects at the gene or pathway level significantly better than a baseline random search with the same number of model evaluations. We then applied the same methodology to a GWAS for Alzheimer’s disease and showed base level validation that Crush-MDR was able to identify a set of interacting genes with biological ties to Alzheimer’s disease. Conclusions We discuss the role of stochastic search and cloud computing for detecting complex genetic effects in genome-wide data.
Affiliation(s)
- Jason H Moore
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104 PA USA
| | - Peter C Andrews
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104 PA USA
| | - Randal S Olson
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104 PA USA
| |
|
33
|
Ciesielski TH, Aldrich MC, Marsit CJ, Hiatt RA, Williams SM. Transdisciplinary approaches enhance the production of translational knowledge. Transl Res 2017; 182:123-134. [PMID: 27893987 PMCID: PMC5362296 DOI: 10.1016/j.trsl.2016.11.002] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 03/03/2016] [Revised: 10/12/2016] [Accepted: 11/02/2016] [Indexed: 12/28/2022]
Abstract
The primary goal of translational research is to generate and apply knowledge that can improve human health. Although research conducted within the confines of a single discipline has helped us to achieve this goal in many settings, this unidisciplinary approach may not be optimal when disease causation is complex and health decisions are pressing. To address these issues, we suggest that transdisciplinary approaches can facilitate the progress of translational research, and we review publications that demonstrate what these approaches can look like. These examples serve to (1) demonstrate why transdisciplinary research is useful, and (2) stimulate a conversation about how it can be further promoted. While we note that open-minded communication is a prerequisite for germinating any transdisciplinary work and that epidemiologists can play a key role in promoting it, we do not propose a rigid protocol for conducting transdisciplinary research, as one really does not exist. These achievements were developed in settings where typical disciplinary and institutional barriers were surmountable, but they were not accomplished with a single predetermined plan. The benefits of cross-disciplinary communication are hard to predict a priori and a detailed research protocol or process may impede the realization of novel and important insights. Overall, these examples demonstrate that enhanced cross-disciplinary information exchange can serve as a starting point that helps researchers frame better questions, integrate more relevant evidence, and advance translational knowledge more effectively. Specifically, we discuss examples where transdisciplinary approaches are helping us to better explore, assess, and intervene to improve human health.
Affiliation(s)
- Timothy H Ciesielski
- Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, NH; Department of Genetics, Geisel School of Medicine at Dartmouth, Hanover, NH; Public Health Program, Regis College, Weston, Mass.
| | - Melinda C Aldrich
- Department of Thoracic Surgery, Vanderbilt University Medical Center, Nashville, Tenn; Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tenn
| | - Carmen J Marsit
- Department of Pharmacology and Toxicology, Geisel School of Medicine at Dartmouth, Hanover, NH; Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH
| | - Robert A Hiatt
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, Calif
| | - Scott M Williams
- Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, NH; Department of Genetics, Geisel School of Medicine at Dartmouth, Hanover, NH
| |
|
34
|
Abstract
BACKGROUND Detection of gene-gene interaction (GGI) is a key challenge in solving the problem of missing heritability in genetics. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGIs. MDR reduces the dimensionality of multiple factors by means of binary classification into high-risk (H) or low-risk (L) groups. Unfortunately, this simple binary classification does not reflect the uncertainty of H/L classification. Thus, we proposed Fuzzy MDR to overcome the limitations of binary classification by introducing the degree of membership in two fuzzy sets, H and L. While Fuzzy MDR demonstrated higher power than MDR, its performance is highly dependent on several tuning parameters. In real applications, it is not easy to choose appropriate tuning parameter values. RESULT In this work, we propose an empirical fuzzy MDR (EF-MDR), which does not require specifying tuning parameter values. Instead, the membership degree is estimated directly from the data. In EF-MDR, the membership degree is estimated by the maximum likelihood estimator of the proportion of cases (controls) in each genotype combination. We also show that the balanced accuracy measure derived from this new membership function is a linear function of the standard chi-square statistic. This relationship allows us to perform the standard significance test using p-values in the MDR framework without permutation. Through two simulation studies, the power of the proposed EF-MDR is shown to be higher than that of MDR and Fuzzy MDR. We illustrate the proposed EF-MDR by analyzing Crohn's disease (CD) and bipolar disorder (BD) in the Wellcome Trust Case Control Consortium (WTCCC) dataset. CONCLUSION We propose an empirical Fuzzy MDR for detecting GGI using the maximum likelihood estimate of the proportion of cases (controls) as the membership degree of the genotype combination.
The program written in R for EF-MDR is available at http://statgen.snu.ac.kr/software/EF-MDR .
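The membership estimate described in this abstract, the proportion of cases in each genotype combination, can be sketched in a few lines. This is a minimal illustration and not the authors' R program: the function name `membership_degrees`, the 0/1/2 genotype coding, and the toy data are all assumptions, and the balanced accuracy and chi-square steps of EF-MDR are not reproduced here.

```python
from collections import defaultdict


def membership_degrees(snp_a, snp_b, is_case):
    """Proportion of cases in each two-locus genotype cell.

    For a binomial count this proportion is the maximum likelihood
    estimate; it is read here as the degree of membership in the
    high-risk set H (1 minus the value gives membership in L).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [cases, total]
    for a, b, y in zip(snp_a, snp_b, is_case):
        cell = counts[(a, b)]
        cell[0] += int(y)
        cell[1] += 1
    return {cell: cases / total for cell, (cases, total) in counts.items()}


# Hypothetical toy data: genotypes coded 0/1/2, case status coded 1/0.
toy_a = [0, 0, 0, 1, 1, 1]
toy_b = [0, 0, 0, 1, 1, 1]
toy_y = [1, 1, 0, 0, 0, 1]
degrees = membership_degrees(toy_a, toy_b, toy_y)
print(degrees)
```

In this toy input, two of the three subjects in cell (0, 0) are cases and one of the three in cell (1, 1) is a case, so the cells receive partial rather than all-or-nothing membership in H.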
Affiliation(s)
- Sangseob Leem
- Department of Statistics, Seoul National University, Seoul, 08826 South Korea
| | - Taesung Park
- Department of Statistics, Seoul National University, Seoul, 08826 South Korea
| |
|
35
|
Identifying gene-gene interactions that are highly associated with four quantitative lipid traits across multiple cohorts. Hum Genet 2016; 136:165-178. [PMID: 27848076 DOI: 10.1007/s00439-016-1738-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/25/2016] [Accepted: 10/07/2016] [Indexed: 10/20/2022]
Abstract
Genetic loci explain only 25-30% of the heritability observed in plasma lipid traits. Epistasis, or gene-gene interaction, may contribute to a portion of this missing heritability. Using the genetic data from five NHLBI cohorts of 24,837 individuals, we combined the quantitative multifactor dimensionality reduction (QMDR) algorithm with two SNP-filtering methods to exhaustively search for SNP-SNP interactions that are associated with HDL cholesterol (HDL-C), LDL cholesterol (LDL-C), total cholesterol (TC) and triglycerides (TG). SNPs were filtered either on the strength of their independent effects (main effect filter) or on the prior knowledge supporting a given interaction (Biofilter). After the main effect filter, QMDR identified 20 SNP-SNP models associated with HDL-C, 6 associated with LDL-C, 3 associated with TC, and 10 associated with TG (permutation P value <0.05). With the use of Biofilter, we identified 2 SNP-SNP models associated with HDL-C, 3 associated with LDL-C, 1 associated with TC and 8 associated with TG (permutation P value <0.05). In an independent dataset of 7502 individuals from the eMERGE network, we replicated 14 of the interactions identified after main effect filtering: 11 for HDL-C, 1 for LDL-C and 2 for TG. We also replicated 23 of the interactions found to be associated with TG after applying Biofilter. Prior knowledge supports the possible role of these interactions in the genetic etiology of lipid traits. This study also presents a computationally efficient pipeline for analyzing data from large genotyping arrays and detecting SNP-SNP interactions that are not primarily driven by strong main effects.
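The abstract does not spell out how QMDR scores a SNP pair. As commonly described, QMDR labels each genotype cell high or low by comparing the cell's mean trait value with the overall mean, then compares the pooled high and low groups with a t-statistic. The sketch below follows that description under stated assumptions: the function name `qmdr_t_statistic` is hypothetical, the Welch (unequal variance) form of the t-statistic is a choice made here, and cross-validation and permutation testing are omitted.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def qmdr_t_statistic(snp_a, snp_b, trait):
    """Score one SNP pair for a quantitative trait, QMDR-style.

    Each genotype cell is 'high' if its mean trait value exceeds the
    overall mean, otherwise 'low'. The two pooled groups are compared
    with a Welch t-statistic (ASSUMPTION: Welch form; each group needs
    at least two values and a nonzero variance).
    """
    overall = mean(trait)
    cells = defaultdict(list)
    for a, b, y in zip(snp_a, snp_b, trait):
        cells[(a, b)].append(y)
    high_cells = {c for c, ys in cells.items() if mean(ys) > overall}
    high = [y for c, ys in cells.items() if c in high_cells for y in ys]
    low = [y for c, ys in cells.items() if c not in high_cells for y in ys]
    se = sqrt(variance(high) / len(high) + variance(low) / len(low))
    return high_cells, (mean(high) - mean(low)) / se


# Hypothetical toy data: genotypes coded 0/1/2 and a continuous trait.
toy_a = [0, 0, 0, 1, 1, 1]
toy_b = [0, 0, 0, 0, 0, 0]
toy_trait = [1, 2, 3, 7, 8, 9]
high_cells, t_stat = qmdr_t_statistic(toy_a, toy_b, toy_trait)
print(high_cells, round(t_stat, 3))
```

In an exhaustive search such as the one the abstract describes, a score of this kind would be computed for every candidate SNP pair that survives filtering.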
|
36
|
Ayati M, Koyutürk M. PoCos: Population Covering Locus Sets for Risk Assessment in Complex Diseases. PLoS Comput Biol 2016; 12:e1005195. [PMID: 27835645 PMCID: PMC5105987 DOI: 10.1371/journal.pcbi.1005195] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/19/2016] [Accepted: 10/11/2016] [Indexed: 12/17/2022] Open
Abstract
Susceptibility loci identified by GWAS generally account for a limited fraction of heritability. Predictive models based on identified loci also have modest success in risk assessment and therefore are of limited practical use. Many methods have been developed to overcome these limitations by incorporating prior biological knowledge. However, most of the information utilized by these methods is at the level of genes, limiting analyses to variants that are in or proximate to coding regions. We propose a new method that integrates protein-protein interaction (PPI) as well as expression quantitative trait loci (eQTL) data to identify sets of functionally related loci that are collectively associated with a trait of interest. We call such sets of loci “population covering locus sets” (PoCos). The contributions of the proposed approach are three-fold: 1) We consider all possible genotype models for each locus, thereby enabling identification of combinatorial relationships between multiple loci. 2) We develop a framework for the integration of PPI and eQTL into a heterogeneous network model, enabling efficient identification of functionally related variants that are associated with the disease. 3) We develop a novel method to integrate the genotypes of multiple loci in a PoCo into a representative genotype to be used in risk assessment. We test the proposed framework in the context of risk assessment for seven complex diseases: type 1 diabetes (T1D), type 2 diabetes (T2D), psoriasis (PS), bipolar disorder (BD), coronary artery disease (CAD), hypertension (HT), and multiple sclerosis (MS). Our results show that the proposed method significantly outperforms individual variant based risk assessment models as well as the state-of-the-art polygenic score. We also show that incorporation of eQTL data improves the performance of identified PoCos in risk assessment. We also assess the biological relevance of PoCos for three diseases that have similar biological mechanisms and identify novel candidate genes. The resulting software is publicly available at http://compbio.case.edu/pocos/.
Several studies try to predict individual disease risk using genetic data obtained from genome-wide association studies (GWAS). Earlier studies focus only on individual genetic variants. However, studies on disease mechanisms suggest that the aggregation of genomic variants may contribute to diseases. For this reason, researchers commonly use prior biological knowledge to identify genetic variants that are functionally related. These approaches are often limited to variants in the coding regions of genes, even though several risk variants lie in regulatory regions. Here, we incorporate known regulatory and functional interactions to find sets of genetic variants that are informative features for risk assessment. Our results on seven complex diseases show that our method outperforms individual variant based risk assessment models, as well as other methods that integrate multiple genetic variants.
Affiliation(s)
- Marzieh Ayati
- Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, Ohio, United States of America
- Mehmet Koyutürk
- Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, Ohio, United States of America
- Center of Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, Ohio, United States of America
|
37
|
A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction. Comput Biol Chem 2016; 65:193-202. [PMID: 27765491 DOI: 10.1016/j.compbiolchem.2016.09.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 08/28/2016] [Accepted: 09/07/2016] [Indexed: 11/20/2022]
Abstract
BACKGROUND Gene-gene interaction (GGI) analysis is one of the most popular approaches for finding the missing heritability of common complex traits in genetic association studies. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGIs. In order to identify the best interaction model associated with disease susceptibility, MDR compares all possible genotype combinations in terms of their predictability of disease status from a simple binary high (H) and low (L) risk classification. However, this simple binary classification does not reflect the uncertainty of H/L classification. METHODS We regard classifying H/L as equivalent to defining the degree of membership of two risk groups H/L. By adopting fuzzy set theory, we propose Fuzzy MDR, which takes into account the uncertainty of H/L classification. Fuzzy MDR allows the possibility of partial membership of H/L through a membership function which transforms the degree of uncertainty into a [0,1] scale. The best genotype combination is selected as the one that maximizes a new fuzzy set based accuracy measure. RESULTS Two simulation studies are conducted to compare the power of the proposed Fuzzy MDR with that of MDR. Our results show that Fuzzy MDR has higher power than MDR. We illustrate the proposed Fuzzy MDR by analysing the bipolar disorder (BD) trait of the WTCCC dataset to detect GGI associated with BD. CONCLUSIONS We propose a novel Fuzzy MDR method to detect gene-gene interaction by taking into account the uncertainty of H/L classification and show that it has higher power than MDR. Fuzzy MDR can be easily extended to handle continuous phenotypes as well. The program written in R for the proposed Fuzzy MDR is available at https://statgen.snu.ac.kr/software/FuzzyMDR.
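The H/L classification described in this abstract can be illustrated with a minimal Python sketch. It contrasts the crisp MDR rule with a fuzzy membership degree. The sigmoid membership function, the 0.5 count correction, and all function names are assumptions made for illustration; the abstract does not specify the membership function used by Fuzzy MDR, and the authors' own program is written in R.

```python
import math
from collections import defaultdict

def cell_counts(genotypes, status):
    """Count cases and controls in each multi-locus genotype cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [cases, controls]
    for genotype, is_case in zip(genotypes, status):
        counts[tuple(genotype)][0 if is_case else 1] += 1
    return dict(counts)

def crisp_membership(cases, controls, threshold):
    """Classic MDR: full membership in H when the case:control ratio exceeds the threshold."""
    return 1.0 if cases > threshold * controls else 0.0

def sigmoid_membership(cases, controls, threshold, steepness=1.0):
    """Assumed fuzzy membership in H: a sigmoid of the log case:control ratio."""
    # 0.5 is added to both counts so that empty cells do not produce log(0).
    log_ratio = math.log((cases + 0.5) / (controls + 0.5)) - math.log(threshold)
    return 1.0 / (1.0 + math.exp(-steepness * log_ratio))

def balanced_accuracy(counts, membership, threshold):
    """Balanced accuracy where each cell contributes to H and L by its membership degree."""
    tp = fn = fp = tn = 0.0
    for cases, controls in counts.values():
        h = membership(cases, controls, threshold)
        tp += h * cases
        fn += (1.0 - h) * cases
        fp += h * controls
        tn += (1.0 - h) * controls
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

# Toy two-locus data with an XOR-like pattern (1 = case, 0 = control).
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
status = [1, 1, 0, 0, 0, 0, 1, 1]
counts = cell_counts(genotypes, status)
threshold = sum(status) / (len(status) - sum(status))  # overall case:control ratio
print(balanced_accuracy(counts, crisp_membership, threshold))
print(balanced_accuracy(counts, sigmoid_membership, threshold))
```

With the crisp rule every cell is forced fully into H or L, whereas the fuzzy variant lets small cells contribute only partially to either group, which is the uncertainty the abstract refers to.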
|
38
|
Rovaris DL, Aroche AP, da Silva BS, Kappel DB, Pezzi JC, Levandowski ML, Hess ARB, Schuch JB, de Almeida RMM, Grassi-Oliveira R, Bau CHD. Glucocorticoid receptor gene modulates severity of depression in women with crack cocaine addiction. Eur Neuropsychopharmacol 2016; 26:1438-1447. [PMID: 27397864 DOI: 10.1016/j.euroneuro.2016.06.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/20/2016] [Revised: 05/03/2016] [Accepted: 06/18/2016] [Indexed: 12/12/2022]
Abstract
Crack cocaine addicted inpatients who present more severe withdrawal symptoms also exhibit higher rates of depressive symptoms. There is strong evidence that the identification of genetic variants in depression is enhanced when phenotypic heterogeneity is reduced by studying selected groups. Since depression has been associated with dysregulation of the hypothalamic-pituitary-adrenal axis, this study evaluated the effects of SNPs in stress-related genes on depressive symptoms of crack cocaine addicts at early abstinence and over the detoxification treatment (4th, 11th and 18th day post admission). Also, the role of these SNPs in the re-hospitalization rates after 2.5 years of follow-up was studied. One hundred eighty-two women were enrolled and eight SNPs in four genes (NR3C2, NR3C1, FKBP5 and CRHR1) were genotyped. A significant main effect of NR3C1-rs41423247 was found, where the C minor allele increased depressive symptoms at early abstinence. This effect remained significant after 10,000 permutations to account for multiple SNPs tested (P=0.0077). There was no effect of rs41423247 on the course of detoxification treatment, but a slight effect of rs41423247 at late abstinence was detected (P=0.0463). This analysis suggests that the presence of at least one C allele is associated with worse symptoms at early abstinence, while only the CC genotype appears to increase depressive symptoms at late abstinence. Also, a slight effect of the rs41423247 C minor allele in increasing the number of re-hospitalizations after 2.5 years was found (P=0.0413). These findings are in agreement with previous studies reporting an influence of rs41423247 on sensitivity to glucocorticoids and further elucidate its resulting effects on depressive-related traits.
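The abstract reports that the main effect remained significant after 10,000 permutations accounting for the multiple SNPs tested. One common way to obtain such an adjusted P value is a max-statistic permutation test, sketched below. This is an assumption about the general technique, not the authors' procedure: the function name, the 0/1 carrier coding, and the mean-difference statistic are all hypothetical choices for illustration.

```python
import random

def max_stat_permutation_p(snps, phenotype, n_perm=10000, seed=0):
    """Family-wise adjusted p-values from a max-statistic permutation test.

    snps: dict mapping SNP name -> list of 0/1 carrier indicators.
    phenotype: list of quantitative scores, one per subject.
    """
    def stat(carrier, y):
        # Absolute difference in mean phenotype between carriers and non-carriers.
        a = [v for c, v in zip(carrier, y) if c]
        b = [v for c, v in zip(carrier, y) if not c]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = {name: stat(c, phenotype) for name, c in snps.items()}
    rng = random.Random(seed)
    shuffled = list(phenotype)
    exceed = {name: 0 for name in snps}
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        max_null = max(stat(c, shuffled) for c in snps.values())
        for name in snps:
            # Compare each observed statistic with the largest null statistic.
            if max_null >= observed[name]:
                exceed[name] += 1
    return {name: (exceed[name] + 1) / (n_perm + 1) for name in snps}

# Toy data: SNP "a" separates the phenotype, SNP "b" does not.
snps = {"a": [1] * 10 + [0] * 10, "b": [1, 0] * 10}
phenotype = [10.0 + 0.1 * i for i in range(10)] + [0.1 * i for i in range(10)]
print(max_stat_permutation_p(snps, phenotype, n_perm=2000))
```

Comparing every observed statistic with the maximum over all SNPs in each permutation is what controls the family-wise error rate across the SNPs tested.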
Affiliation(s)
- Diego L Rovaris
- Department of Genetics, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Angelita P Aroche
- Department of Genetics, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil; Health Sciences Institute, Universidade Feevale, Novo Hamburgo, Brazil
- Bruna S da Silva
- Department of Genetics, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Djenifer B Kappel
- Department of Genetics, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Júlio C Pezzi
- Postgraduate Program in Health Sciences, Universidade Federal de Ciências da Saúde de Porto Alegre, Brazil
- Mateus L Levandowski
- Developmental Cognitive Neuroscience Lab (DCNL), Post-Graduate Program in Psychology, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Brazil
- Adriana R B Hess
- Institute of Psychology, Laboratory of Experimental Psychology, Neuroscience and Behavior (LPNeC), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Jaqueline B Schuch
- Department of Genetics, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Rosa M M de Almeida
- Institute of Psychology, Laboratory of Experimental Psychology, Neuroscience and Behavior (LPNeC), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Rodrigo Grassi-Oliveira
- Developmental Cognitive Neuroscience Lab (DCNL), Post-Graduate Program in Psychology, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Brazil
- Claiton H D Bau
- Department of Genetics, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil.
|
39
|
Yee J, Kim Y, Park T, Park M. Using the Generalized Index of Dissimilarity to Detect Gene-Gene Interactions in Multi-Class Phenotypes. PLoS One 2016; 11:e0158668. [PMID: 27556585 PMCID: PMC4996517 DOI: 10.1371/journal.pone.0158668] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/01/2016] [Accepted: 06/20/2016] [Indexed: 01/11/2023]
Abstract
To find genetic association between complex diseases and phenotypic traits, one important procedure is conducting a joint analysis. Multifactor dimensionality reduction (MDR) is an efficient method of examining the interactions between genes in genetic association studies. It commonly assumes a dichotomous classification of the binary phenotypes. Its usual approach to determining the genomic association is to construct a confusion matrix to estimate a classification error, where a binary risk status is determined and assigned to each genotypic multifactor class. While multi-class phenotypes are commonly observed, the current MDR approach does not handle these phenotypes appropriately because the thresholds for the risk statuses may not be clear. In this study, we suggest a new method for estimating gene-gene interactions for multi-class phenotypes. Our approach adopts the index of dissimilarity (IDS) as an evaluation measure. This is analytically equivalent to the common association measure of balanced accuracy (BA) for the binary traits, while it is not required to determine the risk status for the estimation. Moreover, it is easily expandable to the generalized index of dissimilarity (GIDS), which has an explicit form that can handle any number of categories. The performance of the proposed method was compared with those of other approaches via simulation studies in which fifteen genetic models were generated with three class outcomes. A consistently better performance was observed using the proposed method. The effect of a varying number of categories was examined. The proposed method was also illustrated using real genome-wide association studies (GWAS) data from the Korean Association Resource (KARE) project.
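The stated equivalence between the index of dissimilarity (IDS) and balanced accuracy (BA) for binary traits can be checked numerically. The sketch below assumes the standard definition of the index of dissimilarity, half the sum of absolute differences between the case and control proportions across genotype cells, and the usual MDR rule that labels a cell high risk when its case proportion exceeds its control proportion. Under those assumptions BA = (1 + IDS) / 2. The paper's exact notation and its generalized form (GIDS) are not reproduced here, and the function names are hypothetical.

```python
def index_of_dissimilarity(cases, controls):
    """Half the sum of absolute differences between case and control cell proportions."""
    total_cases, total_controls = sum(cases), sum(controls)
    return 0.5 * sum(
        abs(a / total_cases - b / total_controls) for a, b in zip(cases, controls)
    )

def mdr_balanced_accuracy(cases, controls):
    """Balanced accuracy after labelling each cell high or low risk by the MDR rule."""
    total_cases, total_controls = sum(cases), sum(controls)
    # Sensitivity: cases falling in high-risk cells.
    sensitivity = sum(
        a for a, b in zip(cases, controls) if a / total_cases > b / total_controls
    ) / total_cases
    # Specificity: controls falling in low-risk cells.
    specificity = sum(
        b for a, b in zip(cases, controls) if a / total_cases <= b / total_controls
    ) / total_controls
    return 0.5 * (sensitivity + specificity)

# Toy counts for four genotype cells.
cases = [30, 10, 5, 15]
controls = [10, 20, 20, 10]
ids = index_of_dissimilarity(cases, controls)
ba = mdr_balanced_accuracy(cases, controls)
print(ids, ba, (1 + ids) / 2)
```

Note that the IDS computation never assigns a risk label to a cell, which is the property the abstract highlights as making the measure extensible beyond two classes.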
Affiliation(s)
- Jaeyong Yee
- Department of Physiology and Biophysics, Eulji University, Daejeon, Korea
- Yongkang Kim
- Department of Statistics, Seoul National University, Seoul, Korea
- Taesung Park
- Department of Statistics, Seoul National University, Seoul, Korea
- Mira Park
- Department of Preventive Medicine, Eulji University, Daejeon, Korea
|
40
|
Kaczmarczyk M, Loniewska B, Kuprjanowicz A, Binczak-Kuleta A, Goracy I, Ryder M, Taryma-Lesniak O, Ciechanowicz A. Association Between RET (rs1800860) and GFRA1 (rs45568534, rs8192663, rs181595401, rs7090693, and rs2694770) Variants and Kidney Size in Healthy Newborns. Genet Test Mol Biomarkers 2016; 20:624-628. [PMID: 27533506 DOI: 10.1089/gtmb.2016.0079] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
BACKGROUND Abnormal congenital nephron number has been implicated in the pathogenesis of hypertension and renal disease. The RET receptor complex propagates signals essential for nephrogenesis and the RET c.1296G>A polymorphism, leading to aberrant splicing of exon 7, is associated with reduced kidney volume, a surrogate for nephron endowment. The glial cell-derived neurotrophic factor (GDNF) family receptor alpha 1 (GFRA1) is a component of the RET receptor complex, and three alternatively spliced GFRA1 transcripts (with or without exon 5) have been identified. In rats, exclusion of exon 5 results in stronger GDNF binding affinity and RET activation. The aims of this study were to investigate further the relationship between RET c.1296G>A and kidney volume, and also to investigate the association between the GFRA1 polymorphisms near and within the alternatively spliced exon 5, as well as the functional 5'-UTR c.-193C>G with kidney volume. MATERIALS AND METHODS The study included 188 healthy full-term newborns. Genotyping of the RET (NM_020975.4:c.1296G>A, rs1800860) and GFRA1 (NM_005264.5:c.-193C>G, rs45568534; c.419-87A>G, rs8192663; c.429G>A, rs181595401; c.433+127A>G, rs7090693; c.433+245A>G, rs2694770) polymorphisms was performed using polymerase chain reaction-restriction fragment length polymorphism, minisequencing, or sequencing. Total kidney volume (TKV) was determined by ultrasound and normalized to body surface area (TKV/BSA). Both marker-by-marker and haplotype-based methods were used to test for associations between polymorphisms and TKV/BSA. RESULTS TKV/BSA in RET c.1296A allele carriers was significantly lower compared with GG homozygotes (103 ± 23 vs. 110 ± 19 mL/m², p = 0.034). c.429G>A was invariant in our sample. There was no association between any of the GFRA1 polymorphisms and renal volume. CONCLUSIONS RET c.1296A may be a common susceptibility allele for nephron underdosing-related diseases. The 5'-UTR and intronic variants near exon 5 of GFRA1 are not associated with nephron endowment.
Affiliation(s)
- Mariusz Kaczmarczyk
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University, Szczecin, Poland
- Beata Loniewska
- Department of Neonatal Diseases, Pomeranian Medical University, Szczecin, Poland
- Anna Kuprjanowicz
- Department of Radiology, Pomeranian Medical University, Szczecin, Poland
- Agnieszka Binczak-Kuleta
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University, Szczecin, Poland
- Iwona Goracy
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University, Szczecin, Poland
- Malgorzata Ryder
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University, Szczecin, Poland
- Olga Taryma-Lesniak
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University, Szczecin, Poland
- Andrzej Ciechanowicz
- Department of Clinical and Molecular Biochemistry, Pomeranian Medical University, Szczecin, Poland
|
41
|
An L, Lin Y, Yang T, Hua L. Exploring the interaction among EPHX1, GSTP1, SERPINE2, and TGFB1 contributing to the quantitative traits of chronic obstructive pulmonary disease in Chinese Han population. Hum Genomics 2016; 10:13. [PMID: 27193053 PMCID: PMC4870730 DOI: 10.1186/s40246-016-0076-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/18/2016] [Accepted: 05/10/2016] [Indexed: 12/17/2022]
Abstract
Background Currently, the majority of genetic association studies on chronic obstructive pulmonary disease (COPD) risk focus on identifying the individual effects of single nucleotide polymorphisms (SNPs) as well as their interaction effects on the disease. Conventional genetic studies often use binary disease status as the primary phenotype, but for COPD, many quantitative traits are potentially correlated with disease status and closely reflect pathological changes. Method Here, we genotyped 44 SNPs from four genes (EPHX1, GSTP1, SERPINE2, and TGFB1) in 310 patients and 203 controls from the Chinese Han population to test the two-way and three-way genetic interactions with COPD-related quantitative traits using the recently developed generalized multifactor dimensionality reduction (GMDR) and quantitative multifactor dimensionality reduction (QMDR) algorithms. Results Based on the 310 patients and the whole sample of 513 subjects, the best gene-gene interaction models were detected for four lung-function-related quantitative traits. For the forced expiratory volume in 1 s (FEV1), the best interaction was seen from EPHX1, SERPINE2, and GSTP1. For FEV1%pre, the forced vital capacity (FVC), and FEV1/FVC, the best interactions were seen from SERPINE2 and TGFB1. Conclusion The results of this study provide further evidence for the genotype combinations at risk of developing COPD in the Chinese Han population and improve the understanding of the genetic etiology of COPD and COPD-related quantitative traits. Electronic supplementary material The online version of this article (doi:10.1186/s40246-016-0076-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Li An
- Beijing Key Laboratory of Respiratory and Pulmonary Circulation Disorders, Beijing Institute of Respiratory Medicine, Department of Respiratory and Critical Care Medicine, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, 100020, China
- Yingxiang Lin
- Beijing Key Laboratory of Respiratory and Pulmonary Circulation Disorders, Beijing Institute of Respiratory Medicine, Department of Respiratory and Critical Care Medicine, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, 100020, China
- Ting Yang
- Department of Respiratory and Critical Care Medicine, China-Japan Friendship Hospital, Beijing, 100029, China
- Lin Hua
- School of Biomedical Engineering, Capital Medical University, Beijing, 100069, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, School of Biomedical Engineering, Capital Medical University, Beijing, 100069, China
|
42
|
De R, Verma SS, Drenos F, Holzinger ER, Holmes MV, Hall MA, Crosslin DR, Carrell DS, Hakonarson H, Jarvik G, Larson E, Pacheco JA, Rasmussen-Torvik LJ, Moore CB, Asselbergs FW, Moore JH, Ritchie MD, Keating BJ, Gilbert-Diamond D. Identifying gene-gene interactions that are highly associated with Body Mass Index using Quantitative Multifactor Dimensionality Reduction (QMDR). BioData Min 2015; 8:41. [PMID: 26674805 PMCID: PMC4678717 DOI: 10.1186/s13040-015-0074-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/12/2015] [Accepted: 12/04/2015] [Indexed: 11/22/2022]
Abstract
Background Despite heritability estimates of 40–70% for obesity, less than 2% of its variation is explained by the Body Mass Index (BMI) associated loci that have been identified so far. Epistasis, or gene-gene interaction, is a plausible source to explain portions of the missing heritability of BMI. Methods Using genotypic data from 18,686 individuals across five study cohorts (ARIC, CARDIA, FHS, CHS, MESA), we filtered SNPs (Single Nucleotide Polymorphisms) using two parallel approaches. SNPs were filtered either on the strength of their main effects of association with BMI, or on the number of knowledge sources supporting a specific SNP-SNP interaction in the context of BMI. Filtered SNPs were specifically analyzed for interactions that are highly associated with BMI using QMDR (Quantitative Multifactor Dimensionality Reduction). QMDR is a nonparametric, genetic model-free method that detects non-linear interactions associated with a quantitative trait. Results We identified seven novel, epistatic models with a Bonferroni corrected p-value of association < 0.1. Prior experimental evidence helps explain the plausible biological interactions highlighted within our results and their relationship with obesity. We identified interactions between genes involved in mitochondrial dysfunction (POLG2), cholesterol metabolism (SOAT2), lipid metabolism (CYP11B2), cell adhesion (EZR), cell proliferation (MAP2K5), and insulin resistance (IGF1R). Moreover, we found an 8.8% increase in the variance in BMI explained by these seven SNP-SNP interactions, beyond what is explained by the main effects of an index FTO SNP and the SNPs within these interactions. We also replicated one of these interactions and 58 proxy SNP-SNP models representing it in an independent dataset from the eMERGE study. Conclusion This study highlights a novel approach for discovering gene-gene interactions by combining methods such as QMDR with traditional statistics. Electronic supplementary material The online version of this article (doi:10.1186/s13040-015-0074-0) contains supplementary material, which is available to authorized users.
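As a rough illustration of how a QMDR-style score can pick up a non-linear interaction in a quantitative trait, the sketch below labels each genotype cell "high" or "low" by comparing its mean trait value with the overall mean, then scores the model with a two-sample t statistic between the pooled groups. This follows the general idea of QMDR as commonly described; the Welch form of the t statistic, the handling of degenerate splits, and the function name are assumptions, and cross-validation is omitted.

```python
import math
from collections import defaultdict
from statistics import mean, variance

def qmdr_t_statistic(genotypes, trait):
    """Score a multi-locus model by a t statistic between high and low genotype cells."""
    overall_mean = mean(trait)
    cells = defaultdict(list)
    for genotype, value in zip(genotypes, trait):
        cells[tuple(genotype)].append(value)
    high, low = [], []
    for values in cells.values():
        # A cell is "high" when its mean trait value exceeds the overall mean.
        (high if mean(values) > overall_mean else low).extend(values)
    if len(high) < 2 or len(low) < 2:
        return 0.0  # degenerate split: no evidence of association
    standard_error = math.sqrt(variance(high) / len(high) + variance(low) / len(low))
    if standard_error == 0:
        return float("inf")
    return (mean(high) - mean(low)) / standard_error

# Toy XOR-like trait: neither SNP has a main effect, but the pair does.
pair = [(0, 0)] * 3 + [(0, 1)] * 3 + [(1, 0)] * 3 + [(1, 1)] * 3
trait = [5, 6, 7, 1, 2, 3, 1, 2, 3, 5, 6, 7]
first_snp_only = [(g[0],) for g in pair]
print(qmdr_t_statistic(first_snp_only, trait))  # single-locus model
print(qmdr_t_statistic(pair, trait))            # two-locus model
```

In this toy example the single-locus model yields no split at all, while the two-locus model separates the trait clearly, which is the kind of pure interaction that main-effect filtering alone would miss.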
Affiliation(s)
- Rishika De
- Computational Genetics Laboratory, Department of Genetics, Geisel School of Medicine at Dartmouth, Dartmouth-Hitchcock Medical Center, 706 Rubin Building, HB7937, One Medical Center Dr, Lebanon, NH 03756 USA
- Shefali S Verma
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, 512 Wartik Laboratory, The Pennsylvania State University, University Park, PA 16802 USA
- Fotios Drenos
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, 5 University Street, London, WC1E 6JF UK
- Emily R Holzinger
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, 512 Wartik Laboratory, The Pennsylvania State University, University Park, PA 16802 USA
- Michael V Holmes
- Division of Transplant Surgery, Perelman School of Medicine, University of Pennsylvania, 3400 Spruce Street, 2 Dulles Pvln, Philadelphia, PA 19104 USA
- Molly A Hall
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, 512 Wartik Laboratory, The Pennsylvania State University, University Park, PA 16802 USA
- David R Crosslin
- Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, WA 98195-5065 USA
- David S Carrell
- Group Health Research Institute, Metropolitan Park East, 1730 Minor Avenue, Suite 1600, Seattle, WA 98101-1448 USA
- Hakon Hakonarson
- The Joseph Stokes Jr. Research Institute, The Children's Hospital of Philadelphia, Office 1016 Abramson Building, Room 1216E, 3615 Civic Center Blvd, Philadelphia, PA 19104 USA
- Gail Jarvik
- Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, WA 98195-5065 USA; Division of Medical Genetics, Department of Medicine, University of Washington, Health Sciences Building, K-253B, Medical Genetics, Box 357720, Seattle, WA 98195-7720 USA
- Eric Larson
- Group Health Research Institute, Metropolitan Park East, 1730 Minor Avenue, Suite 1600, Seattle, WA 98101-1448 USA
- Jennifer A Pacheco
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, 303 E. Superior Street, Lurie 7-125, Chicago, IL 60611 USA
- Laura J Rasmussen-Torvik
- Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, 680 N Lake Shore Drive, Suite 1400, Chicago, IL 60611 USA
- Carrie B Moore
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, 512 Wartik Laboratory, The Pennsylvania State University, University Park, PA 16802 USA; Center for Human Genetics Research, Vanderbilt University School of Medicine, 519 Light Hall, Nashville, TN 37232 USA
- Folkert W Asselbergs
- Department of Cardiology, Division Heart and Lungs, University Medical Center Utrecht, Room E03.511, P.O. Box 85500, 3508 GA Utrecht, The Netherlands; Institute of Cardiovascular Science, University College London, London, UK; Durrer Center for Cardiogenetic Research, ICIN-Netherlands Heart Institute, Utrecht, The Netherlands
- Jason H Moore
- Institute for Biomedical Informatics, The Perelman School of Medicine, University of Pennsylvania, 1418 Blockley Hall, 423 Guardian Drive, Philadelphia, PA 19104-6021 USA
- Marylyn D Ritchie
- Center for Systems Genomics, Department of Biochemistry and Molecular Biology, 512 Wartik Laboratory, The Pennsylvania State University, University Park, PA 16802 USA
- Brendan J Keating
- The Joseph Stokes Jr. Research Institute, The Children's Hospital of Philadelphia, Office 1016 Abramson Building, Room 1216E, 3615 Civic Center Blvd, Philadelphia, PA 19104 USA; University Medical Center Utrecht, Utrecht, The Netherlands
- Diane Gilbert-Diamond
- Institute for Quantitative Biomedical Sciences at Dartmouth, Hanover, NH USA; Department of Epidemiology, Geisel School of Medicine at Dartmouth, One Medical Center Drive, 7927 Rubin Building, Lebanon, NH 03756 USA
|
43
|
Niel C, Sinoquet C, Dina C, Rocheleau G. A survey about methods dedicated to epistasis detection. Front Genet 2015; 6:285. [PMID: 26442103 PMCID: PMC4564769 DOI: 10.3389/fgene.2015.00285] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 05/13/2015] [Accepted: 08/27/2015] [Indexed: 12/25/2022]
Abstract
During the past decade, findings of genome-wide association studies (GWAS) improved our knowledge and understanding of disease genetics. To date, thousands of SNPs have been associated with diseases and other complex traits. Statistical analysis typically looks for association between a phenotype and a SNP taken individually via single-locus tests. However, geneticists admit this is an oversimplified approach to tackle the complexity of underlying biological mechanisms. Interaction between SNPs, namely epistasis, must be considered. Unfortunately, epistasis detection gives rise to analytic challenges since analyzing every SNP combination is at present impractical at a genome-wide scale. In this review, we will present the main strategies recently proposed to detect epistatic interactions, along with their operating principle. Some of these methods are exhaustive, such as multifactor dimensionality reduction, likelihood ratio-based tests or receiver operating characteristic curve analysis; some are non-exhaustive, such as machine learning techniques (random forests, Bayesian networks) or combinatorial optimization approaches (ant colony optimization, computational evolution system).
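The claim that exhaustive testing of every SNP combination is impractical at genome-wide scale follows from simple combinatorics. The short calculation below counts the number of two-locus and three-locus models for a pre-filtered panel and for a genome-wide panel. The panel sizes of 1,000 and 500,000 SNPs are illustrative assumptions, not figures taken from the review.

```python
import math

def n_models(n_snps, order):
    """Number of distinct SNP combinations of a given interaction order."""
    return math.comb(n_snps, order)

# Assumed panel sizes: a filtered panel and a typical genome-wide array.
for n_snps in (1_000, 500_000):
    for order in (2, 3):
        print(f"{n_snps:>7} SNPs, order {order}: {n_models(n_snps, order):,} models")
```

The jump from pairs to triples at genome-wide scale is what motivates the filtering and non-exhaustive search strategies surveyed in the review.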
Affiliation(s)
- Clément Niel
- Computer Science Institute of Nantes-Atlantic (Lina), Centre National de la Recherche Scientifique UMR 6241, Ecole Polytechnique de l'Université de Nantes, Nantes, France
- Christine Sinoquet
- Computer Science Institute of Nantes-Atlantic (Lina), Centre National de la Recherche Scientifique UMR 6241, University of Nantes, Nantes, France
- Christian Dina
- Institut du Thorax, Institut National de la Santé et de la Recherche Médicale UMR 1087, Centre National de la Recherche Scientifique UMR 6291, University of Nantes, Nantes, France
- Ghislain Rocheleau
- European Genomic Institute for Diabetes FR3508, Centre National de la Recherche Scientifique UMR 8199, Lille 2 University, Lille, France
|
44
|
Kim Y, Park T. Robust Gene-Gene Interaction Analysis in Genome Wide Association Studies. PLoS One 2015; 10:e0135016. [PMID: 26267341 PMCID: PMC4534386 DOI: 10.1371/journal.pone.0135016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 10/13/2014] [Accepted: 07/17/2015] [Indexed: 11/19/2022]
Abstract
Genome-wide association studies (GWAS) have successfully discovered hundreds of associations between genetic variants and complex traits. Most GWAS have focused on the identification of single variants. It has been shown that most of the variants that were discovered by GWAS could only partially explain disease heritability. The explanation for this missing heritability is generally believed to be gene-gene (GG) or gene-environment (GE) interactions and other structural variants. Generalized multifactor dimensionality reduction (GMDR) has been proven to be reasonably powerful in detecting GG and GE interactions; however, its performance has been found to decline when outlying quantitative traits are present. This paper proposes a robust GMDR estimation method (based on the L-estimator and M-estimator estimation methods) in an attempt to reduce the effects caused by outlying traits. A comparison of robust GMDR with the original MDR based on simulation studies showed the former method to outperform the latter. The performance of robust GMDR is illustrated through a real GWA example consisting of 8,577 samples from the Korean population using the Homeostasis Model Assessment of Insulin Resistance (HOMA-IR) level as a phenotype. Robust GMDR identified the KCNH1 gene to have strong interaction effects with other genes on the function of insulin secretion.
Affiliation(s)
- Yongkang Kim
- Department of Statistics, Seoul National University, Seoul, South Korea
- Taesung Park
- Department of Statistics, Seoul National University, Seoul, South Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151–741, South Korea
|
45
|
Weiss TL, Zieselman A, Hill DP, Diamond SG, Shen L, Saykin AJ, Moore JH. The role of visualization and 3-D printing in biological data mining. BioData Min 2015; 8:22. [PMID: 26246856 PMCID: PMC4526295 DOI: 10.1186/s13040-015-0056-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/26/2015] [Accepted: 07/30/2015] [Indexed: 11/14/2022]
Abstract
Background Biological data mining is a powerful tool that can provide a wealth of information about patterns of genetic and genomic biomarkers of health and disease. A potential disadvantage of data mining is the volume and complexity of the results, which can often be overwhelming. It is our working hypothesis that visualization methods can greatly enhance our ability to make sense of data mining results. More specifically, we propose that 3-D printing has an important role to play as a visualization technology in biological data mining. We provide here a brief review of 3-D printing along with a case study to illustrate how it might be used in a research setting. Results We present as a case study a genetic interaction network associated with grey matter density, an endophenotype for late onset Alzheimer’s disease, as a physical model constructed with a 3-D printer. The synergy or interaction effects of multiple genetic variants were represented through a color gradient of the physical connections between nodes. The digital gene-gene interaction network was then 3-D printed to generate a physical network model. Conclusions The physical 3-D gene-gene interaction network provided an easily manipulated, intuitive and creative way to visualize the synergistic relationships between the genetic variants and grey matter density in patients with late onset Alzheimer’s disease. We discuss the advantages and disadvantages of this novel method of biological data mining visualization.
Affiliation(s)
- Talia L Weiss
- Department of Genetics, Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755 USA
- Amanda Zieselman
- Department of Genetics, Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755 USA
- Douglas P Hill
- Department of Genetics, Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755 USA
- Li Shen
- Center for Neuroimaging and Indiana Alzheimer's Disease Center, Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202 USA
- Andrew J Saykin
- Center for Neuroimaging and Indiana Alzheimer's Disease Center, Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202 USA
- Jason H Moore
- Department of Genetics, Institute for Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755 USA; Division of Informatics, Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104-6021 USA
|
46
|
A Comparative Study on Multifactor Dimensionality Reduction Methods for Detecting Gene-Gene Interactions with the Survival Phenotype. BIOMED RESEARCH INTERNATIONAL 2015; 2015:671859. [PMID: 26339630 PMCID: PMC4538337 DOI: 10.1155/2015/671859] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 11/21/2014] [Revised: 04/18/2015] [Accepted: 04/27/2015] [Indexed: 11/17/2022]
Abstract
Genome-wide association studies (GWAS) have extensively analyzed single SNP effects on a wide variety of common and complex diseases and found many genetic variants associated with diseases. However, a large portion of the genetic variation is still left unexplained. This missing heritability problem might be due to the analytical strategy that limits analyses to only single SNPs. One possible approach to the missing heritability problem is to identify multi-SNP effects or gene-gene interactions. The multifactor dimensionality reduction (MDR) method has been widely used to detect gene-gene interactions based on constructive induction, classifying high-dimensional genotype combinations into a one-dimensional variable with two attributes, high risk and low risk, for the case-control study. Many modifications of MDR have been proposed and also extended to the survival phenotype. In this study, we propose several extensions of MDR for the survival phenotype and compare the proposed extensions with earlier MDR through comprehensive simulation studies.
|
47
|
Yee J, Kwon MS, Jin S, Park T, Park M. Detecting Genetic Interactions for Quantitative Traits Using m-Spacing Entropy Measure. BioMed Research International 2015; 2015:523641. [PMID: 26339620 PMCID: PMC4538333 DOI: 10.1155/2015/523641] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 11/14/2014] [Revised: 02/04/2015] [Accepted: 03/08/2015] [Indexed: 11/17/2022]
Abstract
A number of statistical methods for detecting gene-gene interactions have been developed in genetic association studies with binary traits. However, many phenotype measures are intrinsically quantitative, and categorizing continuous traits may not always be straightforward or meaningful. Association of gene-gene interactions with an observed distribution of such phenotypes needs to be investigated directly without categorization. Information gain based on an entropy measure has previously been successful in identifying genetic associations with binary traits. We extend the usefulness of this information gain by proposing a nonparametric method for evaluating the conditional entropy of a quantitative phenotype associated with a given genotype. Hence, the information gain can be obtained for any phenotype distribution. Because no functional form, such as a Gaussian, is assumed for the distribution of a trait, either overall or for a given genotype, this method is expected to be robust enough to be applied to any phenotypic association data. Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait.
Affiliation(s)
- Jaeyong Yee
- Department of Physiology and Biophysics, Eulji University, Daejeon, Republic of Korea
| | - Min-Seok Kwon
- Department of Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Seohoon Jin
- Department of Informational Statistics, Korea University, Jochiwon, Republic of Korea
| | - Taesung Park
- Department of Statistics, Seoul National University, Seoul, Republic of Korea
| | - Mira Park
- Department of Preventive Medicine, Eulji University, Daejeon, Republic of Korea
|
48
|
Yu W, Kwon MS, Park T. Multivariate Quantitative Multifactor Dimensionality Reduction for Detecting Gene-Gene Interactions. Hum Hered 2015. [PMID: 26201702 DOI: 10.1159/000377723] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVES Determining gene-gene interactions and the missing heritability of complex diseases is a challenging topic in genome-wide association studies. The multifactor dimensionality reduction (MDR) method is one of the most commonly used methods for identifying gene-gene interactions with dichotomous phenotypes. For quantitative phenotypes, the generalized MDR and quantitative MDR (QMDR) methods have been proposed. These methods are known as univariate methods because they consider only one phenotype. To date, there are few methods for analyzing multiple phenotypes. METHODS To address this problem, we propose a multivariate QMDR method (Multi-QMDR) for multivariate correlated phenotypes. We summarize the multivariate phenotypes into a univariate score by dimensional reduction analysis, and then classify the samples accordingly into high-risk and low-risk groups. We use several ways of summarizing, mainly based on principal components. Multi-QMDR is model-free and easy to implement. RESULTS Multi-QMDR is applied to lipid-related traits. The properties of Multi-QMDR were investigated through simulation studies. Empirical studies show that Multi-QMDR outperforms existing univariate and multivariate methods at identifying causal interactions. CONCLUSIONS The Multi-QMDR approach improves the performance of QMDR when multiple quantitative phenotypes are available.
Affiliation(s)
- Wenbao Yu
- Department of Statistics, Seoul National University, Seoul, South Korea
|
49
|
Aliloo H, Pryce JE, González-Recio O, Cocks BG, Hayes BJ. Validation of markers with non-additive effects on milk yield and fertility in Holstein and Jersey cows. BMC Genet 2015; 16:89. [PMID: 26193888 PMCID: PMC4509610 DOI: 10.1186/s12863-015-0241-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 02/19/2015] [Accepted: 06/25/2015] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND It has been suggested that traits with low heritability, such as fertility, may have proportionately more genetic variation arising from non-additive effects than traits with higher heritability, such as milk yield. Here, we performed a large genome scan with 408,255 single nucleotide polymorphism (SNP) markers to identify chromosomal regions associated with additive, dominance and epistatic (pairwise additive × additive) variability in milk yield and a measure of fertility, calving interval, using records from a population of 7,055 Holstein cows. The results were subsequently validated in an independent set of 3,795 Jerseys. RESULTS We identified genomic regions with validated additive effects on milk yield on Bos taurus autosomes (BTA) 5, 14 and 20, whereas SNPs with suggestive additive effects on fertility were observed on BTA 5, 9, 11, 18, 22, 27, 29 and the X chromosome. We also confirmed genome regions with suggestive dominance effects for milk yield (BTA 2, 3, 5, 26 and 27) and for fertility (BTA 1, 2, 3, 7, 23, 25 and 28). A number of significant epistatic effects for milk yield on BTA 14 were found across breeds. However, on close inspection, these were likely to be associated with the mutation in the diacylglycerol O-acyltransferase 1 (DGAT1) gene, given that the associations were no longer significant when the additive effect of the DGAT1 mutation was included in the epistatic model. CONCLUSIONS In general, we observed low statistical power (high false discovery rates and a small number of significant SNPs) for non-additive genetic effects compared with additive effects for both traits, which could be an artefact of higher dependence on linkage disequilibrium between markers and causative mutations or of the smaller size of non-additive effects relative to additive effects. The results of our study suggest that individual non-additive effects make a small contribution to the genetic variation of milk yield and fertility. Although we found no individual mutation with a large dominance effect for either trait under investigation, a contribution to genetic variance is still possible from a large number of small dominance effects, so methods that simultaneously incorporate genotypes across all loci are suggested to test the variance explained by dominance gene actions.
Affiliation(s)
- Hassan Aliloo
- Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia; Dairy Futures Cooperative Research Centre (CRC), AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia.
| | - Jennie E Pryce
- Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia; Dairy Futures Cooperative Research Centre (CRC), AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia.
| | - Oscar González-Recio
- Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia; Dairy Futures Cooperative Research Centre (CRC), AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia.
| | - Benjamin G Cocks
- Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia; Dairy Futures Cooperative Research Centre (CRC), AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia.
| | - Ben J Hayes
- Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia; Dairy Futures Cooperative Research Centre (CRC), AgriBio, 5 Ring Road, Bundoora, VIC, 3083, Australia.
|
50
|
Gola D, Mahachie John JM, van Steen K, König IR. A roadmap to multifactor dimensionality reduction methods. Brief Bioinform 2015; 17:293-308. [PMID: 26108231 PMCID: PMC4793893 DOI: 10.1093/bib/bbv038] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 03/12/2015] [Indexed: 02/02/2023] Open
Abstract
Complex diseases are, by definition, determined by multiple genetic and environmental factors, acting alone as well as in interaction. To analyze interactions in genetic data, many statistical methods have been suggested, most of them relying on statistical regression models. Given the known limitations of classical methods, approaches from the machine-learning community have also become attractive. From this latter family, a fast-growing collection of methods has emerged that are based on the Multifactor Dimensionality Reduction (MDR) approach. Since its first introduction, MDR has enjoyed great popularity in applications and has been extended and modified multiple times. Based on a literature search, we provide here a systematic and comprehensive overview of these suggested methods. The methods are described in detail, and the availability of implementations is listed. The most recent approaches can deal with large-scale data sets and rare variants, which is why we expect these methods to gain further in popularity.
|