1
Tang DY, Mao YJ, Zhao J, Yang J, Li SY, Ren FX, Zheng J. SEEI: spherical evolution with feedback mechanism for identifying epistatic interactions. BMC Genomics 2024; 25:462. [PMID: 38735952 DOI: 10.1186/s12864-024-10373-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Accepted: 05/03/2024] [Indexed: 05/14/2024]
Abstract
BACKGROUND Detecting epistatic interactions (EIs) involves the exploration of associations among single nucleotide polymorphisms (SNPs) and complex diseases, which is an important task in genome-wide association studies. The EI detection problem is dependent on epistasis models and corresponding optimization methods. Although various models and methods have been proposed to detect EIs, identifying EIs efficiently and accurately is still a challenge. RESULTS Here, we propose a linear mixed statistical epistasis model (LMSE) and a spherical evolution approach with a feedback mechanism (named SEEI). The LMSE model expands the existing single epistasis models such as LR-Score, K2-Score, Mutual information, and Gini index. The SEEI includes an adaptive spherical search strategy and population updating strategy, which ensures that the algorithm is not easily trapped in local optima. We analyzed the performances of 8 random disease models, 12 disease models with marginal effects, 30 disease models without marginal effects, and 10 high-order disease models. The 60 simulated disease models and a real breast cancer dataset were used to evaluate eight algorithms (SEEI, EACO, EpiACO, FDHEIW, MP-HS-DHSI, NHSA-DHSC, SNPHarvester, CSE). Three evaluation criteria (pow1, pow2, pow3), a T-test, and a Friedman test were used to compare the performances of these algorithms. The results show that the SEEI algorithm (order 1, average rank = 13.125) outperformed the other algorithms in detecting EIs. CONCLUSIONS Here, we propose an LMSE model and an evolutionary computing method (SEEI) to solve the optimization problem of the LMSE model. The proposed method performed better than the other seven algorithms tested in its ability to identify EIs in genome-wide association datasets. We identified new SNP-SNP combinations in the real breast cancer dataset and verified the results. Our findings provide new insights for the diagnosis and treatment of breast cancer.
AVAILABILITY AND IMPLEMENTATION https://github.com/scutdy/SSO/blob/master/SEEI.zip
Affiliation(s)
- De-Yu Tang
  - Department of Computer Science, School of Mathematics and Informatics, School of Software Engineering, South China Agricultural University, Guangzhou, 510642, PR China.
  - School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
- Yi-Jun Mao
  - Department of Computer Science, School of Mathematics and Informatics, School of Software Engineering, South China Agricultural University, Guangzhou, 510642, PR China.
- Jie Zhao
  - School of Management, Guangdong University of Technology, Guangzhou, 510006, PR China.
- Jin Yang
  - School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
- Shi-Yin Li
  - School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
- Fu-Xiang Ren
  - School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
- Junxi Zheng
  - School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
2
Yang CH, Huang HC, Hou MF, Chuang LY, Lin YD. Fuzzy-Based Multiobjective Multifactor Dimensionality Reduction for Epistasis Analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:378-387. [PMID: 35061588 DOI: 10.1109/tcbb.2022.3144303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Epistasis detection is vital for understanding disease susceptibility in genetics. Multiobjective multifactor dimensionality reduction (MOMDR) was previously proposed to detect epistasis. MOMDR was performed using binary classification to distinguish the high-risk (H) and low-risk (L) groups to reduce multifactor dimensionality. However, the binary classification does not reflect the uncertainty of the H and L classification. In this study, we proposed an empirical fuzzy MOMDR (EFMOMDR) to address the limitations of binary classification using the degree of membership through an empirical fuzzy approach. The EFMOMDR can simultaneously consider two incorporated fuzzy-based measures, including correct classification rate and likelihood rate, and does not require parameter tuning. Simulation studies revealed that EFMOMDR had 7.14% higher detection success rates than MOMDR, indicating that the limitations of the binary classification in MOMDR were successfully addressed by the empirical fuzzy approach. Moreover, EFMOMDR was used to analyze coronary artery disease in the Wellcome Trust Case Control Consortium dataset.
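The difference between a hard H/L label and a degree of membership can be illustrated with a short Python sketch. The abstract does not state the membership function that EFMOMDR uses, so `high_risk_membership` below is a hypothetical stand-in (the cell's share of cases normalized against its share of controls), and all counts are made up for illustration.

```python
def binary_label(cases, controls, n_cases, n_controls):
    """Hard MDR-style label: 'H' if the cell's case/control ratio
    exceeds the ratio in the whole sample, otherwise 'L'."""
    return "H" if cases * n_controls > controls * n_cases else "L"


def high_risk_membership(cases, controls, n_cases, n_controls):
    """Hypothetical membership degree in [0, 1] for the high-risk group.
    Illustrative assumption only, not the published EFMOMDR formula."""
    case_share = cases / n_cases
    control_share = controls / n_controls
    total = case_share + control_share
    # An empty cell carries no evidence either way
    return 0.5 if total == 0 else case_share / total


# Two genotype cells (made-up counts) that both receive the hard label 'H'
n_cases, n_controls = 100, 100
strong = (30, 10)    # clearly enriched in cases
marginal = (11, 10)  # barely above the sample ratio

for cases, controls in (strong, marginal):
    print(binary_label(cases, controls, n_cases, n_controls),
          round(high_risk_membership(cases, controls, n_cases, n_controls), 3))
```

Both cells are labelled H under the binary rule, yet their membership degrees differ (0.75 versus roughly 0.52), which is the kind of uncertainty a hard label discards.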
3
Chen JB, Yang HS, Moi SH, Chuang LY, Yang CH. Identification of mortality-risk-related missense variant for renal clear cell carcinoma using deep learning. Ther Adv Chronic Dis 2021; 12:2040622321992624. [PMID: 33643601 PMCID: PMC7890720 DOI: 10.1177/2040622321992624] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/20/2020] [Accepted: 01/13/2021] [Indexed: 11/24/2022]
Abstract
Introduction: Kidney renal clear cell carcinoma (KIRCC) is a highly heterogeneous and lethal cancer that can arise in patients with renal disease. DeepSurv combines a deep feed-forward neural network with a Cox proportional hazards function and could provide optimized survival results compared with conventional survival analysis. Methods: This study used an improved DeepSurv algorithm to identify the candidate genes to be targeted for treatment on the basis of the overall mortality status of KIRCC subjects. All the somatic mutation missense variants of KIRCC subjects were extracted from the TCGA-KIRC database. Results: The improved DeepSurv model (95.1%) achieved greater balanced accuracy compared with the DeepSurv model (75%), and identified 610 high-risk variants associated with overall mortality. The results of gene differential expression analysis also indicated nine KIRCC mortality-risk-related pathways, namely the tRNA charging pathway, the D-myo-inositol-5-phosphate metabolism pathway, the DNA double-strand break repair by nonhomologous end-joining pathway, the superpathway of inositol phosphate compounds, the 3-phosphoinositide degradation pathway, the production of nitric oxide and reactive oxygen species in macrophages pathway, the synaptic long-term depression pathway, the sperm motility pathway, and the role of JAK2 in hormone-like cytokine signaling pathway. The biological findings in this study indicate the KIRCC mortality-risk-related pathways were more likely to be associated with cancer cell growth, cancer cell differentiation, and immune response inhibition. Conclusion: The results proved that the improved DeepSurv model effectively classified mortality-related high-risk variants and identified the candidate genes. In the context of KIRCC overall mortality, the proposed model effectively recognized mortality-related high-risk variants for KIRCC.
Affiliation(s)
- Jin-Bor Chen
  - Division of Nephrology, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital and Chang Gung University College of Medicine, Kaohsiung
- Huai-Shuo Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung
- Sin-Hua Moi
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung
- Li-Yeh Chuang
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, 415 Jiangong Road, San-Min District, Kaohsiung, 82444
4
Yang CH, Moi SH, Chuang LY, Chen JB. Higher-order clinical risk factor interaction analysis for overall mortality in maintenance hemodialysis patients. Ther Adv Chronic Dis 2020; 11:2040622320949060. [PMID: 33062235 PMCID: PMC7534064 DOI: 10.1177/2040622320949060] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/14/2019] [Accepted: 07/20/2020] [Indexed: 01/21/2023]
Abstract
BACKGROUND AND AIMS In Taiwan, approximately 90% of patients with end-stage renal disease receive maintenance hemodialysis. Although studies have reported the survival predictability of multiclinical factors, the higher-order interactions among these factors have rarely been discussed. Conventional statistical approaches such as regression analysis are inadequate for detecting higher-order interactions. Therefore, this study integrated receiver operating characteristic, logistic regression, and balancing functions for adjusting the ratio in risk classes and classification errors for imbalanced cases and controls using multifactor-dimensionality reduction (MDR-ER) analyses to examine the impact of interaction effects between multiclinical factors on overall mortality in patients on maintenance hemodialysis. MATERIALS AND METHODS In total, 781 patients who received outpatient hemodialysis three times per week before 1 January 2009 were included; their baseline clinical factor and mortality outcome data were retrospectively collected using an approved data protocol (201800595B0). RESULTS Consistent with conventional statistical approaches, the higher-order interaction model could indicate the impact of potential risk combination unique to patients on maintenance hemodialysis on the survival outcome, as described previously. Moreover, the MDR-based higher-order interaction model facilitated higher-order interaction effect detection among multiclinical factors and could determine more detailed mortality risk characteristics combinations. CONCLUSION Therefore, higher-order clinical risk interaction analysis is a reasonable strategy for detecting non-traditional risk factor interaction effects on survival outcome unique to patients on maintenance hemodialysis and thus clinically achieving whole-scale patient care.
Affiliation(s)
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung
  - Ph.D. Program in Biomedical Engineering, Kaohsiung Medical University, Kaohsiung
  - Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung
- Sin-Hua Moi
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung
- Li-Yeh Chuang
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 84004
- Jin-Bor Chen
  - Division of Nephrology, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital and Chang Gung University College of Medicine, 123 DaPei Rd, Niao Song Dist, Kaohsiung 83301
5
Yang CH, Lin YD, Chuang LY. Class Balanced Multifactor Dimensionality Reduction to Detect Gene-Gene Interactions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:71-81. [PMID: 30040653 DOI: 10.1109/tcbb.2018.2858776] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/08/2023]
Abstract
Detecting gene-gene interactions in single-nucleotide polymorphism data is vital for understanding disease susceptibility. However, existing approaches may be limited by the sample size in case-control studies. Herein, we propose a balance approach for the multifactor dimensionality reduction (BMDR) method to increase the accuracy of estimates of the prediction error rate in small samples. BMDR explicitly selects the best model by evaluating the average of prediction error rates over k-fold cross-validation without cross-validation consistency selection. In this study, we used several epistatic models with and without marginal effects under different parameter settings (heritability and minor allele frequencies) to evaluate the performance of existing approaches. Using simulated data sets, BMDR successfully detected gene-gene interactions, particularly for data sets with small sample sizes. A large data set was obtained from the Wellcome Trust Case Control Consortium, and results indicated that BMDR could effectively detect significant gene-gene interactions.
6
Application of simulation-based CYP26 SNP-environment barcodes for evaluating the occurrence of oral malignant disorders by odds ratio-based binary particle swarm optimization: A case-control study in the Taiwanese population. PLoS One 2019; 14:e0220719. [PMID: 31465460 PMCID: PMC6715230 DOI: 10.1371/journal.pone.0220719] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/15/2019] [Accepted: 07/22/2019] [Indexed: 12/15/2022]
Abstract
Introduction Genetic polymorphisms and social factors (alcohol consumption, betel quid (BQ) usage, and cigarette consumption), either separately or jointly, play a crucial role in the occurrence of oral malignant disorders such as oral and pharyngeal cancers and oral potentially malignant disorders (OPMD). Material and methods Simultaneous analyses of multiple single nucleotide polymorphisms (SNPs) and environmental effects on oral malignant disorders are essential, albeit challenging. Thus, we conducted a case-control study (N = 576) to analyze the risk of occurrence of oral malignant disorders by using binary particle swarm optimization (BPSO) with an odds ratio (OR)-based method. Results We demonstrated that a combination of SNPs (CYP26B1 rs887844 and CYP26C1 rs12256889) and socio-demographic factors (age, ethnicity, and BQ chewing), referred to as the combined effects of SNP-environment, correlated with maximal risk diversity of occurrence observed between the oral malignant disorder group and the control group. The risks were more prominent in the oral and pharyngeal cancers group (OR = 10.30; 95% confidence interval (CI) = 4.58–23.15) than in the OPMD group (OR = 5.42; 95% CI = 1.94–15.12). Conclusions Simulation-based “SNP-environment barcodes” may be used to predict the risk of occurrence of oral malignant disorders. Applying simulation-based “SNP-environment barcodes” may provide insight into the importance of screening tests in preventing oral and pharyngeal cancers and OPMD.
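Odds ratios with 95% confidence intervals such as those reported in this abstract are commonly computed from a 2x2 exposure-by-outcome table. The following Python sketch uses the standard log-odds (Woolf) interval; the function name and the counts are illustrative assumptions and are not taken from the study, whose OR-based BPSO fitness is not detailed in the abstract.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: cases carrying the risk pattern      b: cases without it
    c: controls carrying the risk pattern   d: controls without it
    All four cells must be non-zero for this simple formula.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts for illustration only
or_value, lower, upper = odds_ratio_ci(a=20, b=10, c=10, d=20)
print(f"OR = {or_value:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An interval whose lower bound stays above 1 indicates that the pattern is associated with increased risk at the chosen confidence level.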
7
Yang CH, Chuang LY, Lin YD. Multiobjective multifactor dimensionality reduction to detect SNP-SNP interactions. Bioinformatics 2019; 34:2228-2236. [PMID: 29471406 DOI: 10.1093/bioinformatics/bty076] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/04/2017] [Accepted: 02/16/2018] [Indexed: 11/12/2022]
Abstract
Motivation Single-nucleotide polymorphism (SNP)-SNP interactions (SSIs) are popular markers for understanding disease susceptibility. Multifactor dimensionality reduction (MDR) can successfully detect considerable SSIs. Currently, MDR-based methods mainly adopt a single-objective function (a single measure based on contingency tables) to detect SSIs. However, generally, a single-measure function might not yield favorable results due to potential model preferences and disease complexities. Approach This study proposes a multiobjective MDR (MOMDR) method that is based on a contingency table of MDR as an objective function. MOMDR considers the incorporated measures, including correct classification and likelihood rates, to detect SSIs and adopts set theory to predict the most favorable SSIs with cross-validation consistency. MOMDR enables simultaneously using multiple measures to determine potential SSIs. Results Three simulation studies were conducted to compare the detection success rates of MOMDR and single-objective MDR (SOMDR), revealing that MOMDR had higher detection success rates than SOMDR. Furthermore, the Wellcome Trust Case Control Consortium dataset was analyzed by MOMDR to detect SSIs associated with coronary artery disease. Availability and implementation: MOMDR is freely available at https://goo.gl/M8dpDg. Supplementary information Supplementary data are available at Bioinformatics online.
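The contingency-table step that MDR-based measures build on can be sketched in Python as follows. This is a minimal illustration of the generic MDR idea (label each multi-locus genotype cell high-risk or low-risk, then score the resulting classification) and computes only the correct classification rate; the function name, the toy data, and the omission of the likelihood rate and cross-validation are simplifying assumptions, not the MOMDR implementation.

```python
from collections import Counter


def mdr_correct_classification_rate(genotypes, labels):
    """genotypes: one tuple of SNP genotypes per subject.
    labels: 1 for case, 0 for control (both classes must be present).
    Returns (set of high-risk genotype cells, correct classification rate)."""
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())

    # A cell is high-risk when its case/control ratio exceeds the
    # overall case/control ratio of the sample
    threshold = n_cases / n_controls
    high_risk = {g for g in set(cases) | set(controls)
                 if cases[g] > threshold * controls[g]}

    true_pos = sum(cases[g] for g in high_risk)
    true_neg = sum(n for g, n in controls.items() if g not in high_risk)
    return high_risk, (true_pos + true_neg) / (n_cases + n_controls)


# Toy two-SNP data set (made up): genotype pairs coded 0/1
genotypes = [(0, 0), (0, 0), (1, 1), (1, 1), (0, 1), (0, 1)]
labels = [1, 1, 0, 0, 1, 0]
cells, ccr = mdr_correct_classification_rate(genotypes, labels)
print(cells, round(ccr, 3))
```

A multiobjective variant would compute a second measure from the same table and compare candidate SNP combinations on both, rather than ranking on this single score.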
Affiliation(s)
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
  - Graduate Institute of Clinical Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Li-Yeh Chuang
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
- Yu-Da Lin
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
8
Yang CH, Kao YK, Chuang LY, Lin YD. Catfish Taguchi-Based Binary Differential Evolution Algorithm for Analyzing Single Nucleotide Polymorphism Interactions in Chronic Dialysis. IEEE Trans Nanobioscience 2018; 17:291-299. [PMID: 29994217 DOI: 10.1109/tnb.2018.2844342] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/08/2022]
Abstract
Single-nucleotide polymorphism (SNP)-SNP interactions are crucial for understanding the association between disease-related multifactorials for disease analysis. Existing statistical methods for determining such interactions are limited by the considerable computation required for evaluating all potential associations between disease-related multifactorials. Identifying SNP-SNP interactions is thus a major challenge in genetic association studies. This paper proposes a catfish Taguchi-based binary differential evolution (CT-BDE) algorithm for identifying SNP-SNP interactions. In the search space, the catfish effect prevents the premature convergence of the population, and the Taguchi method improves the search ability of the BDE algorithm. Hence, the proposed algorithm enables obtaining a favorable solution regarding the identification of high-order SNP-SNP interactions. Additionally, the proposed algorithm applies an effective fitness function derived from a multifactor dimensionality reduction (MDR) operation to evaluate the solutions from BDE-based algorithms. Simulated and real data sets were used to evaluate the ability of several BDE-based algorithms in identifying specific SNP-SNP interactions. We compared the fitness function derived from the MDR operation with that derived according to the difference between cases and controls, by using the different BDE-based algorithms. The results showed that the proposed CT-BDE algorithm applying the fitness function derived from the MDR operation exhibited a superior ability in identifying SNP-SNP interactions compared with the other BDE-based algorithms.
9
Yang CH, Lin YD, Chuang LY. Multiple-Criteria Decision Analysis-Based Multifactor Dimensionality Reduction for Detecting Gene-Gene Interactions. IEEE J Biomed Health Inform 2018; 23:416-426. [PMID: 29993963 DOI: 10.1109/jbhi.2018.2790951] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022]
Abstract
Gene-gene interactions (GGIs) are important markers for determining susceptibility to a disease. Multifactor dimensionality reduction (MDR) is a popular algorithm for detecting GGIs and primarily adopts the correct classification rate (CCR) to assess the quality of a GGI. However, CCR measurement alone may not successfully detect certain GGIs because of potential model preferences and disease complexities. In this study, multiple-criteria decision analysis (MCDA) based on MDR was named MCDA-MDR and proposed for detecting GGIs. MCDA facilitates MDR to simultaneously adopt multiple measures within the two-way contingency table of MDR to assess GGIs; the CCR and rule utility measure were employed. Cross-validation consistency was adopted to determine the most favorable GGIs among the Pareto sets. Simulation studies were conducted to compare the detection success rates of the MDR-only-based measure and MCDA-MDR, revealing that MCDA-MDR had superior detection success rates. The Wellcome Trust Case Control Consortium dataset was analyzed using MCDA-MDR to detect GGIs associated with coronary artery disease, and MCDA-MDR successfully detected numerous significant GGIs (p < 0.001). MCDA-MDR performance assessment revealed that the applied MCDA successfully enhanced the GGI detection success rate of the MDR-based method compared with MDR alone.
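The notion of a Pareto set over two measures can be made concrete with a small Python sketch. It keeps every candidate that no other candidate beats on both measures at once; the candidate names and scores are made up, and the function is a generic non-dominated filter under the assumption that both measures are to be maximized, not the MCDA-MDR code.

```python
def pareto_set(candidates):
    """candidates: dict mapping a candidate name to (measure_1, measure_2),
    where larger is better for both measures.
    Returns the names of all non-dominated candidates."""
    front = []
    for name, (m1, m2) in candidates.items():
        dominated = any(
            o1 >= m1 and o2 >= m2 and (o1 > m1 or o2 > m2)
            for other, (o1, o2) in candidates.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical gene-gene interaction candidates scored on two measures
scores = {
    "snp1-snp2": (0.70, 0.20),
    "snp1-snp3": (0.60, 0.30),
    "snp2-snp3": (0.50, 0.10),  # worse than snp1-snp2 on both measures
}
print(pareto_set(scores))
```

Because the front can contain several candidates, a further criterion is needed to pick one; the abstract reports using cross-validation consistency for that step.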
10
Yang CH, Lin YD, Chuang LY, Chen JB, Chang HW. Joint Analysis of SNP-SNP-Environment Interactions for Chronic Dialysis by an Improved Branch and Bound Algorithm. J Comput Biol 2017; 24:1212-1225. [PMID: 28876085 DOI: 10.1089/cmb.2017.0090] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/13/2022]
Abstract
In previous studies, both single-nucleotide polymorphism (SNP)-SNP or gene-gene (G × G) interactions and SNP-environmental factor (G × E) interactions were reported to partially account for "missing" heritability. However, (G × G) × E interactions were less commonly addressed. The purpose of this study was to develop a novel strategy to evaluate possible (G × G) × E interactions in D-loop-based chronic dialysis association. Using values from our previously published data set (704 controls and 193 cases) of 77 D-loop SNPs and 7 environmental factors (coronary heart disease, hypertension, diabetes mellitus, triglyceride, cholesterol, blood thiol, and TBARS levels), we compared the performances of G, G × G, G × E, and (G × G) × E. We found that the interactions of four individual SNPs previously associated with a significantly high risk of chronic dialysis [odds ratio (OR) = 1.56-4.93] with environmental factors (G × E) increased the risk of chronic dialysis (maximum OR = 35.43). We then used an improved branch and bound algorithm to identify combinations of two to four SNPs that were most highly associated with chronic dialysis (OR = 9.27-34.39). When the interactions of the two- and three-SNP combinations with environmental factors were evaluated, we found that the (G × G) × E effects increased the risk of chronic dialysis (maximum OR = 8.32-57.54 and OR = 12.52-57.81, respectively; adjusted OR = 8.67-81.81 and OR = 12.29-81.95, respectively). Taken together, the (G × G) × E interactions identified chronic dialysis-associated SNPs that would not have been found using G × G or G × E interactions, suggesting that (G × G) × E interactions may be helpful to solve the problems of missing heritability in association studies.
Affiliation(s)
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
  - Graduate Institute of Clinical Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Yu-Da Lin
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
- Li-Yeh Chuang
  - Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
- Jin-Bor Chen
  - Division of Nephrology, Department of Internal Medicine, Mitochondrial Research Unit, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung, Taiwan
- Hsueh-Wei Chang
  - Institute of Medical Science and Technology, National Sun Yat-Sen University, Kaohsiung, Taiwan
  - Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan
  - Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, Taiwan
11
Yang CH, Chuang LY, Lin YD. CMDR based differential evolution identifies the epistatic interaction in genome-wide association studies. Bioinformatics 2017; 33:2354-2362. [DOI: 10.1093/bioinformatics/btx163] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 10/10/2016] [Accepted: 03/21/2017] [Indexed: 12/31/2022]
Affiliation(s)
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
  - Graduate Institute of Clinical Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Li-Yeh Chuang
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
- Yu-Da Lin
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
12
Yang CH, Weng ZJ, Chuang LY, Yang CS. Identification of SNP-SNP interaction for chronic dialysis patients. Comput Biol Med 2017; 83:94-101. [DOI: 10.1016/j.compbiomed.2017.02.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/10/2016] [Revised: 02/14/2017] [Accepted: 02/15/2017] [Indexed: 01/10/2023]
13
Interaction of MRE11 and Clinicopathologic Characteristics in Recurrence of Breast Cancer: Individual and Cumulated Receiver Operating Characteristic Analyses. Biomed Research International 2017; 2017:2563910. [PMID: 28133604 PMCID: PMC5241446 DOI: 10.1155/2017/2563910] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/12/2016] [Accepted: 11/28/2016] [Indexed: 12/28/2022]
Abstract
The interaction between the meiotic recombination 11 homolog A (MRE11) oncoprotein and breast cancer recurrence status remains unclear. The aim of this study was to assess the interaction between MRE11 and clinicopathologic variables in breast cancer. A dataset for 254 subjects with breast cancer (220 nonrecurrent and 34 recurrent) was used in individual and cumulated receiver operating characteristic (ROC) analyses of MRE11 and 12 clinicopathologic variables for predicting breast cancer recurrence. In individual ROC analysis, the area under curve (AUC) for each predictor of breast cancer recurrence was smaller than 0.7. In cumulated ROC analysis, however, the AUC value for each predictor improved. Ten relevant variables in breast cancer recurrence were used to find the optimal prognostic indicators. The presence of any six of the following ten variables had a high (79%) sensitivity and a high (70%) specificity for predicting breast cancer recurrence: tumor size ≥ 2.4 cm, tumor stage II/III, therapy other than hormone therapy, age ≥ 52 years, MRE11 positive cells > 50%, body mass index ≥ 24, lymph node metastasis, positivity for progesterone receptor, positivity for epidermal growth factor receptor, and negativity for estrogen receptor. In conclusion, this study revealed that these 10 clinicopathologic variables are the minimum discriminators needed for optimal discriminant effectiveness in predicting breast cancer recurrence.
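A rule of the form "any six of ten variables present" can be evaluated as a simple count threshold. The Python sketch below computes sensitivity and specificity for such a rule; the function name, the default threshold argument, and the four toy patients are assumptions for illustration and do not reproduce the study data or its cumulated ROC procedure.

```python
def count_rule_performance(indicator_rows, recurred, min_positive=6):
    """indicator_rows: per patient, a list of 0/1 flags (1 = risk variable present).
    recurred: per patient, 1 if the cancer recurred, else 0.
    A patient is predicted to recur when at least `min_positive` flags are set.
    Returns (sensitivity, specificity)."""
    tp = fp = tn = fn = 0
    for row, outcome in zip(indicator_rows, recurred):
        predicted = sum(row) >= min_positive
        if predicted and outcome:
            tp += 1
        elif predicted and not outcome:
            fp += 1
        elif not predicted and outcome:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Four made-up patients with ten binary indicators each
rows = [
    [1] * 7 + [0] * 3,  # 7 present, recurred: true positive
    [1] * 5 + [0] * 5,  # 5 present, recurred: false negative
    [1] * 2 + [0] * 8,  # 2 present, no recurrence: true negative
    [1] * 6 + [0] * 4,  # 6 present, no recurrence: false positive
]
print(count_rule_performance(rows, [1, 1, 0, 0]))
```

Sweeping `min_positive` from 1 to 10 and recording each (sensitivity, specificity) pair would trace out an ROC curve for the count score.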
14
Chuang LY, Moi SH, Lin YD, Yang CH. A comparative analysis of chaotic particle swarm optimizations for detecting single nucleotide polymorphism barcodes. Artif Intell Med 2016; 73:23-33. [DOI: 10.1016/j.artmed.2016.09.002] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 04/14/2016] [Accepted: 09/29/2016] [Indexed: 01/24/2023]
15
Fu OY, Chang HW, Lin YD, Chuang LY, Hou MF, Yang CH. Breast cancer-associated high-order SNP-SNP interaction of CXCL12/CXCR4-related genes by an improved multifactor dimensionality reduction (MDR-ER). Oncol Rep 2016; 36:1739-47. [PMID: 27461876 DOI: 10.3892/or.2016.4956] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 02/12/2016] [Accepted: 03/03/2016] [Indexed: 11/06/2022]
Abstract
In association studies, the combined effects of single nucleotide polymorphism (SNP)-SNP interactions and the problem of imbalanced data between cases and controls are frequently ignored. In the present study, we used an improved multifactor dimensionality reduction (MDR) approach namely MDR-ER to detect the high order SNP‑SNP interaction in an imbalanced breast cancer data set containing seven SNPs of chemokine CXCL12/CXCR4 pathway genes. Most individual SNPs were not significantly associated with breast cancer. After MDR‑ER analysis, six significant SNP‑SNP interaction models with seven genes (highest cross‑validation consistency, 10; classification error rates, 41.3‑21.0; and prediction error rates, 47.4‑55.3) were identified. CD4 and VEGFA genes were associated in a 2‑loci interaction model (classification error rate, 41.3; prediction error rate, 47.5; odds ratio (OR), 2.069; 95% bootstrap CI, 1.40‑2.90; P=1.71E‑04) and it also appeared in all the best 2‑7‑loci models. When the loci number increased, the classification error rates and P‑values decreased. The powers in 2‑7‑loci in all models were >0.9. The minimum classification error rate of the MDR‑ER‑generated model was shown with the 7‑loci interaction model (classification error rate, 21.0; OR=15.282; 95% bootstrap CI, 9.54‑23.87; P=4.03E‑31). In the epistasis network analysis, the overall effect with breast cancer susceptibility was identified and the SNP order of impact on breast cancer was identified as follows: CD4 = VEGFA > KITLG > CXCL12 > CCR7 = MMP2 > CXCR4. In conclusion, the MDR‑ER can effectively and correctly identify the best SNP‑SNP interaction models in an imbalanced data set for breast cancer cases.
Affiliation(s)
- Ou-Yang Fu
  - Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan, R.O.C.
- Hsueh-Wei Chang
  - Cancer Center, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan, R.O.C.
- Yu-Da Lin
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80778, Taiwan, R.O.C.
- Li-Yeh Chuang
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 84001, Taiwan, R.O.C.
- Ming-Feng Hou
  - Cancer Center, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan, R.O.C.
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80778, Taiwan, R.O.C.
|
16
|
Yang CH, Lin YD, Yen CY, Chuang LY, Chang HW. A systematic gene-gene and gene-environment interaction analysis of DNA repair genes XRCC1, XRCC2, XRCC3, XRCC4, and oral cancer risk. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2016; 19:238-47. [PMID: 25831063 DOI: 10.1089/omi.2014.0121] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Oral cancer is the sixth most common cancer worldwide with a high mortality rate. Biomarkers that anticipate susceptibility, prognosis, or response to treatments are much needed. Oral cancer is a polygenic disease involving complex interactions among genetic and environmental factors, which require multifaceted analyses. Here, we examined in a dataset of 103 oral cancer cases and 98 controls from Taiwan the association between oral cancer risk and the DNA repair genes X-ray repair cross-complementing group (XRCCs) 1-4, and the environmental factors of smoking, alcohol drinking, and betel quid (BQ) chewing. We employed logistic regression, multifactor dimensionality reduction (MDR), and hierarchical interaction graphs for analyzing gene-gene (G×G) and gene-environment (G×E) interactions. We identified a significantly elevated risk of the XRCC2 rs2040639 heterozygous variant among smokers [adjusted odds ratio (OR) 3.7, 95% confidence interval (CI)=1.1-12.1] and alcohol drinkers [adjusted OR=5.7, 95% CI=1.4-23.2]. The best two-factor based G×G interaction of oral cancer included the XRCC1 rs1799782 and XRCC2 rs2040639 [OR=3.13, 95% CI=1.66-6.13]. For the G×E interaction, the estimated OR of oral cancer for two (drinking-BQ chewing), three (XRCC1-XRCC2-BQ chewing), four (XRCC1-XRCC2-age-BQ chewing), and five factors (XRCC1-XRCC2-age-drinking-BQ chewing) were 32.9 [95% CI=14.1-76.9], 31.0 [95% CI=14.0-64.7], 49.8 [95% CI=21.0-117.7] and 82.9 [95% CI=31.0-221.5], respectively. Taken together, the genotypes of XRCC1 rs1799782 and XRCC2 rs2040639 DNA repair genes appear to be significantly associated with oral cancer. These were enhanced by exposure to certain environmental factors. The observations presented here warrant further research in larger study samples to examine their relevance for routine clinical care in oncology.
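The odds ratios and 95% confidence intervals quoted above can be illustrated with a minimal sketch of the unadjusted calculation from a 2x2 table (Wald interval on the log scale). The study reports adjusted ORs from logistic regression, so this is only an approximation of the idea; the function name `odds_ratio_ci` and the counts are hypothetical and not taken from the paper.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls (all must be > 0).
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(30, 10, 20, 40)
print(f"OR={or_value:.2f}, 95% CI={ci_low:.2f}-{ci_high:.2f}")
```

Small cell counts widen the interval quickly, which is one reason the wide intervals in small case-control samples call for replication in larger cohorts.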
Affiliation(s)
- Cheng-Hong Yang: Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
|
17
|
The Combinational Polymorphisms of ORAI1 Gene Are Associated with Preventive Models of Breast Cancer in the Taiwanese. BIOMED RESEARCH INTERNATIONAL 2015; 2015:281263. [PMID: 26380267 PMCID: PMC4561876 DOI: 10.1155/2015/281263] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/04/2014] [Accepted: 01/21/2015] [Indexed: 11/26/2022]
Abstract
The ORAI calcium release-activated calcium modulator 1 (ORAI1) has been proven to be an important gene for breast cancer progression and metastasis. However, the protective association model among the single nucleotide polymorphisms (SNPs) of the ORAI1 gene has not been investigated. Based on a published data set of 345 female breast cancer patients and 290 female controls, we used a particle swarm optimization (PSO) algorithm to identify the possible protective models of breast cancer association in terms of the SNPs of the ORAI1 gene. Results showed that the PSO-generated models of 2-SNP (rs12320939-TT/rs12313273-CC), 3-SNP (rs12320939-TT/rs12313273-CC/rs712853-(TT/TC)), 4-SNP (rs12320939-TT/rs12313273-CC/rs7135617-(GG/GT)/rs712853-(TT/TC)), and 5-SNP (rs12320939-TT/rs12313273-CC/rs7135617-(GG/GT)/rs6486795-CC/rs712853-(TT/TC)) displayed low odds ratios (0.409–0.425) for breast cancer association. Taken together, these results suggest that our proposed PSO strategy can effectively identify the combinational SNPs rs12320939, rs12313273, rs7135617, rs6486795, and rs712853 of the ORAI1 gene with a strongly protective association in breast cancer.
|
18
|
Yang CH, Lin YD, Yang CS, Chuang LY. An efficiency analysis of high-order combinations of gene-gene interactions using multifactor-dimensionality reduction. BMC Genomics 2015; 16:489. [PMID: 26126977 PMCID: PMC4487567 DOI: 10.1186/s12864-015-1717-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 02/21/2015] [Accepted: 06/24/2015] [Indexed: 12/21/2022] Open
Abstract
Background Multifactor dimensionality reduction (MDR) is widely used to analyze interactions of genes to determine the complex relationship between diseases and polymorphisms in humans. However, the astronomical number of high-order combinations makes MDR a highly time-consuming process, which makes it difficult to run the multiple tests needed to identify more complex interactions between genes. This study proposes a new framework, named fast MDR (FMDR), which is a greedy search strategy based on the joint effect property. Results Six models with different minor allele frequencies (MAFs) and different sample sizes were used to generate the six simulation data sets. A real data set was obtained from the mitochondrial D-loop of chronic dialysis patients. Comparison of results from the simulation data and real data sets showed that FMDR identified significant gene–gene interactions with less computational complexity than MDR in high-order interaction analysis. Conclusion FMDR reduces the computational load that MDR incurs for high-order SNP combinations and can be used to evaluate the relative effects of each individual SNP on disease susceptibility. FMDR is freely available at http://bioinfo.kmu.edu.tw/FMDR.rar.
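To make the cost argument concrete, here is a minimal sketch of the core step of standard MDR, followed by an exhaustive search over all SNP combinations of a given order. It is not the FMDR greedy strategy, whose details the abstract does not give. Each multilocus genotype cell is labelled high-risk when its case:control ratio exceeds the overall ratio, and the combination with the lowest classification error wins. The function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def mdr_error(genotypes, labels, snps):
    """Classification error of the MDR high/low-risk rule for one SNP combination."""
    cases, controls = Counter(), Counter()
    for row, y in zip(genotypes, labels):
        cell = tuple(row[i] for i in snps)  # multilocus genotype of this sample
        (cases if y == 1 else controls)[cell] += 1
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    threshold = n_cases / n_controls  # overall case:control ratio
    errors = 0
    for cell in set(cases) | set(controls):
        high_risk = cases[cell] > threshold * controls[cell]
        # Controls in a high-risk cell and cases in a low-risk cell are misclassified.
        errors += controls[cell] if high_risk else cases[cell]
    return errors / len(labels)

def best_combination(genotypes, labels, order):
    """Exhaustive search: evaluates C(n_snps, order) combinations."""
    n_snps = len(genotypes[0])
    return min(combinations(range(n_snps), order),
               key=lambda snps: mdr_error(genotypes, labels, snps))

# Toy data (hypothetical): disease status is the XOR of SNP 0 and SNP 1; SNP 2 is noise.
genotypes = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1),
             (0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
labels = [0, 1, 1, 0, 0, 1, 1, 0]
print(best_combination(genotypes, labels, order=2))
```

Because `best_combination` scores every subset of the requested order, its running time grows with the binomial coefficient of the SNP count, which is the bottleneck a greedy search is meant to avoid.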
Affiliation(s)
- Cheng-Hong Yang: Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan.
- Yu-Da Lin: Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan.
- Cheng-San Yang: Department of Plastic Surgery, Chia-Yi Christian Hospital, Chiayi, Taiwan.
- Li-Yeh Chuang: Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan.
|
19
|
High order gene-gene interactions in eight single nucleotide polymorphisms of renin-angiotensin system genes for hypertension association study. BIOMED RESEARCH INTERNATIONAL 2015; 2015:454091. [PMID: 25961019 PMCID: PMC4417588 DOI: 10.1155/2015/454091] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 01/23/2015] [Accepted: 03/20/2015] [Indexed: 11/18/2022]
Abstract
Several single nucleotide polymorphisms (SNPs) of renin-angiotensin system (RAS) genes are associated with hypertension (HT), but most studies focus on single-locus effects. Here, we introduce an unbalanced function based on multifactor dimensionality reduction (MDR) for multilocus genotypes to detect high-order gene-gene (SNP-SNP) interactions in unbalanced cases and controls of HT data. Eight SNPs of three RAS genes (angiotensinogen, AGT; angiotensin-converting enzyme, ACE; angiotensin II type 1 receptor, AT1R) that showed no significant genotype differences between HT and non-HT subjects were included. In 2- to 6-locus models of the SNP-SNP interaction, the SNPs of the AGT and ACE genes were associated with hypertension (bootstrapping odds ratio [Boot-OR] = 1.972~3.785; 95% confidence interval (CI), 1.26~6.21; P < 0.005). In the 7- and 8-locus models, SNP A1166C of the AT1R gene joined, raising the maximum Boot-OR values to 4.050 to 4.483 (CI = 2.49 to 7.29; P < 1.63E−08). In conclusion, the epistasis networks are identified by eight SNP-SNP interaction models. The AGT, ACE, and AT1R genes have overall effects on susceptibility to hypertension, where the SNPs of ACE have a mainly hypertension-associated effect and show an interacting effect with the SNPs of the AGT and AT1R genes.
|
20
|
Li CF, Luo FT, Zeng YX, Jia WH. Weighted risk score-based multifactor dimensionality reduction to detect gene-gene interactions in nasopharyngeal carcinoma. Int J Mol Sci 2014; 15:10724-37. [PMID: 24933637 PMCID: PMC4100176 DOI: 10.3390/ijms150610724] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 03/04/2014] [Revised: 04/21/2014] [Accepted: 06/03/2014] [Indexed: 12/02/2022] Open
Abstract
Determining the complex relationships between diseases, polymorphisms in human genes and environmental factors is challenging. Multifactor dimensionality reduction (MDR) has been proven capable of effectively detecting the statistical patterns of epistasis, although the approach depends on classification accuracy. An imbalanced dataset can seriously degrade classification accuracy. Moreover, MDR methods cannot quantitatively assess the disease risk of genotype combinations. Hence, we introduce a novel weighted risk score-based multifactor dimensionality reduction (WRSMDR) method that uses the Bayesian posterior probability of polymorphism combinations as a new quantitative measure of disease risk. First, we compared WRSMDR with the MDR method in simulated datasets. Our results showed that the WRSMDR method had reasonable power to identify high-order gene-gene interactions, and it was more effective than MDR at detecting four-locus models. Moreover, WRSMDR reveals more information regarding the effect of genotype combinations on disease risk, and the result was easier to determine and apply than with MDR. Finally, we applied WRSMDR to a nasopharyngeal carcinoma (NPC) case-control study and identified a statistically significant high-order interaction among three polymorphisms: rs2860580, rs11865086 and rs2305806.
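As a rough illustration of using a posterior probability as a quantitative risk measure, the sketch below applies Bayes' rule to one genotype combination. The abstract does not specify how WRSMDR weights or estimates these quantities, so this is a generic calculation rather than the authors' formula; the function name `posterior_risk`, the assumed disease prevalence, and the counts are all hypothetical.

```python
def posterior_risk(cases_with_g, n_cases, controls_with_g, n_controls, prevalence):
    """P(disease | genotype combination) by Bayes' rule.

    The likelihoods P(G | disease) and P(G | healthy) are estimated from
    case-control frequencies; `prevalence` is an assumed prior P(disease).
    """
    p_g_given_disease = cases_with_g / n_cases
    p_g_given_healthy = controls_with_g / n_controls
    numerator = p_g_given_disease * prevalence
    denominator = numerator + p_g_given_healthy * (1 - prevalence)
    # The genotype was never observed in either group: the posterior is undefined.
    return numerator / denominator if denominator > 0 else float("nan")

# Hypothetical counts: the combination appears in 20 of 100 cases and 5 of 100 controls.
risk_balanced = posterior_risk(20, 100, 5, 100, prevalence=0.5)
risk_rare = posterior_risk(20, 100, 5, 100, prevalence=0.01)
print(risk_balanced, risk_rare)
```

Unlike a binary high-risk/low-risk label, the posterior gives a graded score, and it makes the dependence on the assumed prior explicit: the same genotype frequencies yield very different risks for common and rare diseases.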
Affiliation(s)
- Chao-Feng Li: Department of Medical Statistics and Epidemiology, School of Public Health, Sun Yat-sen University, Guangzhou 510080, China.
- Fu-Tian Luo: Department of Medical Statistics and Epidemiology, School of Public Health, Sun Yat-sen University, Guangzhou 510080, China.
- Yi-Xin Zeng: State Key Laboratory of Oncology in South China, Sun Yat-sen University Cancer Center, Guangzhou 510060, China.
- Wei-Hua Jia: State Key Laboratory of Oncology in South China, Sun Yat-sen University Cancer Center, Guangzhou 510060, China.
|
21
|
Double-bottom chaotic map particle swarm optimization based on chi-square test to determine gene-gene interactions. BIOMED RESEARCH INTERNATIONAL 2014; 2014:172049. [PMID: 24895547 PMCID: PMC4033510 DOI: 10.1155/2014/172049] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 10/20/2013] [Accepted: 04/16/2014] [Indexed: 11/19/2022]
Abstract
Gene-gene interaction studies focus on the investigation of the association between the single nucleotide polymorphisms (SNPs) of genes for disease susceptibility. Statistical methods are widely used to search for a good model of gene-gene interaction for disease analysis, and the previously determined models have successfully explained the effects between SNPs and diseases. However, the huge numbers of potential combinations of SNP genotypes limit the use of statistical methods for analysing high-order interaction, and finding an available high-order model of gene-gene interaction remains a challenge. In this study, an improved particle swarm optimization with double-bottom chaotic maps (DBM-PSO) was applied to assist statistical methods in the analysis of associated variations to disease susceptibility. A big data set was simulated using the published genotype frequencies of 26 SNPs amongst eight genes for breast cancer. Results showed that the proposed DBM-PSO successfully determined two- to six-order models of gene-gene interaction for the risk association with breast cancer (odds ratio > 1.0; P value <0.05). Analysis results supported that the proposed DBM-PSO can identify good models and provide higher chi-square values than conventional PSO. This study indicates that DBM-PSO is a robust and precise algorithm for determination of gene-gene interaction models for breast cancer.
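The chi-square value that the search maximizes can be sketched for the simplest case, a 2x2 table that splits samples by whether they carry a candidate genotype combination and by case/control status. How the paper constructs its contingency table is not stated in the abstract, so the 2x2 layout, the function name `chi_square_2x2`, and the counts below are assumptions for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction).

    a: cases with the genotype combination,    b: controls with it,
    c: cases without the combination,          d: controls without it.
    All row and column totals must be non-zero.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts for illustration only.
chi2 = chi_square_2x2(30, 10, 20, 40)
print(round(chi2, 3))
```

A swarm-based search would call a function like this as its fitness measure for each candidate SNP combination, so a higher returned value marks a stronger association.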
|
22
|
Chuang LY, Lane HY, Lin YD, Lin MT, Yang CH, Chang HW. Identification of SNP barcode biomarkers for genes associated with facial emotion perception using particle swarm optimization algorithm. Ann Gen Psychiatry 2014; 13:15. [PMID: 24955105 PMCID: PMC4050220 DOI: 10.1186/1744-859x-13-15] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 02/01/2014] [Accepted: 04/23/2014] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND Facial emotion perception (FEP) can affect social function. We previously reported that parts of five tested single-nucleotide polymorphisms (SNPs) in the MET and AKT1 genes may individually affect FEP performance. However, the effects of SNP-SNP interactions on FEP performance remain unclear. METHODS This study compared patients with high and low FEP performances (n = 89 and 93, respectively). A particle swarm optimization (PSO) algorithm was used to identify the best SNP barcodes (i.e., the SNP combinations and genotypes that revealed the largest differences between the high and low FEP groups). RESULTS The analyses of individual SNPs showed no significant differences between the high and low FEP groups. However, comparisons of multiple SNP-SNP interactions involving different combinations of two to five SNPs showed that the best PSO-generated SNP barcodes were significantly associated with high FEP score. The analyses of the joint effects of the best SNP barcodes for two to five interacting SNPs also showed that the best SNP barcodes had significantly higher odds ratios (2.119 to 3.138; P < 0.05) compared to other SNP barcodes. In conclusion, the proposed PSO algorithm effectively identifies the best SNP barcodes that have the strongest associations with FEP performance. CONCLUSIONS This study also proposes a computational methodology for analyzing complex SNP-SNP interactions in social cognition domains such as recognition of facial emotion.
Affiliation(s)
- Li-Yeh Chuang: Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 84001, Taiwan
- Hsien-Yuan Lane: Institute of Clinical Medical Science, China Medical University, Taichung 40402, Taiwan; Department of Psychiatry, China Medical University Hospital, Taichung 40402, Taiwan
- Yu-Da Lin: Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80778, Taiwan
- Ming-Teng Lin: Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 84001, Taiwan; Department of Psychiatry, Taipei Veterans General Hospital, Hsinchu Branch, Hsinchu 31064, Taiwan
- Cheng-Hong Yang: Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80778, Taiwan
- Hsueh-Wei Chang: Cancer Center, Translational Research Center, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan; Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung 80424, Taiwan; Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
|