1
Ren F, Li S, Wen Z, Liu Y, Tang D. The Spherical Evolutionary Multi-Objective (SEMO) Algorithm for Identifying Disease Multi-Locus SNP Interactions. Genes (Basel) 2023; 15:11. [PMID: 38275593 PMCID: PMC10815643 DOI: 10.3390/genes15010011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 11/21/2023] [Accepted: 12/18/2023] [Indexed: 01/27/2024] Open
Abstract
Single-nucleotide polymorphisms (SNPs), as disease-related biogenetic markers, are crucial in elucidating complex disease susceptibility and pathogenesis. Due to computational inefficiency, it is difficult to identify high-dimensional SNP interactions efficiently using combinatorial search methods, so the spherical evolutionary multi-objective (SEMO) algorithm for detecting multi-locus SNP interactions was proposed. The algorithm uses a spherical search factor and a feedback mechanism of excellent individual history memory to enhance the balance between search and acquisition. Moreover, a multi-objective fitness function based on the decomposition idea was used to evaluate the associations by combining two functions, K2-Score and LR-Score, as an objective function for the algorithm's evolutionary iterations. The performance evaluation of SEMO was compared with six state-of-the-art algorithms on a simulated dataset. The results showed that SEMO outperforms the comparative methods by detecting SNP interactions quickly and accurately with a shorter average run time. The SEMO algorithm was applied to the Wellcome Trust Case Control Consortium (WTCCC) breast cancer dataset and detected two- and three-point SNP interactions that were significantly associated with breast cancer, confirming the effectiveness of the algorithm. New combinations of SNPs associated with breast cancer were also identified, which will provide a new way to detect SNP interactions quickly and accurately.
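The abstract names the K2-Score as one of the two objectives but does not give its formula. As a minimal sketch, assuming the logarithmic form of the Bayesian K2 score commonly used in epistasis detection with two classes (cases and controls), where lower values indicate a stronger association, it could be computed as follows. The function name `k2_score` and the input layout are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def k2_score(genotypes, status):
    """Log-form K2 score for one SNP combination (assumed formulation).

    genotypes: one hashable genotype combination per sample.
    status: one label per sample, 1 for case and 0 for control.
    Lower scores indicate a stronger genotype-phenotype association.
    """
    totals = Counter(genotypes)               # r_i: samples per genotype cell
    by_class = Counter(zip(genotypes, status))  # r_ij: samples per cell and class
    score = 0.0
    for genotype, r_i in totals.items():
        # log((r_i + 1)!) written with the log-gamma function
        score += math.lgamma(r_i + 2)
        for label in (0, 1):
            # subtract log(r_ij!) for each class
            score -= math.lgamma(by_class[(genotype, label)] + 1)
    return score


# Toy data: genotype perfectly separates cases from controls, or not at all.
separated = k2_score([0, 0, 1, 1], [1, 1, 0, 0])
mixed = k2_score([0, 0, 1, 1], [1, 0, 1, 0])
print(separated < mixed)
```

In this toy comparison the perfectly separating genotype receives the lower (better) score, which is the behaviour an evolutionary search would minimise.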
Affiliation(s)
- Fuxiang Ren
  - College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Shiyin Li
  - College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Zihao Wen
  - College of Mathematics and Informatics, College of Software Engineering, South China Agricultural University, Guangzhou 510642, China
  - Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Yidi Liu
  - College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Deyu Tang
  - College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
  - College of Mathematics and Informatics, College of Software Engineering, South China Agricultural University, Guangzhou 510642, China
2
Yang CH, Huang HC, Hou MF, Chuang LY, Lin YD. Fuzzy-Based Multiobjective Multifactor Dimensionality Reduction for Epistasis Analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:378-387. [PMID: 35061588 DOI: 10.1109/tcbb.2022.3144303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Epistasis detection is vital for understanding disease susceptibility in genetics. Multiobjective multifactor dimensionality reduction (MOMDR) was previously proposed to detect epistasis. MOMDR uses binary classification to distinguish the high-risk (H) and low-risk (L) groups and thereby reduce multifactor dimensionality. However, binary classification does not reflect the uncertainty of the H and L classification. In this study, we proposed an empirical fuzzy MOMDR (EFMOMDR) that addresses the limitations of binary classification by using the degree of membership obtained through an empirical fuzzy approach. EFMOMDR simultaneously considers two fuzzy-based measures, the correct classification rate and the likelihood rate, and does not require parameter tuning. Simulation studies revealed that EFMOMDR has 7.14% higher detection success rates than MOMDR, indicating that the empirical fuzzy approach successfully addresses the limitations of binary classification in MOMDR. Moreover, EFMOMDR was used to analyze coronary artery disease in the Wellcome Trust Case Control Consortium dataset.
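The binary H/L step that the abstract criticises can be sketched as follows. This is a minimal illustration of the classic MDR rule, assuming that a genotype cell is labelled high-risk when its case-to-control ratio is at least the overall case-to-control ratio; the function names and the toy data are hypothetical, and the fuzzy membership used by EFMOMDR is not reproduced here.

```python
from collections import Counter


def mdr_labels(genotypes, status):
    """Label each multi-locus genotype cell as 'H' (high-risk) or 'L' (low-risk).

    genotypes: one tuple of genotype codes per sample.
    status: one label per sample, 1 for case and 0 for control.
    Assumes the data contain at least one control.
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    threshold = sum(cases.values()) / sum(controls.values())
    labels = {}
    for cell in set(cases) | set(controls):
        # Compare cases against threshold * controls to avoid dividing by zero
        labels[cell] = "H" if cases[cell] >= threshold * controls[cell] else "L"
    return labels


def correct_classification_rate(genotypes, status, labels):
    """Fraction of samples whose H/L cell label agrees with case/control status."""
    hits = sum((labels[g] == "H") == (s == 1) for g, s in zip(genotypes, status))
    return hits / len(status)


# Toy two-locus data set with three cases and three controls.
toy_genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1)]
toy_status = [1, 1, 1, 0, 0, 0]
toy_labels = mdr_labels(toy_genotypes, toy_status)
print(toy_labels, correct_classification_rate(toy_genotypes, toy_status, toy_labels))
```

The cell `(0, 1)` holds one case and one control, so the hard threshold forces it into one group even though the evidence is balanced; that borderline situation is what a degree of membership is meant to express.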
3
Wang H, Han H, Niu Y, Li X, Du X, Wang Q. LPP polymorphisms are risk factors for allergic rhinitis in the Chinese Han population. Cytokine 2022; 159:156027. [PMID: 36084606 DOI: 10.1016/j.cyto.2022.156027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 05/17/2022] [Accepted: 08/26/2022] [Indexed: 11/03/2022]
Abstract
BACKGROUND Lipoma preferred partner (LPP) polymorphisms are related to immune diseases, but the role of LPP gene in the pathogenesis of allergic rhinitis (AR) is unclear. The current study aimed to explore the contribution of LPP variants to AR susceptibility in the Chinese Han population. METHODS A total of 992 healthy controls and 992 patients with AR were recruited. Agena MassARRAY system was applied for genotyping. Odds ratios (OR) and 95% confidence intervals (CI) adjusted by age, sex, and body mass index (BMI) were calculated to conduct the risk assessment of LPP variants in people with a predisposition to AR. Additionally, multifactor dimensionality reduction (MDR) was applied to identify high-order interaction models for AR risk. RESULTS We found that rs2030519-G (p = 0.027, OR: 1.15, 95% CI: 1.02-1.31), rs6780858-G (p = 0.019, OR: 1.16, 95% CI: 1.03-1.32), and rs60946162-T (p = 0.014, OR: 1.18, 95% CI: 1.03-1.34) were associated with increased susceptibility to AR. Subgroup analyses indicated the interaction of LPP polymorphisms in terms of age, gender, and BMI with AR susceptibility (p < 0.05, OR > 1). MDR analysis revealed that rs60946162 had the information gain (0.40%) of individual attribute regarding AR. CONCLUSION Our results first determined that rs2030519, rs6780858, and rs60946162 were correlated with increased susceptibility to AR in the Chinese Han population, which add to our understanding of the impact of LPP gene variants on AR development.
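For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the following is a minimal sketch of the unadjusted calculation from a 2x2 table using the Wald interval on the log scale. It is not the study's analysis: the reported values were adjusted for age, sex, and BMI, which requires a regression model, and the counts below are made-up illustrative numbers.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: cases carrying the risk allele      b: cases without it
    c: controls carrying the risk allele   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_value:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval whose lower bound exceeds 1, as in the three reported SNPs, indicates an association with increased risk at the chosen confidence level.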
Affiliation(s)
- Haiying Wang
  - Shenmu Hospital, The Affiliated Shenmu Hospital of Northwest University, Shenmu 719300, China
- Hui Han
  - Shenmu Hospital, The Affiliated Shenmu Hospital of Northwest University, Shenmu 719300, China
- Yongliang Niu
  - Shenmu Hospital, The Affiliated Shenmu Hospital of Northwest University, Shenmu 719300, China
- Xiaobo Li
  - Shenmu Hospital, The Affiliated Shenmu Hospital of Northwest University, Shenmu 719300, China
- Xintao Du
  - Shenmu Hospital, The Affiliated Shenmu Hospital of Northwest University, Shenmu 719300, China
- Qiang Wang
  - Shenmu Hospital, The Affiliated Shenmu Hospital of Northwest University, Shenmu 719300, China
4
Matta J, Dobrino D, Yeboah D, Howard S, EL-Manzalawy Y, Obafemi-Ajayi T. Connecting phenotype to genotype: PheWAS-inspired analysis of autism spectrum disorder. Front Hum Neurosci 2022; 16:960991. [PMID: 36310845 PMCID: PMC9605200 DOI: 10.3389/fnhum.2022.960991] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/03/2022] [Accepted: 09/14/2022] [Indexed: 04/13/2024] Open
Abstract
Autism Spectrum Disorder (ASD) is extremely heterogeneous clinically and genetically. There is a pressing need for a better understanding of the heterogeneity of ASD based on scientifically rigorous approaches centered on systematic evaluation of the clinical and research utility of both phenotype and genotype markers. This paper presents a holistic PheWAS-inspired method to identify meaningful associations between ASD phenotypes and genotypes. We generate two types of phenotype-phenotype (p-p) graphs: a direct graph that utilizes only phenotype data, and an indirect graph that incorporates genotype as well as phenotype data. We introduce a novel methodology for fusing the direct and indirect p-p networks in which the genotype data is incorporated into the phenotype data in varying degrees. The hypothesis is that the heterogeneity of ASD can be distinguished by clustering the p-p graph. The obtained graphs are clustered using network-oriented clustering techniques, and results are evaluated. The most promising clusterings are subsequently analyzed for biological and domain-based relevance. The clusters obtained delineated different aspects of ASD, including differentiating ASD-specific symptoms, cognitive, adaptive, language and communication functions, and behavioral problems. Some of the important genes associated with the clusters have previously known associations with ASD. We found that clusters based on integrated genetic and phenotype data were more effective at identifying relevant genes than clusters constructed from phenotype information alone. These genes included five with suggestive evidence of ASD association and one known to be a strong candidate.
Affiliation(s)
- John Matta
  - Department of Computer Science, Southern Illinois University Edwardsville, Edwardsville, IL, United States
- Daniel Dobrino
  - Department of Computer Science, Southern Illinois University Edwardsville, Edwardsville, IL, United States
- Dacosta Yeboah
  - Department of Computer Science, Missouri State University, Springfield, MO, United States
- Swade Howard
  - Department of Computer Science, Southern Illinois University Edwardsville, Edwardsville, IL, United States
- Yasser EL-Manzalawy
  - Department of Translational Data Science and Informatics, Geisinger, Danville, PA, United States
- Tayo Obafemi-Ajayi
  - Engineering Program, Missouri State University, Springfield, MO, United States
5
Lin YD, Lee YC, Chiang CP, Moi SH, Kan JY. MOAI: a multi-outcome interaction identification approach reveals an interaction between vaspin and carcinoembryonic antigen on colorectal cancer prognosis. Brief Bioinform 2021; 23:6398687. [PMID: 34661627 DOI: 10.1093/bib/bbab427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Revised: 09/14/2021] [Accepted: 09/18/2021] [Indexed: 11/12/2022] Open
Abstract
Identifying and characterizing the interaction between risk factors for multiple outcomes (multi-outcome interaction) has been one of the greatest challenges in the study of complex multifactorial diseases. However, existing approaches have several limitations in identifying multi-outcome interactions. To address this issue, we proposed a multi-outcome interaction identification approach called MOAI. MOAI was motivated by the limitations of estimating interactions that occur simultaneously across multiple outcomes and by the success of the Pareto set filter operator for identifying multi-outcome interaction. MOAI permits the identification of interactions across multiple outcomes and is applicable in population-based study designs. Our experimental results showed that existing approaches do not effectively identify the multi-outcome interaction, whereas MOAI exhibited clearly superior performance in identifying it. We applied MOAI to identify the interaction between risk factors for colorectal cancer (CRC) in both metastases and mortality prognostic outcomes. An interaction between vaspin and carcinoembryonic antigen (CEA) was found, and the interaction indicated that patients with CRC characterized by higher vaspin (≥30%) and CEA (≥5) levels had a simultaneously increased risk of both metastases and mortality. The immunostaining evidence revealed that the determined multi-outcome interaction could effectively identify the difference between non-metastases/survived and metastases/deceased patients, which offers multi-prognostic outcome risk estimation for CRC. To our knowledge, this is the first report of a multi-outcome interaction associated with a complex multifactorial disease. MOAI is freely available at https://sites.google.com/view/moaitool/home.
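The abstract refers to a Pareto set filter without describing it. As a generic illustration only, assuming each candidate interaction has one score per outcome and that larger scores are better, a Pareto filter keeps the candidates that no other candidate dominates. The function name and the scores below are hypothetical and do not reproduce MOAI's actual operator.

```python
def pareto_front(points):
    """Return the points that are not dominated by any other point.

    points: list of tuples, one score per outcome, larger is better.
    A point q dominates p when q is at least as good on every outcome
    and strictly better on at least one.
    """
    def dominates(q, p):
        return (all(a >= b for a, b in zip(q, p))
                and any(a > b for a, b in zip(q, p)))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (metastasis score, mortality score) pairs for five candidates.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
print(pareto_front(candidates))  # [(1, 5), (2, 4), (3, 3)]
```

The filter avoids collapsing the outcomes into a single weighted score, so a candidate that is strong on one outcome and moderate on the other is retained alongside more balanced ones.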
Affiliation(s)
- Yu-Da Lin
  - Department of Computer Science and Information Engineering, National Penghu University of Science and Technology, Magong, Penghu, 880011, Taiwan
- Yi-Chen Lee
  - Department of Anatomy at Kaohsiung Medical University, Taiwan
- Chih-Po Chiang
  - Division of Breast Oncology and Surgery, Department of Surgery, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80756, Taiwan
- Sin-Hua Moi
  - Center of Cancer Program Development, E-Da Cancer Hospital, I-Shou University, Kaohsiung 824, Taiwan
- Jung-Yu Kan
  - Division of Breast Oncology and Surgery, Department of Surgery, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80756, Taiwan
6
Yang CH, Moi SH, Chuang LY, Chen JB. Higher-order clinical risk factor interaction analysis for overall mortality in maintenance hemodialysis patients. Ther Adv Chronic Dis 2020; 11:2040622320949060. [PMID: 33062235 PMCID: PMC7534064 DOI: 10.1177/2040622320949060] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/14/2019] [Accepted: 07/20/2020] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND AND AIMS In Taiwan, approximately 90% of patients with end-stage renal disease receive maintenance hemodialysis. Although studies have reported the survival predictability of multiclinical factors, the higher-order interactions among these factors have rarely been discussed. Conventional statistical approaches such as regression analysis are inadequate for detecting higher-order interactions. Therefore, this study integrated receiver operating characteristic, logistic regression, and balancing functions for adjusting the ratio in risk classes and classification errors for imbalanced cases and controls using multifactor-dimensionality reduction (MDR-ER) analyses to examine the impact of interaction effects between multiclinical factors on overall mortality in patients on maintenance hemodialysis. MATERIALS AND METHODS In total, 781 patients who received outpatient hemodialysis three times per week before 1 January 2009 were included; their baseline clinical factor and mortality outcome data were retrospectively collected using an approved data protocol (201800595B0). RESULTS Consistent with conventional statistical approaches, the higher-order interaction model could indicate the impact of potential risk combination unique to patients on maintenance hemodialysis on the survival outcome, as described previously. Moreover, the MDR-based higher-order interaction model facilitated higher-order interaction effect detection among multiclinical factors and could determine more detailed mortality risk characteristics combinations. CONCLUSION Therefore, higher-order clinical risk interaction analysis is a reasonable strategy for detecting non-traditional risk factor interaction effects on survival outcome unique to patients on maintenance hemodialysis and thus clinically achieving whole-scale patient care.
Affiliation(s)
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung
  - Ph.D. Program in Biomedical Engineering, Kaohsiung Medical University, Kaohsiung
  - Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung
- Sin-Hua Moi
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung
- Li-Yeh Chuang
  - Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 84004
- Jin-Bor Chen
  - Division of Nephrology, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital and Chang Gung University College of Medicine, 123 DaPei Rd, Niao Song Dist, Kaohsiung 83301
7
Chuang LY, Yang CS, Yang HS, Yang CH. Identification of High-Order Single-Nucleotide Polymorphism Barcodes in Breast Cancer Using a Hybrid Taguchi-Genetic Algorithm: Case-Control Study. JMIR Med Inform 2020; 8:e16886. [PMID: 32554381 PMCID: PMC7351259 DOI: 10.2196/16886] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/04/2019] [Revised: 02/09/2020] [Accepted: 04/08/2020] [Indexed: 12/24/2022] Open
Abstract
Background Breast cancer has a major disease burden in the female population, and it is a highly genome-associated human disease. However, in genetic studies of complex diseases, modern geneticists face challenges in detecting interactions among loci. Objective This study aimed to investigate whether variations of single-nucleotide polymorphisms (SNPs) are associated with histopathological tumor characteristics in breast cancer patients. Methods A hybrid Taguchi-genetic algorithm (HTGA) was proposed to identify the high-order SNP barcodes in a breast cancer case-control study. A Taguchi method was used to enhance a genetic algorithm (GA) for identifying high-order SNP barcodes. The Taguchi method was integrated into the GA after the crossover operations in order to optimize the generated offspring systematically for enhancing the GA search ability. Results The proposed HTGA effectively converged to a promising region within the problem space and provided excellent SNP barcode identification. Regression analysis was used to validate the association between breast cancer and the identified high-order SNP barcodes. The maximum OR was less than 1 (range 0.870-0.755) for two- to seven-order SNP barcodes. Conclusions We systematically evaluated the interaction effects of 26 SNPs within growth factor–related genes for breast carcinogenesis pathways. The HTGA could successfully identify relevant high-order SNP barcodes by evaluating the differences between cases and controls. The validation results showed that the HTGA can provide better fitness values as compared with other methods for the identification of high-order SNP barcodes using breast cancer case-control data sets.
Affiliation(s)
- Cheng-San Yang
  - Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City, Taiwan
- Huai-Shuo Yang
  - Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung City, Taiwan
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung City, Taiwan
  - Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung, Taiwan
  - College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
8
Yang CH, Chuang LY, Lin YD. An improved fuzzy set-based multifactor dimensionality reduction for detecting epistasis. Artif Intell Med 2020; 102:101768. [PMID: 31980105 DOI: 10.1016/j.artmed.2019.101768] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/05/2018] [Revised: 10/18/2019] [Accepted: 11/19/2019] [Indexed: 01/07/2023]
Abstract
OBJECTIVE Epistasis identification is critical for determining susceptibility to human genetic diseases. The rapid development of technology has enabled scalability to make multifactor dimensionality reduction (MDR) measurements an effective calculation tool that achieves superior detection. However, the classification of high-risk (H) or low-risk (L) groups in MDR operations calls for extensive research. METHODS AND MATERIAL In this study, an improved fuzzy sigmoid (FS) method using the membership degree in MDR (FSMDR) was proposed for solving the limitations of binary classification. The FS method combined with MDR measurements yielded an improved ability to distinguish similar frequencies of potential multifactor genotypes. RESULTS We compared our results with other MDR-based methods, and FSMDR achieved superior detection rates on simulated data sets. The results indicated that the fuzzy classifications can provide insight into the uncertainty of H/L classification in MDR operation. CONCLUSION FSMDR successfully detected significant epistasis of coronary artery disease in the Wellcome Trust Case Control Consortium data set.
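To illustrate what a sigmoid-based membership degree adds over a hard H/L label, the sketch below maps the case and control counts of one genotype cell to a value between 0 and 1. The exact membership function used by FSMDR is not given in the abstract, so the form here, including the `threshold` and `steepness` parameters and the function name, is an assumption made purely for illustration.

```python
import math


def high_risk_membership(cases, controls, threshold=1.0, steepness=1.0):
    """Degree (0 to 1) to which a genotype cell belongs to the high-risk group.

    cases, controls: sample counts in the cell.
    threshold: case-to-control ratio separating H from L (assumed parameter).
    steepness: how sharply membership changes around the threshold (assumed).
    A hard MDR rule would return 1 when cases >= threshold * controls, else 0.
    """
    margin = cases - threshold * controls
    return 1.0 / (1.0 + math.exp(-steepness * margin))


# Cells with similar frequencies get memberships near 0.5 instead of a hard label.
print(high_risk_membership(5, 5))    # balanced cell
print(high_risk_membership(6, 5))    # slightly more cases
print(high_risk_membership(10, 2))   # clearly high-risk
```

Under this assumed form, a balanced cell sits exactly at 0.5 and nearly balanced cells stay close to it, which is one way to express the uncertainty of the H/L classification that the abstract mentions.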
Affiliation(s)
- Cheng-Hong Yang
  - Department of Electronic Engineering, National Kaohsiung University of Science and Technology, No. 415, Jiangong Rd., Sanmin Dist., Kaohsiung City, 80778, Taiwan
  - Ph.D. Program in Biomedical Engineering, Kaohsiung Medical University, No. 100, Shih-Chuan 1st Rd., Kaohsiung, 80708, Taiwan
- Li-Yeh Chuang
  - Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering, I-Shou University, No. 1, Sec. 1, Syuecheng Rd., Dashu District, Kaohsiung, 84001, Taiwan
- Yu-Da Lin
  - Department of Electronic Engineering, National Kaohsiung University of Science and Technology, No. 415, Jiangong Rd., Sanmin Dist., Kaohsiung City, 80778, Taiwan