1
|
Tang DY, Mao YJ, Zhao J, Yang J, Li SY, Ren FX, Zheng J. SEEI: spherical evolution with feedback mechanism for identifying epistatic interactions. BMC Genomics 2024; 25:462. [PMID: 38735952 DOI: 10.1186/s12864-024-10373-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Accepted: 05/03/2024] [Indexed: 05/14/2024] Open
Abstract
BACKGROUND Detecting epistatic interactions (EIs) involves the exploration of associations among single nucleotide polymorphisms (SNPs) and complex diseases, which is an important task in genome-wide association studies. The EI detection problem is dependent on epistasis models and corresponding optimization methods. Although various models and methods have been proposed to detect EIs, identifying EIs efficiently and accurately is still a challenge. RESULTS Here, we propose a linear mixed statistical epistasis model (LMSE) and a spherical evolution approach with a feedback mechanism (named SEEI). The LMSE model expands the existing single epistasis models such as LR-Score, K2-Score, Mutual information, and Gini index. The SEEI includes an adaptive spherical search strategy and a population updating strategy, which ensure that the algorithm is not easily trapped in local optima. We analyzed the performances of 8 random disease models, 12 disease models with marginal effects, 30 disease models without marginal effects, and 10 high-order disease models. The 60 simulated disease models and a real breast cancer dataset were used to evaluate eight algorithms (SEEI, EACO, EpiACO, FDHEIW, MP-HS-DHSI, NHSA-DHSC, SNPHarvester, CSE). Three evaluation criteria (pow1, pow2, pow3), a T-test, and a Friedman test were used to compare the performances of these algorithms. The results show that the SEEI algorithm (order 1, average rank = 13.125) outperformed the other algorithms in detecting EIs. CONCLUSIONS Here, we propose an LMSE model and an evolutionary computing method (SEEI) to solve the optimization problem of the LMSE model. The proposed method performed better than the other seven algorithms tested in its ability to identify EIs in genome-wide association datasets. We identified new SNP-SNP combinations in the real breast cancer dataset and verified the results. Our findings provide new insights for the diagnosis and treatment of breast cancer.
AVAILABILITY AND IMPLEMENTATION https://github.com/scutdy/SSO/blob/master/SEEI.zip .
Affiliation(s)
- De-Yu Tang
- Department of Computer Science, School of Mathematics and Informatics, School of Software Engineering, South China Agricultural University, Guangzhou, 510642, PR China.
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
| | - Yi-Jun Mao
- Department of Computer Science, School of Mathematics and Informatics, School of Software Engineering, South China Agricultural University, Guangzhou, 510642, PR China.
| | - Jie Zhao
- School of Management, Guangdong University of Technology, Guangzhou, 510006, PR China
| | - Jin Yang
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
| | - Shi-Yin Li
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China
| | - Fu-Xiang Ren
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China
| | - Junxi Zheng
- School of Medical Information and Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, PR China.
|
2
|
Moon J, Posada-Quintero HF, Chon KH. Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction. Neural Netw 2023; 165:562-595. [PMID: 37364469 DOI: 10.1016/j.neunet.2023.05.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Revised: 04/11/2023] [Accepted: 05/09/2023] [Indexed: 06/28/2023]
Abstract
Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are critical, especially in the biological and medical fields; however, effective visualization methods for large genetic data are limited. Current visualization methods are limited to lower-dimensional data, and their performance suffers if there is missing data. In this study, we propose a literature-based visualization method to reduce high-dimensional data without compromising the dynamics of the single nucleotide polymorphisms (SNPs) and textual interpretability. Our method is innovative because it (1) preserves both global and local structures of SNPs while reducing the dimension of the data using literature text representations, and (2) enables interpretable visualizations using textual information. For performance evaluation, we applied the proposed approach to various classification categories, including race, myocardial infarction event age groups, and sex, using several machine learning models on the literature-derived SNP data. We used visualization approaches to examine clustering of data as well as quantitative performance metrics for the classification of the risk factors examined above. Our method outperformed all popular dimensionality reduction and visualization methods for both classification and visualization, and it is robust against missing and higher-dimensional data. Moreover, we found it feasible to incorporate both genetic and other risk information obtained from literature with our method.
Affiliation(s)
- Jihye Moon
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06269, USA.
| | | | - Ki H Chon
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06269, USA.
|
3
|
Yang CH, Hou MF, Chuang LY, Yang CS, Lin YD. Dimensionality reduction approach for many-objective epistasis analysis. Brief Bioinform 2023; 24:6858949. [PMID: 36458451 DOI: 10.1093/bib/bbac512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 10/07/2022] [Accepted: 10/26/2022] [Indexed: 12/04/2022] Open
Abstract
In epistasis analysis, single-nucleotide polymorphism-single-nucleotide polymorphism interactions (SSIs) among genes may, alongside other environmental factors, influence the risk of multifactorial diseases. To identify SSIs between cases and controls (i.e. binary traits), the score for model quality is affected by different objective functions (i.e. measurements) because of potential disease model preferences and disease complexities. Our previous study proposed a multiobjective approach-based multifactor dimensionality reduction (MOMDR), with the results indicating that two objective functions could enhance SSI identification with weak marginal effects. However, SSI identification using MOMDR remains a challenge because the optimal measure combination of objective functions has yet to be investigated. This study extended MOMDR to the many-objective version (i.e. many-objective MDR, MaODR) by integrating various disease probability measures based on a two-way contingency table to improve the identification of SSIs between cases and controls. We introduced an objective function selection approach to determine the optimal measure combination in MaODR among 10 well-known measures. In total, 6 disease models with and 40 disease models without marginal effects were used to evaluate the general algorithms, namely those based on multifactor dimensionality reduction, MOMDR and MaODR. Our results revealed that, through the application of an objective function selection approach, the MaODR-based three-objective-function model using correct classification rate, likelihood ratio and normalized mutual information (MaODR-CLN) achieved detection success rates (accuracy) 6.47% higher than MOMDR and 17.23% higher than MDR. In a Wellcome Trust Case Control Consortium dataset, MaODR-CLN successfully identified the significant SSIs (P < 0.001) associated with coronary artery disease.
We performed a systematic analysis to identify the optimal measure combination in MaODR among 10 objective functions. Our combination detected SSIs-based binary traits with weak marginal effects and thus reduced spurious variables in the score model. MaODR is freely available at https://sites.google.com/view/maodr/home.
Affiliation(s)
- Cheng-Hong Yang
- Department of Information Management at the Tainan University of Technology, and at the Department of Electronic Engineering at National Kaohsiung of Science and Technology, Taiwan.,Biomedical Engineering, Kaohsiung Medical University, Taiwan
| | - Ming-Feng Hou
- Kaohsiung Medical University Hospital, and Professor at the Department of Surgery, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
| | - Li-Yeh Chuang
- Department of Chemical Engineering & Institute of Biotechnology and Chemical Engineering at I-Shou University, Taiwan
| | - Cheng-San Yang
- Department of Plastic Surgery, and serves as the Medical Matters Secretary of Chia-Yi Christian Hospital, Taiwan
| | - Yu-Da Lin
- Department of Computer Science and Information Engineering, and at the National Penghu University of Science and Technology, Taiwan
|
4
|
Yang CH, Huang HC, Hou MF, Chuang LY, Lin YD. Fuzzy-Based Multiobjective Multifactor Dimensionality Reduction for Epistasis Analysis. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:378-387. [PMID: 35061588 DOI: 10.1109/tcbb.2022.3144303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Epistasis detection is vital for understanding disease susceptibility in genetics. Multiobjective multifactor dimensionality reduction (MOMDR) was previously proposed to detect epistasis. MOMDR was performed using binary classification to distinguish the high-risk (H) and low-risk (L) groups to reduce multifactor dimensionality. However, the binary classification does not reflect the uncertainty of the H and L classification. In this study, we proposed an empirical fuzzy MOMDR (EFMOMDR) to address the limitations of binary classification using the degree of membership through an empirical fuzzy approach. The EFMOMDR can simultaneously consider two incorporated fuzzy-based measures, including correct classification rate and likelihood rate, and does not require parameter tuning. Simulation studies revealed that EFMOMDR achieved a 7.14% higher detection success rate than MOMDR, indicating that the empirical fuzzy approach successfully addresses the limitations of binary classification in MOMDR. Moreover, EFMOMDR was used to analyze coronary artery disease in the Wellcome Trust Case Control Consortium dataset.
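The binary high-risk/low-risk step that this abstract describes can be illustrated with a minimal Python sketch of classic MDR labeling. This is only an illustration under stated assumptions: the function names, the toy data, and the tie rule (a cell whose case:control ratio equals the overall ratio is labeled H) are hypothetical, and the fuzzy membership function used by EFMOMDR is not given in the abstract, so it is not reproduced here.

```python
from collections import Counter

def mdr_risk_labels(genotypes, phenotypes):
    """Label each multi-locus genotype cell 'H' (high-risk) or 'L' (low-risk).

    genotypes: list of tuples of genotype codes, one tuple per sample
    phenotypes: list of 1 (case) / 0 (control), same length
    """
    cases = Counter(g for g, y in zip(genotypes, phenotypes) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, phenotypes) if y == 0)
    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    labels = {}
    for cell in set(cases) | set(controls):
        # Cell ratio >= overall case:control ratio, cross-multiplied
        # so that cells with zero controls need no division.
        high = cases[cell] * n_controls >= controls[cell] * n_cases
        labels[cell] = "H" if high else "L"
    return labels

def correct_classification_rate(genotypes, phenotypes, labels):
    """Fraction of samples whose H/L cell label matches case/control status."""
    correct = sum(
        1 for g, y in zip(genotypes, phenotypes)
        if (labels[g] == "H") == (y == 1)
    )
    return correct / len(phenotypes)

# Hypothetical toy data: two SNPs, eight samples.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1), (1, 1), (0, 0)]
phenotypes = [0, 0, 1, 1, 1, 1, 0, 0]

labels = mdr_risk_labels(genotypes, phenotypes)
ccr = correct_classification_rate(genotypes, phenotypes, labels)
print(labels, ccr)
```

The hard H/L cut is what discards the uncertainty the abstract refers to: a cell with a ratio just above the threshold is treated the same as one far above it.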
|
5
|
Ma Y, Fa B, Yuan X, Zhang Y, Yu Z. STS-BN: An efficient Bayesian network method for detecting causal SNPs. Front Genet 2022; 13:942464. [PMID: 36186431 PMCID: PMC9520706 DOI: 10.3389/fgene.2022.942464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Accepted: 08/16/2022] [Indexed: 11/16/2022] Open
Abstract
Background: The identification of the causal SNPs of complex diseases in large-scale genome-wide association analysis is beneficial to the studies of pathogenesis, prevention, diagnosis and treatment of these diseases. However, existing applicable methods for large-scale data suffer from low accuracy. Developing powerful and accurate methods for detecting SNPs associated with complex diseases is highly desired. Results: We propose a score-based two-stage Bayesian network method to identify causal SNPs of complex diseases for case-control designs. This method combines the ideas of constraint-based methods and score-and-search methods to learn the structure of the disease-centered local Bayesian network. Simulation experiments are conducted to compare this new algorithm with several common methods that can achieve the same function. The results show that our method improves the accuracy and stability compared to several common methods. Our method based on Bayesian network theory results in lower false-positive rates when all correct loci are detected. Besides, real-world data application suggests that our algorithm has good performance when handling genome-wide association data. Conclusion: The proposed method is designed to identify the SNPs related to complex diseases, and is more accurate than other methods which can also be adapted to large-scale genome-wide analysis studies data.
Affiliation(s)
- Yanran Ma
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Botao Fa
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Xi’an Jiaotong University, Xi’an, China
| | - Xin Yuan
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Yue Zhang
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- *Correspondence: Yue Zhang; Zhangsheng Yu
| | - Zhangsheng Yu
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- *Correspondence: Yue Zhang; Zhangsheng Yu
|
6
|
Yang X, Yang C, Lei J, Liu J. An Approach of Epistasis Detection Using Integer Linear Programming Optimizing Bayesian Network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2654-2671. [PMID: 34181547 DOI: 10.1109/tcbb.2021.3092719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Proposing a more effective and accurate method for detecting epistatic loci in large-scale genomic data has important research significance for improving crop quality, disease treatment, etc. Owing to its high accuracy and ability to handle non-linear relationships, the Bayesian network (BN) has been widely used to construct networks of SNPs and phenotype traits and thus to mine epistatic loci. However, the shortcoming of BN is that it easily falls into local optima and cannot process large numbers of SNPs. In this work, we transform the problem of learning a Bayesian network into an integer linear programming (ILP) optimization problem. We use branch-and-bound and cutting-plane algorithms to obtain the globally optimal Bayesian network (ILPBN), and thus the epistatic loci influencing specific phenotype traits. To handle large numbers of SNP loci and further improve efficiency, we use the method of optimizing the Markov blanket to reduce the number of candidate parent nodes for each node. In addition, we use α-BIC, which is suitable for epistasis mining, to calculate the BN score. We use four properties of BN decomposable scoring functions to further reduce the number of candidate parent sets for each node. Experimental results show that ILPBN can not only process 2-locus and 3-locus epistasis mining, but also realize multi-locus epistasis detection. Finally, we compare ILPBN with several popular epistasis mining algorithms using simulated data and a real age-related macular degeneration (AMD) dataset. Experimental results show that ILPBN achieves better epistasis detection accuracy, F1-score and false positive rate than other methods while maintaining efficiency. Availability: Codes and dataset are available at: http://122.205.95.139/ILPBN/.
|
7
|
Wang X, Cao X, Feng Y, Guo M, Yu G, Wang J. ELSSI: parallel SNP-SNP interactions detection by ensemble multi-type detectors. Brief Bioinform 2022; 23:6607749. [PMID: 35696639 DOI: 10.1093/bib/bbac213] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/25/2022] [Revised: 04/18/2022] [Accepted: 05/07/2022] [Indexed: 12/11/2022] Open
Abstract
With the development of high-throughput genotyping technology, single nucleotide polymorphism (SNP)-SNP interactions (SSIs) detection has become an essential way for understanding disease susceptibility. Various methods have been proposed to detect SSIs. However, given the disease complexity and bias of individual SSI detectors, these single-detector-based methods are generally unscalable for real genome-wide data and with unfavorable results. We propose a novel ensemble learning-based approach (ELSSI) that can significantly reduce the bias of individual detectors and their computational load. ELSSI randomly divides SNPs into different subsets and evaluates them by multi-type detectors in parallel. Particularly, ELSSI introduces a four-stage pipeline (generate, score, switch and filter) to iteratively generate new SNP combination subsets from SNP subsets, score the combination subset by individual detectors, switch high-score combinations to other detectors for re-scoring, then filter out combinations with low scores. This pipeline makes ELSSI able to detect high-order SSIs from large genome-wide datasets. Experimental results on various simulated and real genome-wide datasets show the superior efficacy of ELSSI to state-of-the-art methods in detecting SSIs, especially for high-order ones. ELSSI is applicable with moderate PCs on the Internet and flexible to assemble new detectors. The code of ELSSI is available at https://www.sdu-idea.cn/codes.php?name=ELSSI.
Affiliation(s)
- Xin Wang
- School of Software, Shandong University, Jinan 250101, China.,Joint SDU-NTU Centre for Artificial Intelligence Research(C-FAIR), Shandong University, Jinan 250101, China
| | - Xia Cao
- College of Computer and Information Sciences, Southwest University, Chongqing 400715, China
| | - Yuantao Feng
- College of Computer and Information Sciences, Southwest University, Chongqing 400715, China
| | - Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
| | - Guoxian Yu
- School of Software, Shandong University, Jinan 250101, China
| | - Jun Wang
- Joint SDU-NTU Centre for Artificial Intelligence Research(C-FAIR), Shandong University, Jinan 250101, China
|
8
|
Wang J, Zhang H, Ren W, Guo M, Yu G. EpiMC: Detecting Epistatic Interactions Using Multiple Clusterings. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:243-254. [PMID: 33989157 DOI: 10.1109/tcbb.2021.3080462] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Detecting single nucleotide polymorphism (SNP) interactions is crucial to identify susceptibility genes associated with complex human diseases in genome-wide association studies. Clustering-based approaches are widely used in reducing search space and exploring potential relationships between SNPs in epistasis analysis. However, these approaches use only a single measure to filter out nonsignificant SNP combinations, which may be significant ones from another perspective. In this paper, we propose a two-stage approach named EpiMC (Epistatic Interactions detection based on Multiple Clusterings) that employs multiple clusterings to obtain more precise candidate sets and more comprehensively detect high-order interactions based on these sets. In the first stage, EpiMC proposes a matrix factorization based multiple clusterings algorithm to generate multiple diverse clusterings, each of which divides all SNPs into different clusters. This stage aims to reduce the chance of filtering out potential candidates overlooked by a single clustering and groups associated SNPs together from different clustering perspectives. In the next stage, EpiMC considers both the single-locus effects and interaction effects to select high-quality disease associated SNPs, and then uses Jaccard similarity to get candidate sets. Finally, EpiMC uses exhaustive search on the obtained small candidate sets to precisely detect epistatic interactions. Extensive simulation experiments show that EpiMC has a better performance in detecting high-order interactions than state-of-the-art solutions. On the Wellcome Trust Case Control Consortium (WTCCC) dataset, EpiMC detects several significant epistatic interactions associated with breast cancer (BC) and age-related macular degeneration (AMD), which again corroborates the effectiveness of EpiMC.
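For readers unfamiliar with the Jaccard similarity mentioned in this abstract, a minimal Python sketch follows. It only shows the standard set-overlap measure; how EpiMC applies it to build candidate sets is not specified in the abstract, and the SNP identifiers and the empty-set convention below are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of SNP IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention when both sets are empty
    return len(a & b) / len(union)

# Hypothetical SNP clusters taken from two different clusterings.
cluster_1 = {"snp1", "snp2", "snp3"}
cluster_2 = {"snp2", "snp3", "snp4"}

similarity = jaccard(cluster_1, cluster_2)
print(similarity)  # 2 shared SNPs out of 4 distinct SNPs
```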
|
9
|
Lin YD, Lee YC, Chiang CP, Moi SH, Kan JY. MOAI: a multi-outcome interaction identification approach reveals an interaction between vaspin and carcinoembryonic antigen on colorectal cancer prognosis. Brief Bioinform 2021; 23:6398687. [PMID: 34661627 DOI: 10.1093/bib/bbab427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Revised: 09/14/2021] [Accepted: 09/18/2021] [Indexed: 11/12/2022] Open
Abstract
Identifying and characterizing the interaction between risk factors for multiple outcomes (multi-outcome interaction) has been one of the greatest challenges faced by complex multifactorial diseases. However, the existing approaches have several limitations in identifying the multi-outcome interaction. To address this issue, we proposed a multi-outcome interaction identification approach called MOAI. MOAI was motivated by the limitations of estimating the interaction simultaneously occurring in multi-outcomes and by the success of Pareto set filter operator for identifying multi-outcome interaction. MOAI permits the identification for the interaction of multiple outcomes and is applicable in population-based study designs. Our experimental results exhibited that the existing approaches are not effectively used to identify the multi-outcome interaction, whereas MOAI obviously exhibited superior performance in identifying multi-outcome interaction. We applied MOAI to identify the interaction between risk factors for colorectal cancer (CRC) in both metastases and mortality prognostic outcomes. An interaction between vaspin and carcinoembryonic antigen (CEA) was found, and the interaction indicated that patients with CRC characterized by higher vaspin (≥30%) and CEA (≥5) levels could simultaneously increase both metastases and mortality risk. The immunostaining evidence revealed that determined multi-outcome interaction could effectively identify the difference between non-metastases/survived and metastases/deceased patients, which offers multi-prognostic outcome risk estimation for CRC. To our knowledge, this is the first report of a multi-outcome interaction associated with a complex multifactorial disease. MOAI is freely available at https://sites.google.com/view/moaitool/home.
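The idea behind a Pareto set filter, which this abstract names as part of MOAI, can be sketched generically in Python. This is not MOAI's actual operator, whose details the abstract does not give; the candidate names, the two-score layout, and the "higher is better" convention are all assumptions for illustration.

```python
def dominates(u, v):
    """True if score vector u is at least as good as v on every outcome
    and strictly better on at least one (higher is assumed better)."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))

def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    candidates: dict mapping a candidate name to a tuple of scores,
    one score per outcome.
    """
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in candidates.items()
            if other != name
        )
    ]

# Hypothetical interaction candidates scored on two outcomes.
candidates = {
    "A": (0.9, 0.2),
    "B": (0.5, 0.5),
    "C": (0.4, 0.4),  # dominated by B on both outcomes
    "D": (0.2, 0.9),
}

front = pareto_front(candidates)
print(front)
```

Candidates on the front represent different trade-offs between outcomes, so none can be discarded without losing the best score on at least one outcome.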
Affiliation(s)
- Yu-Da Lin
- Department of Computer Science and Information Engineering, National Penghu University of Science and Technology, Magong, Penghu, 880011, Taiwan
| | - Yi-Chen Lee
- Department of Anatomy at Kaohsiung Medical University, Taiwan
| | - Chih-Po Chiang
- Division of Breast Oncology and Surgery, Department of Surgery, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80756, Taiwan
| | - Sin-Hua Moi
- Center of Cancer Program Development, E-Da Cancer Hospital, I-Shou University, Kaohsiung 824, Taiwan
| | - Jung-Yu Kan
- Division of Breast Oncology and Surgery, Department of Surgery, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80756, Taiwan
|
10
|
Wang X, Zhang H, Wang J, Yu G, Cui L, Guo M. EpiHNet: Detecting epistasis by heterogeneous molecule network. Methods 2021; 198:65-75. [PMID: 34555529 DOI: 10.1016/j.ymeth.2021.09.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/31/2021] [Revised: 08/16/2021] [Accepted: 09/16/2021] [Indexed: 12/22/2022] Open
Abstract
Epistasis between single nucleotide polymorphisms (SNPs) plays an important role in elucidating the missing heritability of complex diseases. Diverse approaches have been invented for detecting SNP interactions, but they canonically neglect the important and useful connections between SNPs and other bio-molecules (i.e., miRNAs and lncRNAs). To comprehensively model these disease related molecules, a heterogeneous bio-molecular network based solution EpiHNet is introduced for high-order SNP interactions detection. EpiHNet firstly uses case/control data to construct an SNP statistical network, and meta-path based similarity on the heterogeneous network composed with SNPs, genes, lncRNAs, miRNAs and diseases to define another SNP relational network. The SNP relational network can explore and exploit different associations between molecules and diseases to complement the SNP statistical network and search the significantly associated SNPs. Next, EpiHNet integrates these two networks into a composite network, applies the modularity based clustering with fast search strategy to divide SNP nodes into different clusters. After that, it detects SNP interactions based on SNP combinations derived from each cluster. Synthetic experiments on diverse two-locus and three-locus disease models manifest that EpiHNet outperforms competitive baselines, even without the heterogeneous network. For real WTCCC breast cancer data, EpiHNet also demonstrates expressive results on detecting high-order SNP interactions.
Affiliation(s)
- Xin Wang
- School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
| | - Huiling Zhang
- College of Computer and Information Sciences, Southwest University, Chongqing, China.
| | - Jun Wang
- School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
| | - Guoxian Yu
- School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
| | - Lizhen Cui
- School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China.
| | - Maozu Guo
- College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China.
|
11
|
Wang L, Wang Y, Fu Y, Gao Y, Du J, Yang C, Liu J. AFSBN: A Method of Artificial Fish Swarm Optimizing Bayesian Network for Epistasis Detection. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:1369-1383. [PMID: 31670676 DOI: 10.1109/tcbb.2019.2949780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
How to mine the interactions between SNPs (namely epistasis) efficiently and accurately must be considered when tackling the complexity of underlying biological mechanisms. To overcome the defects of low learning efficiency and local optima, this work proposes an epistasis mining method using an artificial fish swarm optimizing Bayesian network (AFSBN). This method exploits the global optimization, good robustness and fast convergence of the artificial fish swarm algorithm, and incorporates the algorithm into the heuristic search strategy of the Bayesian network. The initial network structure is evolved through foraging behavior, clustering behavior, tail-chasing behavior and random behavior. The algorithm chooses different behaviors to modify the network state according to changes in the surrounding environment and the states of partners. It realizes the interaction between each artificial fish and its neighboring environment, and finally finds the optimal network in the population. We compared AFSBN with other existing algorithms on both simulated and real datasets. The experimental results demonstrate that our method outperforms others in epistasis detection accuracy without substantially affecting efficiency across different datasets.
|
12
|
Qiu S, Sun J. lncRNA-MALAT1 expression in patients with coronary atherosclerosis and its predictive value for in-stent restenosis. Exp Ther Med 2020; 20:129. [PMID: 33082861 DOI: 10.3892/etm.2020.9258] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/02/2019] [Accepted: 08/07/2020] [Indexed: 01/07/2023] Open
Abstract
This study was designed to investigate the long non-coding RNA (lncRNA)-metastasis associated lung adenocarcinoma transcript 1 (MALAT1) expression in patients with coronary atherosclerosis and its predictive value for in-stent restenosis. Ninety-five patients with coronary heart disease who came to our hospital for treatment and underwent stent implantation were selected as a research group (RG), and 95 volunteers undergoing physical examination who did not suffer from coronary heart disease during the same period were selected as a control group (CG). MALAT1 expression in subjects of both groups before and after treatment was detected by RT-qPCR, and N-terminal pro-brain natriuretic peptide (NT-proBNP), high sensitivity C-reactive protein (hs-CRP), lactate dehydrogenase (LDH), and creatine kinase isoenzyme (CK-MB) in the RG before treatment were detected. Their levels were evaluated, and their correlation with MALAT1 was analyzed. Then, the predictive value of MALAT1 for in-stent restenosis in patients with coronary heart disease was analyzed. MALAT1 expression in patients with coronary heart disease was higher than that of normal subjects (P<0.05); after treatment, the expression levels of MALAT1, NT-proBNP, hs-CRP, LDH, and CK-MB in the serum of patients were significantly lower than those before treatment (P<0.05); MALAT1 expression was positively correlated with the expression levels of NT-proBNP, hs-CRP, LDH, and CK-MB (P<0.05). The area under the receiver operating characteristic curve of MALAT1 for predicting in-stent restenosis in patients with coronary heart disease was over 0.8; the number of lesions, MALAT1, diabetes, NT-proBNP and hs-CRP were independent risk factors for in-stent restenosis. MALAT1 is highly expressed in the serum of patients with coronary heart disease, and it has high value in its diagnosis and the prediction of in-stent restenosis. It is also an independent risk factor for in-stent restenosis in patients with coronary heart disease.
Affiliation(s)
- Shi Qiu
- Department of Cardiovascular Surgery, The Second Hospital of Shandong University, Jinan, Shandong 250000, P.R. China
| | - Jinhui Sun
- Department of Cardiovascular Surgery, The Second Hospital of Shandong University, Jinan, Shandong 250000, P.R. China
|
13
|
Yang CH, Moi SH, Chuang LY, Chen JB. Higher-order clinical risk factor interaction analysis for overall mortality in maintenance hemodialysis patients. Ther Adv Chronic Dis 2020; 11:2040622320949060. [PMID: 33062235 PMCID: PMC7534064 DOI: 10.1177/2040622320949060] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/14/2019] [Accepted: 07/20/2020] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND AND AIMS In Taiwan, approximately 90% of patients with end-stage renal disease receive maintenance hemodialysis. Although studies have reported the survival predictability of multiclinical factors, the higher-order interactions among these factors have rarely been discussed. Conventional statistical approaches such as regression analysis are inadequate for detecting higher-order interactions. Therefore, this study integrated receiver operating characteristic analysis, logistic regression, and multifactor-dimensionality reduction with balancing functions that adjust the ratio in risk classes and the classification errors for imbalanced cases and controls (MDR-ER) to examine the impact of interaction effects between multiclinical factors on overall mortality in patients on maintenance hemodialysis. MATERIALS AND METHODS In total, 781 patients who received outpatient hemodialysis three times per week before 1 January 2009 were included; their baseline clinical factor and mortality outcome data were retrospectively collected using an approved data protocol (201800595B0). RESULTS Consistent with conventional statistical approaches, the higher-order interaction model could indicate the impact on survival outcome of potential risk combinations unique to patients on maintenance hemodialysis, as described previously. Moreover, the MDR-based higher-order interaction model facilitated the detection of higher-order interaction effects among multiclinical factors and could determine more detailed combinations of mortality risk characteristics. CONCLUSION Higher-order clinical risk interaction analysis is therefore a reasonable strategy for detecting non-traditional risk factor interaction effects on survival outcome unique to patients on maintenance hemodialysis, and thus for clinically achieving whole-scale patient care.
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung
- Ph.D. Program in Biomedical Engineering, Kaohsiung Medical University, Kaohsiung
- Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung
- Sin-Hua Moi
- Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung
- Li-Yeh Chuang
- Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 84004
- Jin-Bor Chen
- Division of Nephrology, Department of Internal Medicine, Kaohsiung Chang Gung Memorial Hospital and Chang Gung University College of Medicine, 123 DaPei Rd, Niao Song Dist, Kaohsiung 83301
14
Toxo: a library for calculating penetrance tables of high-order epistasis models. BMC Bioinformatics 2020; 21:138. [PMID: 32272874 PMCID: PMC7147067 DOI: 10.1186/s12859-020-3456-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/11/2019] [Accepted: 03/18/2020] [Indexed: 12/12/2022] Open
Abstract
Background Epistasis is defined as the interaction between different genes when expressing a specific phenotype. The most common way to characterize an epistatic relationship is using a penetrance table, which contains the probability of expressing the phenotype under study given a particular allele combination. Available simulators can only create penetrance tables for well-known epistasis models involving a small number of genes and under a large number of limitations. Results Toxo is a MATLAB library designed to calculate penetrance tables of epistasis models of any interaction order which resemble real data more closely. The user specifies the desired heritability (or prevalence) and the program maximizes the table’s prevalence (or heritability) according to the input epistatic model boundaries. Conclusions Toxo extends the capabilities of existing simulators that define epistasis using penetrance tables. These tables can be directly used as input for software simulators such as GAMETES so that they are able to generate data samples with larger interactions and more realistic prevalences/heritabilities.
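As a rough illustration of the quantities named in this abstract, the sketch below computes the prevalence and heritability implied by a penetrance table. It is not Toxo's code (Toxo is a MATLAB library). It assumes the commonly used definitions (prevalence as the frequency-weighted mean penetrance, heritability as the penetrance variance divided by K(1 - K)), Hardy-Weinberg genotype frequencies, and independent loci. The function names and the example table are hypothetical.

```python
from itertools import product


def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for 0, 1, and 2 copies of the minor allele."""
    return [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]


def prevalence_and_heritability(penetrance, mafs):
    """Return (prevalence, heritability) for a penetrance table.

    penetrance: dict mapping a genotype tuple such as (0, 2) to
                P(phenotype | genotype combination).
    mafs:       minor allele frequency of each locus (assumed independent).
    """
    per_locus = [genotype_freqs(m) for m in mafs]
    cells = []  # (frequency, penetrance) for every genotype combination
    for combo in product(range(3), repeat=len(mafs)):
        freq = 1.0
        for locus, genotype in enumerate(combo):
            freq *= per_locus[locus][genotype]
        cells.append((freq, penetrance[combo]))

    # Prevalence K: frequency-weighted mean penetrance.
    prevalence = sum(f * p for f, p in cells)
    # Heritability: penetrance variance scaled by K * (1 - K).
    variance = sum(f * (p - prevalence) ** 2 for f, p in cells)
    heritability = variance / (prevalence * (1 - prevalence))
    return prevalence, heritability


# Hypothetical two-locus table: risk is raised only when both loci
# carry at least one minor allele.
table = {
    combo: (0.30 if min(combo) >= 1 else 0.05)
    for combo in product(range(3), repeat=2)
}
k, h2 = prevalence_and_heritability(table, mafs=[0.2, 0.4])
print(f"prevalence={k:.4f} heritability={h2:.4f}")
```

A tool of the kind described works in the opposite direction: it fixes one of the two quantities and searches for table values that maximize the other. This forward calculation is only the building block that such a search would evaluate.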
15
Cao X, Yu G, Ren W, Guo M, Wang J. DualWMDR: Detecting epistatic interaction with dual screening and multifactor dimensionality reduction. Hum Mutat 2019; 41:719-734. [DOI: 10.1002/humu.23951] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/19/2019] [Revised: 09/10/2019] [Accepted: 11/07/2019] [Indexed: 12/14/2022]
Affiliation(s)
- Xia Cao
- College of Computer and Information Science, Southwest University, Chongqing, China
- Guoxian Yu
- College of Computer and Information Science, Southwest University, Chongqing, China
- Wei Ren
- College of Computer and Information Science, Southwest University, Chongqing, China
- Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing, China
- Jun Wang
- College of Computer and Information Science, Southwest University, Chongqing, China
16
Hu W, Ding H, Ouyang A, Zhang X, Xu Q, Han Y, Zhang X, Jin Y. LncRNA MALAT1 gene polymorphisms in coronary artery disease: a case-control study in a Chinese population. Biosci Rep 2019; 39:BSR20182213. [PMID: 30833365 PMCID: PMC6422883 DOI: 10.1042/bsr20182213] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/05/2018] [Revised: 02/19/2019] [Accepted: 03/01/2019] [Indexed: 12/12/2022] Open
Abstract
Background: Coronary artery disease (CAD) is one of the leading causes of death worldwide. CAD is a complex disease with multiple risk factors and mechanisms. In recent years, genome-wide association studies (GWAS) have revealed single nucleotide polymorphisms (SNPs) that are closely related to CAD risk. The relationship between the long non-coding RNA (lncRNA) MALAT1 (metastasis-associated lung adenocarcinoma transcript 1) and CAD risk is largely unknown. To our knowledge, this is the first study to demonstrate SNP-SNP and SNP-environment interaction effects on CAD risk. Overall, our case-control study aimed to detect the association between MALAT1 SNPs (rs619586, rs4102217) and CAD risk. Methods: Blood samples from 365 CAD patients and 384 matched control participants were collected in Liaoning province, China. Two polymorphisms (rs619586, rs4102217) in lncRNA MALAT1 were genotyped on the KASP platform. Results: In a stratified analysis, the GC genotype and the recessive model of rs4102217 were associated with higher CAD risk among non-drinkers (P=0.010, odds ratio (OR): 1.96, 95% confidence interval (CI) = 1.17-3.28; P=0.026, OR: 1.73, 95% CI = 1.07-2.79) and in the group with a history of diabetes mellitus (DM) (P=0.010, OR: 4.07, 95% CI = 1.41-11.81; P=0.019, OR: 3.29, 95% CI = 1.22-8.88). In the SNP-SNP interaction analysis between MALAT1 and CAD risk, rs4102217 was associated with increased risk in smokers (GG: OR: 2.04, 95% CI = 1.42-2.92; CC+GC: OR: 2.64, 95% CI = 1.64-4.26) and decreased risk in drinkers (CC+GC: OR: 0.33, 95% CI = 0.20-0.55). Smokers with the MALAT1 rs619586 AA genotype (OR: 2.20, 95% CI = 1.57-3.07) and the GG+AG genotype (OR: 2.11, 95% CI = 1.17-3.81) had a higher risk of CAD. Moreover, drinkers with the AA genotype (OR: 0.22, 95% CI = 0.10-0.48) and the GG+AG genotype (OR: 0.38, 95% CI = 0.22-0.65) had a lower risk of CAD. According to the MDR software, the best interaction model was MALAT1 rs4102217 polymorphism-smoking-drinking, which was associated with a higher risk of CAD (testing balanced accuracy = 0.6979). Conclusion: Our study demonstrated that the GC genotype and the recessive model of rs4102217 potentially increase CAD risk in some specific groups.
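For readers unfamiliar with how an MDR model is scored, the sketch below shows the core idea under simplifying assumptions: each genotype/exposure cell is labeled high-risk when its case:control ratio exceeds the overall ratio, and the model is summarized by balanced accuracy, the mean of sensitivity and specificity. It is not the MDR software used in the study. It omits cross-validation, so it yields a training rather than a testing balanced accuracy, and the function names and counts are hypothetical.

```python
def high_risk_cells(cases, controls):
    """Label a cell high-risk when its case:control ratio exceeds the overall ratio.

    cases / controls: dicts mapping a cell (e.g. a tuple of factor levels)
    to the number of cases / controls observed in that cell.
    """
    total_cases = sum(cases.values())
    total_controls = sum(controls.values())
    high = set()
    for cell in set(cases) | set(controls):
        n_case = cases.get(cell, 0)
        n_ctrl = controls.get(cell, 0)
        # Cross-multiplied form of n_case / n_ctrl > total_cases / total_controls,
        # which avoids dividing by zero for cells with no controls.
        if n_case * total_controls > n_ctrl * total_cases:
            high.add(cell)
    return high


def balanced_accuracy(cases, controls, high):
    """Mean of sensitivity and specificity for a high-risk / low-risk labeling."""
    true_pos = sum(n for cell, n in cases.items() if cell in high)
    true_neg = sum(n for cell, n in controls.items() if cell not in high)
    sensitivity = true_pos / sum(cases.values())
    specificity = true_neg / sum(controls.values())
    return (sensitivity + specificity) / 2


# Hypothetical counts for a genotype x smoking table (not data from the study).
cases = {("GG", "smoker"): 30, ("GG", "non-smoker"): 10,
         ("CC+GC", "smoker"): 5, ("CC+GC", "non-smoker"): 5}
controls = {("GG", "smoker"): 10, ("GG", "non-smoker"): 20,
            ("CC+GC", "smoker"): 10, ("CC+GC", "non-smoker"): 10}

high = high_risk_cells(cases, controls)
print(sorted(high), balanced_accuracy(cases, controls, high))
```

Balanced accuracy is preferred over plain accuracy in this setting because it is not inflated when cases and controls are unequal in number.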
Affiliation(s)
- Weina Hu
- The Department of Cardiology, The Fourth Affiliated Hospital of China Medical University, Shenyang 110034, China
- Hanxi Ding
- The First Affiliated Hospital of China Medical University, and Key Laboratory of Cancer Etiology and Prevention (China Medical University), Liaoning Provincial Education Department, Shenyang 110001, China
- An Ouyang
- Department of Kinesiology and Health Promotion, University of Kentucky, Lexington, KY 40506, U.S.A
- Xiaohong Zhang
- The Department of Cardiology, The Fourth Affiliated Hospital of China Medical University, Shenyang 110034, China
- Qian Xu
- The First Affiliated Hospital of China Medical University, and Key Laboratory of Cancer Etiology and Prevention (China Medical University), Liaoning Provincial Education Department, Shenyang 110001, China
- Yunan Han
- Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110, U.S.A
- Xueying Zhang
- The Department of Cardiology, The Fourth Affiliated Hospital of China Medical University, Shenyang 110034, China
- Yuanzhe Jin
- The Department of Cardiology, The Fourth Affiliated Hospital of China Medical University, Shenyang 110034, China
17
Tsai SJ, Lin E, Kuo PH, Liu YL, Yang A. A gene–gene interaction between the vascular endothelial growth factor A and brain-derived neurotrophic factor genes is associated with psychological distress in the Taiwanese population. Taiwanese Journal of Psychiatry 2019. [DOI: 10.4103/tpsy.tpsy_30_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022] Open
18
Tsai SJ, Lin E, Kuo PH, Liu YL, Yang A. A gene-based analysis of variants in the brain-derived neurotrophic factor gene with psychological distress in a Taiwanese population. Taiwanese Journal of Psychiatry 2019. [DOI: 10.4103/tpsy.tpsy_6_19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/04/2022] Open