1
|
Chen R, Xie G, Lin Z, Gu G, Yu Y, Yu J, Liu Z. Predicting Microbe-Disease Associations Based on a Linear Neighborhood Label Propagation Method with Multi-order Similarity Fusion Learning. Interdiscip Sci 2024; 16:345-360. [PMID: 38436840 DOI: 10.1007/s12539-024-00607-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2023] [Revised: 01/04/2024] [Accepted: 01/05/2024] [Indexed: 03/05/2024]
Abstract
Computational approaches employed for predicting potential microbe-disease associations often rely on similarity information between microbes and diseases. Therefore, it is important to obtain reliable similarity information by integrating multiple types of similarity information. However, existing similarity fusion methods do not consider multi-order fusion of similarity networks. To address this problem, a novel method of linear neighborhood label propagation with multi-order similarity fusion learning (MOSFL-LNP) is proposed to predict potential microbe-disease associations. Multi-order fusion learning comprises two parts: low-order global learning and high-order feature learning. Low-order global learning is used to obtain common latent features from multiple similarity sources. High-order feature learning relies on the interactions between neighboring nodes to identify high-order similarities and learn deeper interactive network structures. Coefficients are assigned to different high-order feature learning modules to balance the similarities learned from different orders and enhance the robustness of the fusion network. Overall, by combining low-order global learning with high-order feature learning, multi-order fusion learning can capture both the shared and unique features of different similarity networks, leading to more accurate predictions of microbe-disease associations. In comparison to six other advanced methods, MOSFL-LNP exhibits superior prediction performance in the leave-one-out cross-validation and 5-fold validation frameworks. In the case study, the predicted 10 microbes associated with asthma and type 1 diabetes have an accuracy rate of up to 90% and 100%, respectively.
Collapse
Affiliation(s)
- Ruibin Chen
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Guobo Xie
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Zhiyi Lin
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Guosheng Gu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Yi Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Junrui Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Zhenguo Liu
- Department of Thoracic Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510080, China.
| |
Collapse
|
2
|
Peng W, Liu M, Dai W, Chen T, Fu Y, Pan Y. Multi-View Feature Aggregation for Predicting Microbe-Disease Association. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:2748-2758. [PMID: 34871177 DOI: 10.1109/tcbb.2021.3132611] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Microbes play a crucial role in human health and disease. Figuring out the relationship between microbes and diseases leads to significant potential applications in disease treatments. It is an urgent need to devise robust and effective computational methods for identifying disease-related microbes. This work proposes a Multi-View Feature Aggregation (MVFA) scheme that integrates the linear and nonlinear features to identify disease-related microbes. We introduce a non-negative matrix tri-factorization (NMTF) model to extract linear features for diseases and microbes. Then we learn another type of linear feature by utilizing a bi-random walk model. The nonlinear feature is obtained by inputting the two kinds of linear features into a capsule neural network. These three types of features describe the associations between diseases and microbes from different views. Finally, considering the complementary of these features, we leverage a logistic regression model to combine the NMTF model predictions, bi-random walk model predictions, and the capsule neural network predictions to obtain the final microbe-disease pair scores. We apply our method to predict human microbe-disease associations on two datasets. Experimental results show that our multi-view model outperforms the state-of-the-art models in recovering missing microbe-disease associations and predicting associations for new microbes. The ablation study shows that aggregating multi-view linear and nonlinear features can improve the prediction performance. Case studies on two diseases, i.e. Type 1 diabetes and Liver cirrhosis, further validate our method effectiveness.
Collapse
|
3
|
Wang F, Yang H, Wu Y, Peng L, Li X. SAELGMDA: Identifying human microbe-disease associations based on sparse autoencoder and LightGBM. Front Microbiol 2023; 14:1207209. [PMID: 37415823 PMCID: PMC10320730 DOI: 10.3389/fmicb.2023.1207209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2023] [Accepted: 05/18/2023] [Indexed: 07/08/2023] Open
Abstract
Introduction Identification of complex associations between diseases and microbes is important to understand the pathogenesis of diseases and design therapeutic strategies. Biomedical experiment-based Microbe-Disease Association (MDA) detection methods are expensive, time-consuming, and laborious. Methods Here, we developed a computational method called SAELGMDA for potential MDA prediction. First, microbe similarity and disease similarity are computed by integrating their functional similarity and Gaussian interaction profile kernel similarity. Second, one microbe-disease pair is presented as a feature vector by combining the microbe and disease similarity matrices. Next, the obtained feature vectors are mapped to a low-dimensional space based on a Sparse AutoEncoder. Finally, unknown microbe-disease pairs are classified based on Light Gradient boosting machine. Results The proposed SAELGMDA method was compared with four state-of-the-art MDA methods (MNNMDA, GATMDA, NTSHMDA, and LRLSHMDA) under five-fold cross validations on diseases, microbes, and microbe-disease pairs on the HMDAD and Disbiome databases. The results show that SAELGMDA computed the best accuracy, Matthews correlation coefficient, AUC, and AUPR under the majority of conditions, outperforming the other four MDA prediction models. In particular, SAELGMDA obtained the best AUCs of 0.8358 and 0.9301 under cross validation on diseases, 0.9838 and 0.9293 under cross validation on microbes, and 0.9857 and 0.9358 under cross validation on microbe-disease pairs on the HMDAD and Disbiome databases. Colorectal cancer, inflammatory bowel disease, and lung cancer are diseases that severely threat human health. We used the proposed SAELGMDA method to find possible microbes for the three diseases. The results demonstrate that there are potential associations between Clostridium coccoides and colorectal cancer and one between Sphingomonadaceae and inflammatory bowel disease. In addition, Veillonella may associate with autism. The inferred MDAs need further validation. Conclusion We anticipate that the proposed SAELGMDA method contributes to the identification of new MDAs.
Collapse
Affiliation(s)
- Feixiang Wang
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Huandong Yang
- Department of Gastrointestinal Surgery, Yidu Central Hospital of Weifang, Weifang, China
| | - Yan Wu
- Geneis (Beijing) Co., Ltd., Beijing, China
| | - Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Xiaoling Li
- The Second Department of Oncology, Beidahuang Industry Group General Hospital, Harbin, China
- The Second Department of Oncology, Heilongjiang Second Cancer Hospital, Harbin, China
| |
Collapse
|
4
|
Shokri Garjan H, Omidi Y, Poursheikhali Asghari M, Ferdousi R. In-silico computational approaches to study microbiota impacts on diseases and pharmacotherapy. Gut Pathog 2023; 15:10. [PMID: 36882861 PMCID: PMC9990230 DOI: 10.1186/s13099-023-00535-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Accepted: 02/21/2023] [Indexed: 03/09/2023] Open
Abstract
Microorganisms have been linked to a variety of critical human disease, thanks to advances in sequencing technology and microbiology. The growing recognition of human microbe-disease relationships provides crucial insights into the underlying disease process from the perspective of pathogens, which is extremely useful for pathogenesis research, early diagnosis, and precision medicine and therapy. Microbe-based analysis in terms of diseases and related drug discovery can predict new connections/mechanisms and provide new concepts. These phenomena have been studied via various in-silico computational approaches. This review aims to elaborate on the computational works conducted on the microbe-disease and microbe-drug topics, discuss the computational model approaches used for predicting associations and provide comprehensive information on the related databases. Finally, we discussed potential prospects and obstacles in this field of study, while also outlining some recommendations for further enhancing predictive capabilities.
Collapse
Affiliation(s)
- Hassan Shokri Garjan
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Yadollah Omidi
- Department of Pharmaceutical Sciences, Nova Southeastern University, College of Pharmacy, Fort Lauderdale, FL, USA
| | | | - Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran.
| |
Collapse
|
5
|
Jiang C, Tang M, Jin S, Huang W, Liu X. KGNMDA: A Knowledge Graph Neural Network Method for Predicting Microbe-Disease Associations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:1147-1155. [PMID: 35724280 DOI: 10.1109/tcbb.2022.3184362] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Accumulated studies discovered that various microbes in human bodies were closely related to complex human diseases and could provide new insight into drug development. Multiple computational methods were constructed to predict microbes that were potentially associated with diseases. However, most previous methods were based on single characteristics of microbes or diseases, that lacked important biological information related to microorganisms or diseases. Therefore, we constructed a knowledge graph centered on microorganisms and diseases from several existed databases to provide knowledgeable information for microbes and diseases. Then, we adopted a graph neural network method to learn representations of microbes and diseases from the constructed knowledge graph. After that, we introduced the Gaussian kernel similarity features of microbes and diseases to generate final representations of microbes and diseases. At last, we proposed a score function on final representations of microbes and diseases to predict scores of microbe-disease associations. Comprehensive experiments on the Human Microbe-Disease Association Database (HMDAD) dataset had demonstrated that our approach outperformed baseline methods. Furthermore, we implemented case studies on two important diseases (asthma and inflammatory bowel disease), the result demonstrated that our proposed model was effective in revealing the relationship between diseases and microbes. The source code of our model and the data were available on https://github.com/ChangzhiJiang/KGNMDA_master.
Collapse
|
6
|
Hu W, Yang X, Wang L, Zhu X. MADGAN:A microbe-disease association prediction model based on generative adversarial networks. Front Microbiol 2023; 14:1159076. [PMID: 37032881 PMCID: PMC10076708 DOI: 10.3389/fmicb.2023.1159076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2023] [Accepted: 03/02/2023] [Indexed: 04/11/2023] Open
Abstract
Researches have demonstrated that microorganisms are indispensable for the nutrition transportation, growth and development of human bodies, and disorder and imbalance of microbiota may lead to the occurrence of diseases. Therefore, it is crucial to study relationships between microbes and diseases. In this manuscript, we proposed a novel prediction model named MADGAN to infer potential microbe-disease associations by combining biological information of microbes and diseases with the generative adversarial networks. To our knowledge, it is the first attempt to use the generative adversarial network to complete this important task. In MADGAN, we firstly constructed different features for microbes and diseases based on multiple similarity metrics. And then, we further adopted graph convolution neural network (GCN) to derive different features for microbes and diseases automatically. Finally, we trained MADGAN to identify latent microbe-disease associations by games between the generation network and the decision network. Especially, in order to prevent over-smoothing during the model training process, we introduced the cross-level weight distribution structure to enhance the depth of the network based on the idea of residual network. Moreover, in order to validate the performance of MADGAN, we conducted comprehensive experiments and case studies based on databases of HMDAD and Disbiome respectively, and experimental results demonstrated that MADGAN not only achieved satisfactory prediction performances, but also outperformed existing state-of-the-art prediction models.
Collapse
Affiliation(s)
- Weixin Hu
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
| | - Xiaoyu Yang
- Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, China
| | - Lei Wang
- Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, China
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- *Correspondence: Lei Wang,
| | - Xianyou Zhu
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
- Xianyou Zhu,
| |
Collapse
|
7
|
Hua M, Yu S, Liu T, Yang X, Wang H. MVGCNMDA: Multi-view Graph Augmentation Convolutional Network for Uncovering Disease-Related Microbes. Interdiscip Sci 2022; 14:669-682. [PMID: 35428964 DOI: 10.1007/s12539-022-00514-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Revised: 03/06/2022] [Accepted: 03/13/2022] [Indexed: 06/14/2023]
Abstract
MOTIVATION Exploring the interrelationships between microbes and disease can help microbiologists make decisions and plan treatments. Predicting new microbe-disease associations currently relies on biological experiments and domain knowledge, which is time-consuming and inefficient. Automated algorithms are used to uncover the intrinsic link between microbes and disease. However, due to data noise and inadequate understanding of relevant biology, the efficient prediction of microbe-disease associations is still crucial. This study develops a multi-view graph augmentation convolutional network (MVGCNMDA) to predict potential disease-associated microbes. METHODS First, we use two data augmentation methods, edge perturbation and node dropping, to remove the data noise in the preprocessing stage. Second, we calculate Gaussian interaction profile kernel similarity and cosine similarity. Therefore, the Graph Convolutional Network(GCN) can fully use multi-view features. Then, the multi-view features are fed into the multi-attention block to learn the weights of different features adaptively. Finally, the embedding results are obtained using a Convolutional Neural Network (CNN) combiner, and the matrix completion is used to predict the relationship between potential microbes and diseases. RESULTS We test our model on the Human microbe-disease Association Database (HMDAD), Disbiome, and the Combined Dataset (Peryton and MicroPhenoDB). The area under PR curve (AUPR), area under ROC curve (AUC), F1 score, and RECALL value are calculated to evaluate the performance of the developed MVGCNMDA. The AUPR is 0.9440, AUC is 0.9428, F1 score is 0.9383, and RECALL value is 0.8858. The experiments show that our model can accurately predict potential microbe-disease associations compared with the state-of-the-art works on the global Leave-One-Out-Cross-Validation (LOOCV) and the fivefold Cross-Validation (fivefold CV). To further verify the effectiveness of the proposed graph data augmentation, we designed five different settings in the ablation study. Furthermore, we present two case studies that validate the prediction of the potential association between microbes and diseases by MVGCNMDA.
Collapse
Affiliation(s)
- Meifang Hua
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Shengpeng Yu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Tianyu Liu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Xue Yang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Hong Wang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China.
| |
Collapse
|
8
|
Chen Y, Lei X. Metapath Aggregated Graph Neural Network and Tripartite Heterogeneous Networks for Microbe-Disease Prediction. Front Microbiol 2022; 13:919380. [PMID: 35711758 PMCID: PMC9194683 DOI: 10.3389/fmicb.2022.919380] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2022] [Accepted: 04/29/2022] [Indexed: 11/25/2022] Open
Abstract
More and more studies have shown that understanding microbe-disease associations cannot only reveal the pathogenesis of diseases, but also promote the diagnosis and prognosis of diseases. Because traditional medical experiments are time-consuming and expensive, many computational methods have been proposed in recent years to identify potential microbe-disease associations. In this study, we propose a method based on heterogeneous network and metapath aggregated graph neural network (MAGNN) to predict microbe-disease associations, called MATHNMDA. First, we introduce microbe-drug interactions, drug-disease associations, and microbe-disease associations to construct a microbe-drug-disease heterogeneous network. Then we take the heterogeneous network as input to MAGNN. Second, for each layer of MAGNN, we carry out intra-metapath aggregation with a multi-head attention mechanism to learn the structural and semantic information embedded in the target node context, the metapath-based neighbor nodes, and the context between them, by encoding the metapath instances under the metapath definition mode. We then use inter-metapath aggregation with an attention mechanism to combine the semantic information of all different metapaths. Third, we can get the final embedding of microbe nodes and disease nodes based on the output of the last layer in the MAGNN. Finally, we predict potential microbe-disease associations by reconstructing the microbe-disease association matrix. In addition, we evaluated the performance of MATHNMDA by comparing it with that of its variants, some state-of-the-art methods, and different datasets. The results suggest that MATHNMDA is an effective prediction method. The case studies on asthma, inflammatory bowel disease (IBD), and coronavirus disease 2019 (COVID-19) further validate the effectiveness of MATHNMDA.
Collapse
Affiliation(s)
- Yali Chen
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| |
Collapse
|
9
|
Zhu B, Xu Y, Zhao P, Yiu SM, Yu H, Shi JY. NNAN: Nearest Neighbor Attention Network to Predict Drug–Microbe Associations. Front Microbiol 2022; 13:846915. [PMID: 35479616 PMCID: PMC9035839 DOI: 10.3389/fmicb.2022.846915] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2021] [Accepted: 02/14/2022] [Indexed: 11/13/2022] Open
Abstract
Many drugs can be metabolized by human microbes; the drug metabolites would significantly alter pharmacological effects and result in low therapeutic efficacy for patients. Hence, it is crucial to identify potential drug–microbe associations (DMAs) before the drug administrations. Nevertheless, traditional DMA determination cannot be applied in a wide range due to the tremendous number of microbe species, high costs, and the fact that it is time-consuming. Thus, predicting possible DMAs in computer technology is an essential topic. Inspired by other issues addressed by deep learning, we designed a deep learning-based model named Nearest Neighbor Attention Network (NNAN). The proposed model consists of four components, namely, a similarity network constructor, a nearest-neighbor aggregator, a feature attention block, and a predictor. In brief, the similarity block contains a microbe similarity network and a drug similarity network. The nearest-neighbor aggregator generates the embedding representations of drug–microbe pairs by integrating drug neighbors and microbe neighbors of each drug–microbe pair in the network. The feature attention block evaluates the importance of each dimension of drug–microbe pair embedding by a set of ordinary multi-layer neural networks. The predictor is an ordinary fully-connected deep neural network that functions as a binary classifier to distinguish potential DMAs among unlabeled drug–microbe pairs. Several experiments on two benchmark databases are performed to evaluate the performance of NNAN. First, the comparison with state-of-the-art baseline approaches demonstrates the superiority of NNAN under cross-validation in terms of predicting performance. Moreover, the interpretability inspection reveals that a drug tends to associate with a microbe if it finds its top-l most similar neighbors that associate with the microbe.
Collapse
Affiliation(s)
- Bei Zhu
- School of Life Sciences, Northwestern Polytechnical University, Xi’an, China
| | - Yi Xu
- School of Life Sciences, Northwestern Polytechnical University, Xi’an, China
| | - Pengcheng Zhao
- School of Life Sciences, Northwestern Polytechnical University, Xi’an, China
| | - Siu-Ming Yiu
- Department of Computer Science, The University of Hong Kong, Hong Kong, China
| | - Hui Yu
- School of Computer Science, Northwestern Polytechnical University, Xi’an, China
- *Correspondence: Hui Yu,
| | - Jian-Yu Shi
- School of Life Sciences, Northwestern Polytechnical University, Xi’an, China
- Jian-Yu Shi,
| |
Collapse
|
10
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has been accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in prediction of potential correlations between microbes, drugs and diseases from biological data to computational models. Firstly, we introduced the widely used datasets relevant to the identification of potential relationships between microbes, drugs and diseases in detail. And then, we divided a series of a lot of representative computing models into five major categories including network, matrix factorization, matrix completion, regularization and artificial neural network for in-depth discussion and comparison. Finally, we analysed possible challenges and opportunities in this research area, and at the same time we outlined some suggestions for further improvement of predictive performances as well.
Collapse
Affiliation(s)
- Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China.,Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Yaqin Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China.,Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Xiaoyu Yang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China.,Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Linai Kuang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Pengyao Ping
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
| |
Collapse
|
11
|
Xu D, Xu H, Zhang Y, Gao R. Novel Collaborative Weighted Non-negative Matrix Factorization Improves Prediction of Disease-Associated Human Microbes. Front Microbiol 2022; 13:834982. [PMID: 35369503 PMCID: PMC8965656 DOI: 10.3389/fmicb.2022.834982] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2021] [Accepted: 01/19/2022] [Indexed: 12/14/2022] Open
Abstract
Extensive clinical and biomedical studies have shown that microbiome plays a prominent role in human health. Identifying potential microbe–disease associations (MDAs) can help reveal the pathological mechanism of human diseases and be useful for the prevention, diagnosis, and treatment of human diseases. Therefore, it is necessary to develop effective computational models and reduce the cost and time of biological experiments. Here, we developed a novel machine learning-based joint framework called CWNMF-GLapRLS for human MDA prediction using the proposed collaborative weighted non-negative matrix factorization (CWNMF) technique and graph Laplacian regularized least squares. Especially, to fuse more similarity information, we calculated the functional similarity of microbes. To deal with missing values and effectively overcome the data sparsity problem, we proposed a collaborative weighted NMF technique to reconstruct the original association matrix. In addition, we developed a graph Laplacian regularized least-squares method for prediction. The experimental results of fivefold and leave-one-out cross-validation demonstrated that our method achieved the best performance by comparing it with 5 state-of-the-art methods on the benchmark dataset. Case studies further showed that the proposed method is an effective tool to predict potential MDAs and can provide more help for biomedical researchers.
Collapse
Affiliation(s)
- Da Xu
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Hanxiao Xu
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, China
- *Correspondence: Yusen Zhang,
| | - Rui Gao
- School of Control Science and Engineering, Shandong University, Jinan, China
- Rui Gao,
| |
Collapse
|
12
|
Ghulam A, Lei X, Zhang Y, Wu Z. Human Drug-Pathway Association Prediction Based on Network Consistency Projection. Comput Biol Chem 2022; 97:107624. [DOI: 10.1016/j.compbiolchem.2022.107624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2021] [Revised: 11/24/2021] [Accepted: 01/05/2022] [Indexed: 11/26/2022]
|
13
|
Ma Y, Liu L, Chen Q, Ma Y. An Inductive Logistic Matrix Factorization Model for Predicting Drug-Metabolite Association With Vicus Regularization. Front Microbiol 2021; 12:650366. [PMID: 33868209 PMCID: PMC8047063 DOI: 10.3389/fmicb.2021.650366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2021] [Accepted: 03/08/2021] [Indexed: 11/28/2022] Open
Abstract
Metabolites are closely related to human disease. The interaction between metabolites and drugs has drawn increasing attention in the field of pharmacomicrobiomics. However, only a small portion of the drug-metabolite interactions were experimentally observed due to the fact that experimental validation is labor-intensive, costly, and time-consuming. Although a few computational approaches have been proposed to predict latent associations for various bipartite networks, such as miRNA-disease, drug-target interaction networks, and so on, to our best knowledge the associations between drugs and metabolites have not been reported on a large scale. In this study, we propose a novel algorithm, namely inductive logistic matrix factorization (ILMF) to predict the latent associations between drugs and metabolites. Specifically, the proposed ILMF integrates drug-drug interaction, metabolite-metabolite interaction, and drug-metabolite interaction into this framework, to model the probability that a drug would interact with a metabolite. Moreover, we exploit inductive matrix completion to guide the learning of projection matrices U and V that depend on the low-dimensional feature representation matrices of drugs and metabolites: Fm and Fd . These two matrices can be obtained by fusing multiple data sources. Thus, Fd U and Fm V can be viewed as drug-specific and metabolite-specific latent representations, different from classical LMF. Furthermore, we utilize the Vicus spectral matrix that reveals the refined local geometrical structure inherent in the original data to encode the relationships between drugs and metabolites. Extensive experiments are conducted on a manually curated "DrugMetaboliteAtlas" dataset. The experimental results show that ILMF can achieve competitive performance compared with other state-of-the-art approaches, which demonstrates its effectiveness in predicting potential drug-metabolite associations.
Collapse
Affiliation(s)
- Yuanyuan Ma
- School of Computer and Information Engineering, Anyang Normal University, Anyang, China
| | - Lifang Liu
- School of Education, Anyang Normal University, Anyang, China
| | - Qianjun Chen
- School of Computer, Central China Normal University, Wuhan, China
| | - Yingjun Ma
- School of Applied Mathematics, Xiamen University of Technology, Xiamen, China
| |
Collapse
|
14
|
Xu D, Xu H, Zhang Y, Wang M, Chen W, Gao R. MDAKRLS: Predicting human microbe-disease association based on Kronecker regularized least squares and similarities. J Transl Med 2021; 19:66. [PMID: 33579301 PMCID: PMC7881563 DOI: 10.1186/s12967-021-02732-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2020] [Accepted: 02/01/2021] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND Microbes are closely related to human health and diseases. Identification of disease-related microbes is of great significance for revealing the pathological mechanism of human diseases and understanding the interaction mechanisms between microbes and humans, which is also useful for the prevention, diagnosis and treatment of human diseases. Considering the known disease-related microbes are still insufficient, it is necessary to develop effective computational methods and reduce the time and cost of biological experiments. METHODS In this work, we developed a novel computational method called MDAKRLS to discover potential microbe-disease associations (MDAs) based on the Kronecker regularized least squares. Specifically, we introduced the Hamming interaction profile similarity to measure the similarities of microbes and diseases besides Gaussian interaction profile kernel similarity. In addition, we introduced the Kronecker product to construct two kinds of Kronecker similarities between microbe-disease pairs. Then, we designed the Kronecker regularized least squares with different Kronecker similarities to obtain prediction scores, respectively, and calculated the final prediction scores by integrating the contributions of different similarities. RESULTS The AUCs value of global leave-one-out cross-validation and 5-fold cross-validation achieved by MDAKRLS were 0.9327 and 0.9023 ± 0.0015, which were significantly higher than five state-of-the-art methods used for comparison. Comparison results demonstrate that MDAKRLS has faster computing speed under two kinds of frameworks. In addition, case studies of inflammatory bowel disease (IBD) and asthma further showed 19 (IBD), 19 (asthma) of the top 20 prediction disease-related microbes could be verified by previously published biological or medical literature. CONCLUSIONS All the evaluation results adequately demonstrated that MDAKRLS has an effective and reliable prediction performance. It may be a useful tool to seek disease-related new microbes and help biomedical researchers to carry out follow-up studies.
Collapse
Affiliation(s)
- Da Xu
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
| | - Hanxiao Xu
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
| | - Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China.
| | - Mingyi Wang
- Department of Central Lab, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong, China.
| | - Wei Chen
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
| | - Rui Gao
- School of Control Science and Engineering, Shandong University, Jinan, 250061, China
| |
Collapse
|
15
|
Wen Z, Yan C, Duan G, Li S, Wu FX, Wang J. A survey on predicting microbe-disease associations: biological data and computational methods. Brief Bioinform 2020; 22:5881365. [PMID: 34020541 DOI: 10.1093/bib/bbaa157] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2020] [Revised: 06/18/2020] [Accepted: 06/22/2020] [Indexed: 02/06/2023] Open
Abstract
Various microbes have proved to be closely related to the pathogenesis of human diseases. While many computational methods for predicting human microbe-disease associations (MDAs) have been developed, few systematic reviews on these methods have been reported. In this study, we provide a comprehensive overview of the existing methods. Firstly, we introduce the data used in existing MDA prediction methods. Secondly, we classify those methods into different categories by their nature and describe their algorithms and strategies in detail. Next, experimental evaluations are conducted on representative methods using different similarity data and calculation methods to compare their prediction performances. Based on the principles of computational methods and experimental results, we discuss the advantages and disadvantages of those methods and propose suggestions for the improvement of prediction performances. Considering the problems of the MDA prediction at present stage, we discuss future work from three perspectives including data, methods and formulations at the end.
Collapse
Affiliation(s)
- Zhongqi Wen
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
| | - Cheng Yan
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Guihua Duan
- School of Computer Science and Engineering, Central South University
| | - Suning Li
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Sciences, University of Saskatchewan, Saskatoon, Canada
| | - Jianxin Wang
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
| |
Collapse
|
16
|
Fan Y, Chen M, Zhu Q, Wang W. Inferring Disease-Associated Microbes Based on Multi-Data Integration and Network Consistency Projection. Front Bioeng Biotechnol 2020; 8:831. [PMID: 32850711 PMCID: PMC7418576 DOI: 10.3389/fbioe.2020.00831] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2020] [Accepted: 06/29/2020] [Indexed: 12/18/2022] Open
Abstract
Plenty of microbes in our human body play a vital role in the process of cell physiology. In recent years, there is accumulating evidence indicating that microbes are closely related to many complex human diseases. In-depth investigation of disease-associated microbes can contribute to understanding the pathogenesis of diseases and thus provide novel strategies for the treatment, diagnosis, and prevention of diseases. To date, many computational models have been proposed for predicting microbe-disease associations using available similarity networks. However, these similarity networks are not effectively fused. In this study, we proposed a novel computational model based on multi-data integration and network consistency projection for Human Microbe-Disease Associations Prediction (HMDA-Pred), which fuses multiple similarity networks by a linear network fusion method. HMDA-Pred yielded AUC values of 0.9589 and 0.9361 ± 0.0037 in the experiments of leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV), respectively. Furthermore, in case studies, 10, 8, and 10 out of the top 10 predicted microbes of asthma, colon cancer, and inflammatory bowel disease were confirmed by the literatures, respectively.
Collapse
Affiliation(s)
- Yongxian Fan
- School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China
| | | | | | | |
Collapse
|
17
|
Long Y, Luo J, Zhang Y, Xia Y. Predicting human microbe-disease associations via graph attention networks with inductive matrix completion. Brief Bioinform 2020; 22:5876591. [PMID: 32725163 DOI: 10.1093/bib/bbaa146] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2020] [Revised: 06/07/2020] [Accepted: 06/11/2020] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION human microbes play a critical role in an extensive range of complex human diseases and become a new target in precision medicine. In silico methods of identifying microbe-disease associations not only can provide a deep insight into understanding the pathogenic mechanism of complex human diseases but also assist pharmacologists to screen candidate targets for drug development. However, the majority of existing approaches are based on linear models or label propagation, which suffers from limitations in capturing nonlinear associations between microbes and diseases. Besides, it is still a great challenge for most previous methods to make predictions for new diseases (or new microbes) with few or without any observed associations. RESULTS in this work, we construct features for microbes and diseases by fully exploiting multiply sources of biomedical data, and then propose a novel deep learning framework of graph attention networks with inductive matrix completion for human microbe-disease association prediction, named GATMDA. To our knowledge, this is the first attempt to leverage graph attention networks for this important task. In particular, we develop an optimized graph attention network with talking-heads to learn representations for nodes (i.e. microbes and diseases). To focus on more important neighbours and filter out noises, we further design a bi-interaction aggregator to enforce representation aggregation of similar neighbours. In addition, we combine inductive matrix completion to reconstruct microbe-disease associations to capture the complicated associations between diseases and microbes. Comprehensive experiments on two data sets (i.e. HMDAD and Disbiome) demonstrated that our proposed model consistently outperformed baseline methods. Case studies on two diseases, i.e. asthma and inflammatory bowel disease, further confirmed the effectiveness of our proposed model of GATMDA. AVAILABILITY python codes and data set are available at: https://github.com/yahuilong/GATMDA. CONTACT luojiawei@hnu.edu.cn.
Collapse
Affiliation(s)
- Yahui Long
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China.,School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China
| | - Yu Zhang
- School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
| | - Yan Xia
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China
| |
Collapse
|
18
|
Ma Y, Liu G, Ma Y, Chen Q. Integrative Analysis for Identifying Co-Modules of Microbe-Disease Data by Matrix Tri-Factorization With Phylogenetic Information. Front Genet 2020; 11:83. [PMID: 32153643 PMCID: PMC7048008 DOI: 10.3389/fgene.2020.00083] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2019] [Accepted: 01/24/2020] [Indexed: 12/29/2022] Open
Abstract
Microbe-disease association relationship mining is drawing more and more attention due to its potential in capturing disease-related microbes. Hence, it is essential to develop new tools or algorithms to study the complex pathogenic mechanism of microbe-related diseases. However, previous research studies mainly focused on the paradigm of “one disease, one microbe,” rarely investigated the cooperation and associations between microbes, diseases or microbe-disease co-modules from system level. In this study, we propose a novel two-level module identifying algorithm (MDNMF) based on nonnegative matrix tri-factorization which integrates two similarity matrices (disease and microbe similarity matrices) and one microbe-disease association matrix into the objective of MDNMF. MDNMF can identify the modules from different levels and reveal the connections between these modules. In order to improve the efficiency and effectiveness of MDNMF, we also introduce human symptoms-disease network and microbial phylogenetic distance into this model. Furthermore, we applied it to HMDAD dataset and compared it with two NMF-based methods to demonstrate its effectiveness. The experimental results show that MDNMF can obtain better performance in terms of enrichment index (EI) and the number of significantly enriched taxon sets. This demonstrates the potential of MDNMF in capturing microbial modules that have significantly biological function implications.
Collapse
Affiliation(s)
- Yuanyuan Ma
- School of Computer and Information Engineering, Anyang Normal University, Anyang, China
| | - Guoying Liu
- School of Computer and Information Engineering, Anyang Normal University, Anyang, China
| | - Yingjun Ma
- School of Computer, Central China Normal University, Wuhan, China
| | - Qianjun Chen
- School of Computer, Central China Normal University, Wuhan, China.,School of Life Science, Hubei University, Wuhan, China
| |
Collapse
|
19
|
Li S, Xie M, Liu X. A Novel Approach Based on Bipartite Network Recommendation and KATZ Model to Predict Potential Micro-Disease Associations. Front Genet 2019; 10:1147. [PMID: 31803235 PMCID: PMC6873782 DOI: 10.3389/fgene.2019.01147] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2019] [Accepted: 10/21/2019] [Indexed: 12/24/2022] Open
Abstract
Accumulating evidence indicates that the microbes colonizing human bodies have crucial effects on human health and the discovery of disease-related microbes will promote the discovery of biomarkers and drugs for the prevention, diagnosis, treatment, and prognosis of diseases. However clinical experiments of disease-microbe associations are time-consuming, laborious and expensive, and there are few methods for predicting potential microbe-disease association. Therefore, developing effective computational models utilizing the accumulated public data of clinically validated microbe-disease associations to identify novel disease-microbe associations is of practical importance. We propose a novel method based on the KATZ model and Bipartite Network Recommendation Algorithm (KATZBNRA) to discover potential associations between microbes and diseases. We calculate the Gaussian interaction profile kernel similarity of diseases and microbes based on validated disease-microbe associations. Then, we construct a bipartite graph and execute a bipartite network recommendation algorithm. Finally, we integrate the disease similarity, microbe similarity and bipartite network recommendation score to obtain the final score, which is used to infer whether there are some novel disease-microbe interactions. To evaluate the predictive power of KATZBNRA, we tested it with the walk length 2 using global leave-one-out cross validation (LOOV), two-fold and five-fold cross validations, with AUCs of 0.9098, 0.8463 and 0.8969, respectively. The test results also show that KATZBNRA is more accurate than two recent similar methods KATZHMDA and BNPMDA.
Collapse
Affiliation(s)
- Shiru Li
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Minzhu Xie
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Xinqiu Liu
- Hunan Vocational College of Engineering, Changsha, China
| |
Collapse
|
20
|
Long Y, Luo J. WMGHMDA: a novel weighted meta-graph-based model for predicting human microbe-disease association on heterogeneous information network. BMC Bioinformatics 2019; 20:541. [PMID: 31675979 PMCID: PMC6824056 DOI: 10.1186/s12859-019-3066-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2019] [Accepted: 09/02/2019] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND An increasing number of biological and clinical evidences have indicated that the microorganisms significantly get involved in the pathological mechanism of extensive varieties of complex human diseases. Inferring potential related microbes for diseases can not only promote disease prevention, diagnosis and treatment, but also provide valuable information for drug development. Considering that experimental methods are expensive and time-consuming, developing computational methods is an alternative choice. However, most of existing methods are biased towards well-characterized diseases and microbes. Furthermore, existing computational methods are limited in predicting potential microbes for new diseases. RESULTS Here, we developed a novel computational model to predict potential human microbe-disease associations (MDAs) based on Weighted Meta-Graph (WMGHMDA). We first constructed a heterogeneous information network (HIN) by combining the integrated microbe similarity network, the integrated disease similarity network and the known microbe-disease bipartite network. And then, we implemented iteratively pre-designed Weighted Meta-Graph search algorithm on the HIN to uncover possible microbe-disease pairs by cumulating the contribution values of weighted meta-graphs to the pairs as their probability scores. Depending on contribution potential, we described the contribution degree of different types of meta-graphs to a microbe-disease pair with bias rating. Meta-graph with higher bias rating will be assigned greater weight value when calculating probability scores. CONCLUSIONS The experimental results showed that WMGHMDA outperformed some state-of-the-art methods with average AUCs of 0.9288, 0.9068 ±0.0031 in global leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV), respectively. In the case studies, 9, 19, 37 and 10, 20, 45 out of top-10, 20, 50 candidate microbes were manually verified by previous reports for asthma and inflammatory bowel disease (IBD), respectively. Furthermore, three common human diseases (Crohn's disease, Liver cirrhosis, Type 1 diabetes) were adopted to demonstrate that WMGHMDA could be efficiently applied to make predictions for new diseases. In summary, WMGHMDA has a high potential in predicting microbe-disease associations.
Collapse
Affiliation(s)
- Yahui Long
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China.
| |
Collapse
|
21
|
Xie G, Huang Z, Liu Z, Lin Z, Ma L. NCPHLDA: a novel method for human lncRNA–disease association prediction based on network consistency projection. Mol Omics 2019; 15:442-450. [DOI: 10.1039/c9mo00092e] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
Abstract
In recent years, an increasing number of biological experiments and clinical reports have shown that lncRNA is closely related to the development of various complex human diseases.
Collapse
Affiliation(s)
- Guobo Xie
- School of Computer Science
- Guangdong University of Technology
- Guangzhou
- China
| | - Zecheng Huang
- School of Computer Science
- Guangdong University of Technology
- Guangzhou
- China
| | - Zhenguo Liu
- Department of Thoracic Surgery
- The First Affiliated Hospital of Sun Yat-sen University
- Guangzhou
- China
| | - Zhiyi Lin
- School of Computer Science
- Guangdong University of Technology
- Guangzhou
- China
| | - Lei Ma
- Institute of Automation
- Chinese Academy of Sciences
- Beijing
- China
| |
Collapse
|