1
Lu Y, Hui F, Zhou G, Xia J. MicrobiomeNet: exploring microbial associations and metabolic profiles for mechanistic insights. Nucleic Acids Res 2024:gkae944. [PMID: 39441071 DOI: 10.1093/nar/gkae944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Revised: 09/30/2024] [Accepted: 10/08/2024] [Indexed: 10/25/2024] Open
Abstract
The growing volumes of microbiome studies over the past decade have revealed a wide repertoire of microbial associations under diverse conditions. Microbes produce small molecules to interact with each other as well as to modulate their environments. Their metabolic profiles hold the key to understanding these association patterns for translational applications. Based on this concept, we developed MicrobiomeNet, a comprehensive database that integrates microbial associations with their metabolic profiles for mechanistic insights. It currently contains a total of ∼5.8 million known microbial associations, coupled with >12 400 genome-scale metabolic models (GEMs) covering ∼6000 microbial species. Users can intuitively explore microbial associations and compare their corresponding metabolic profiles. Our case studies show that MicrobiomeNet can provide mechanistic insights that are consistent with the literature. MicrobiomeNet is freely available at https://www.microbiomenet.com/.
Affiliation(s)
- Yao Lu
  - Institute of Parasitology, McGill University, Quebec, Canada
  - Department of Microbiology and Immunology, McGill University, Quebec, Canada
- Fiona Hui
  - Institute of Parasitology, McGill University, Quebec, Canada
- Guangyan Zhou
  - Institute of Parasitology, McGill University, Quebec, Canada
- Jianguo Xia
  - Institute of Parasitology, McGill University, Quebec, Canada
  - Department of Microbiology and Immunology, McGill University, Quebec, Canada
2
Yang M, Wang Z, Yan Z, Wang W, Zhu Q, Jin C. DNASimCLR: a contrastive learning-based deep learning approach for gene sequence data classification. BMC Bioinformatics 2024; 25:328. [PMID: 39402441 PMCID: PMC11476100 DOI: 10.1186/s12859-024-05955-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 10/09/2024] [Indexed: 10/19/2024] Open
Abstract
BACKGROUND The rapid advancements in deep neural network models have significantly enhanced the ability to extract features from microbial sequence data, which is critical for addressing biological challenges. However, the scarcity and complexity of labeled microbial data pose substantial difficulties for supervised learning approaches. To address these issues, we propose DNASimCLR, an unsupervised framework designed for efficient gene sequence data feature extraction. RESULTS DNASimCLR leverages convolutional neural networks and the SimCLR framework, based on contrastive learning, to extract intricate features from diverse microbial gene sequences. Pre-training was conducted on two classic large-scale unlabelled datasets encompassing metagenomes and viral gene sequences. Subsequent classification tasks were performed by fine-tuning the pre-trained model. Our experiments demonstrate that DNASimCLR is at least comparable to state-of-the-art techniques for gene sequence classification. Among convolutional neural network-based approaches, DNASimCLR surpasses the latest existing feature extraction methods. Furthermore, the model exhibits superior performance across diverse tasks in analyzing biological sequence data, showcasing its robust adaptability. CONCLUSIONS DNASimCLR represents a robust and database-agnostic solution for gene sequence classification. Its versatility allows it to perform well in scenarios involving novel or previously unseen gene sequences, making it a valuable tool for diverse applications in genomics.
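The contrastive objective behind SimCLR-style pre-training can be illustrated with a small sketch. The abstract does not give the loss or its hyperparameters, so the following assumes the standard NT-Xent loss from the SimCLR framework; the function names, the temperature value, and the toy embeddings are hypothetical and are not taken from DNASimCLR.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Mean NT-Xent loss over a batch.

    view_a[i] and view_b[i] are embeddings of two augmentations of the
    same sequence i (assumed setup; the temperature is an illustrative value).
    """
    z = view_a + view_b          # 2N embeddings in one list
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)    # index of the positive partner of i
        positive = cosine(z[i], z[j]) / temperature
        # Denominator runs over every other embedding, positives included.
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        total += math.log(sum(math.exp(x) for x in logits)) - positive
    return total / (2 * n)


# Toy embeddings: matched pairs should give a lower loss than mismatched ones.
a = [[1.0, 0.0], [0.0, 1.0]]
b_matched = [[1.0, 0.1], [0.1, 1.0]]
b_mismatched = [[0.1, 1.0], [1.0, 0.1]]
print(nt_xent_loss(a, b_matched), nt_xent_loss(a, b_mismatched))
```

The loss falls when the two views of the same sequence are embedded close together and other sequences are pushed apart, which is what lets pre-training proceed without labels.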
Affiliation(s)
- Minghao Yang
  - Shandong University, Weihai, People's Republic of China
  - Beijing Research Institute of Automation for Machinery Industry, Beijing, People's Republic of China
- Zehua Wang
  - Beijing Research Institute of Automation for Machinery Industry, Beijing, People's Republic of China
- Zizhuo Yan
  - Beijing Research Institute of Automation for Machinery Industry, Beijing, People's Republic of China
- Wenxiang Wang
  - Beijing Research Institute of Automation for Machinery Industry, Beijing, People's Republic of China
- Qian Zhu
  - Shandong University, Weihai, People's Republic of China
- Changlong Jin
  - Shandong University, Weihai, People's Republic of China
3
Wang S, Liu JX, Li F, Wang J, Gao YL. M3HOGAT: A Multi-View Multi-Modal Multi-Scale High-Order Graph Attention Network for Microbe-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:6259-6267. [PMID: 39012741 DOI: 10.1109/jbhi.2024.3429128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2024]
Abstract
Numerous scientific studies have found a link between diverse microorganisms in the human body and complex human diseases. Because traditional experimental approaches are time-consuming and expensive, using computational methods to identify microbes correlated with diseases is critical. In this paper, a new microbe-disease association prediction model, called M3HOGAT, is proposed that combines a multi-view multi-modal network and a multi-scale feature fusion mechanism. First, a microbe-disease association network and multiple similarity views are constructed based on multi-source information. Then, because neighbor information from different orders may yield better node representations, a higher-order graph attention network (HOGAT) is devised to aggregate neighbor information from different orders and extract microbe and disease features from the different networks and views. Given that the embedding features of microbes and diseases from different views vary in importance, a multi-scale feature fusion mechanism is employed to learn their interaction information, thereby generating the final features of microbes and diseases. Finally, an inner product decoder is used to reconstruct the microbe-disease association matrix. Compared with five state-of-the-art methods on the HMDAD and Disbiome datasets, the results of 5-fold cross-validation show that M3HOGAT achieves the best performance. Furthermore, case studies on asthma and obesity confirm the effectiveness of M3HOGAT in identifying potential disease-related microbes.
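The final decoding step mentioned in the abstract can be sketched as follows. This is a minimal illustration of an inner product decoder, assuming a sigmoid is applied to each dot product to obtain a score between 0 and 1 (a common choice, not confirmed by the abstract); the function names and embedding values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def inner_product_decoder(microbe_emb, disease_emb):
    """Return a microbes x diseases matrix of association scores.

    Each score is sigmoid(<microbe vector, disease vector>); the sigmoid
    is an assumption made for this sketch.
    """
    return [
        [sigmoid(sum(m * d for m, d in zip(m_vec, d_vec)))
         for d_vec in disease_emb]
        for m_vec in microbe_emb
    ]


# Hypothetical 2-dimensional embeddings for 2 microbes and 3 diseases.
microbes = [[1.0, 0.0], [0.0, 2.0]]
diseases = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = inner_product_decoder(microbes, diseases)
for row in scores:
    print([round(s, 3) for s in row])
```

Pairs whose embeddings point in similar directions receive scores above 0.5, and orthogonal pairs receive exactly 0.5 under this convention.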
4
Chen H, Chen K. Predicting disease-associated microbes based on similarity fusion and deep learning. Brief Bioinform 2024; 25:bbae550. [PMID: 39504483 PMCID: PMC11540060 DOI: 10.1093/bib/bbae550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 09/15/2024] [Accepted: 10/14/2024] [Indexed: 11/08/2024] Open
Abstract
A growing number of studies have revealed the critical roles of the human microbiome in a wide variety of disorders. Identification of disease-associated microbes might improve our knowledge and understanding of disease pathogenesis and treatment. Computational prediction of microbe-disease associations would provide helpful guidance for further biomedical screening and has received considerable research interest in bioinformatics. In this study, a deep learning-based computational approach entitled SGJMDA is presented for predicting microbe-disease associations. Specifically, SGJMDA first fuses multiple similarities of microbes and diseases using a nonlinear strategy, and extracts feature information from homogeneous networks composed of the fused similarities via a graph convolution network. Second, a heterogeneous microbe-disease network is built to further capture the structural information of microbes and diseases by employing a multi-neighborhood graph convolution network and a jumping knowledge network. Finally, potential microbe-disease associations are inferred by computing the linear correlation coefficients of their embeddings. Results from cross-validation experiments show that SGJMDA outperforms 6 state-of-the-art computational methods. Furthermore, we carry out case studies on three important diseases using SGJMDA, in which 19, 20, and 11 of the top 20 predictions, respectively, are confirmed by the latest databases. The excellent performance of SGJMDA suggests that it could be a valuable and promising tool for inferring disease-associated microbes.
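The last step, scoring a pair by the linear correlation of its embeddings, can be sketched in a few lines. The sketch assumes that "linear correlation coefficient" means the Pearson coefficient; the function name and example vectors are hypothetical and are not taken from SGJMDA.

```python
import math


def pearson(u, v):
    """Pearson correlation between two equal-length embedding vectors.

    Returns 0.0 when either vector is constant, since the coefficient
    is undefined in that case (a convention chosen for this sketch).
    """
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    std_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if std_u == 0.0 or std_v == 0.0:
        return 0.0
    return cov / (std_u * std_v)


# Hypothetical embeddings: one microbe scored against two diseases.
microbe = [0.2, 0.5, 0.9]
disease_similar = [0.1, 0.4, 0.8]
disease_opposite = [0.9, 0.5, 0.1]
print(pearson(microbe, disease_similar), pearson(microbe, disease_opposite))
```

Ranking all unobserved microbe-disease pairs by this score would then give the candidate list that case studies inspect.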
Affiliation(s)
- Hailin Chen
  - School of Information and Software Engineering, East China Jiaotong University, Nanchang 330013, China
- Kuan Chen
  - School of Information and Software Engineering, East China Jiaotong University, Nanchang 330013, China
5
Shi K, Huang K, Li L, Liu Q, Zhang Y, Zheng H. Predicting microbe-disease association based on graph autoencoder and inductive matrix completion with multi-similarities fusion. Front Microbiol 2024; 15:1438942. [PMID: 39355422 PMCID: PMC11443509 DOI: 10.3389/fmicb.2024.1438942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Accepted: 08/02/2024] [Indexed: 10/03/2024] Open
Abstract
Background Clinical studies have demonstrated that microbes play a crucial role in human health and disease. The identification of microbe-disease interactions can provide insights into pathogenesis and promote the diagnosis, treatment, and prevention of disease. Although a large number of computational methods have been designed to screen novel microbe-disease associations, accurate and efficient methods are still lacking due to data inconsistency, underutilization of prior information, and limited model performance. Methods In this study, we proposed an improved deep learning-based framework, named GIMMDA, to identify latent microbe-disease associations, which is based on a graph autoencoder and inductive matrix completion. By co-training the information from the microbe and disease spaces, the new representations of microbes and diseases are used to reconstruct microbe-disease associations in an end-to-end framework. In particular, a similarity fusion strategy is applied to improve prediction performance. Results The experimental results show that the performance of GIMMDA is competitive with that of existing state-of-the-art methods on 3 datasets (i.e., HMDAD, Disbiome, and multiMDA). In particular, it performs best with areas under the receiver operating characteristic curve (AUC) of 0.9735, 0.9156, and 0.9396 on these 3 datasets, respectively. The results also confirm that different similarity fusions can improve the prediction performance. Furthermore, case studies on two diseases, i.e., asthma and obesity, validate the effectiveness and reliability of our proposed model. Conclusion The proposed GIMMDA model shows a strong capability in predicting microbe-disease associations. We expect that GIMMDA will help identify potential microbe-related diseases in the future.
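Since the results are reported as AUC values, a short sketch of how such a number can be computed may help. It uses the rank-based definition of ROC AUC (the probability that a randomly chosen known association is scored above a randomly chosen non-association, with ties counted as one half). The function name, labels, and scores are made up for illustration and are unrelated to the reported values.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a known association, 0 otherwise.
    scores: predicted association scores, same order as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Illustrative labels and scores only.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(roc_auc(labels, scores))
```

The quadratic pairwise loop is fine for small examples; sorting-based implementations are preferable for full association matrices.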
Affiliation(s)
- Kai Shi
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
  - Guangxi Key Laboratory of Embedded Technology and Intelligent Systems, Guilin University of Technology, Guilin, China
- Kai Huang
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
- Lin Li
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
- Qiaohui Liu
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
- Yi Zhang
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
- Huilin Zheng
  - College of Computer Science and Engineering, Guilin University of Technology, Guilin, China
6
Chen Z, Zhang L, Li J, Chen H. Microbe-disease associations prediction by graph regularized non-negative matrix factorization with $L_{2,1}$ norm regularization terms. J Cell Mol Med 2024; 28:e18553. [PMID: 39239860 PMCID: PMC11377990 DOI: 10.1111/jcmm.18553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 06/19/2024] [Accepted: 07/09/2024] [Indexed: 09/07/2024] Open
Abstract
Microbes are involved in a wide range of biological processes and are closely associated with disease. Inferring potential disease-associated microbes as the biomarkers or drug targets may help prevent, diagnose and treat complex human diseases. However, biological experiments are time-consuming and expensive. In this study, we introduced a new method called iPALM-GLMF, which modelled microbe-disease association prediction as a problem of non-negative matrix factorization with graph dual regularization terms and $L_{2,1}$ norm regularization terms. The graph dual regularization terms were used to capture potential features in the microbe and disease space, and the $L_{2,1}$ norm regularization terms were used to ensure the sparsity of the feature matrices obtained from the non-negative matrix factorization and to improve the interpretability. To solve the model, iPALM-GLMF used a non-negative double singular value decomposition to initialize the matrix factorization and adopted an inertial Proximal Alternating Linear Minimization iterative process to obtain the final matrix factorization results. As a result, iPALM-GLMF performed better than other existing methods in leave-one-out cross-validation and fivefold cross-validation. In addition, case studies of different diseases demonstrated that iPALM-GLMF could effectively predict potential microbial-disease associations. iPALM-GLMF is publicly available at https://github.com/LiangzheZhang/iPALM-GLMF.
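To make the role of the $L_{2,1}$ norm concrete, the sketch below computes it and applies its proximal operator, which shrinks whole rows toward zero. It assumes the common row-wise convention (sum of the Euclidean norms of the rows); some papers define the norm column-wise. The function names are hypothetical, and the proximal step is a generic illustration of why this penalty produces sparsity, not necessarily the update used inside iPALM-GLMF.

```python
import math


def l21_norm(matrix):
    """Row-wise L2,1 norm: the sum of the Euclidean norms of the rows."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


def prox_l21(matrix, lam):
    """Proximal operator of lam * L2,1 (row-wise group soft-thresholding).

    Rows with Euclidean norm at most lam are set to zero; longer rows
    are shrunk toward zero by the factor (1 - lam / norm).
    """
    result = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        result.append([scale * x for x in row])
    return result


# Illustrative feature matrix with one strong row and one weak row.
features = [[3.0, 4.0], [0.3, 0.4]]
print(l21_norm(features))
print(prox_l21(features, 1.0))
```

Because entire rows are zeroed at once, the penalty selects a subset of latent features rather than scattering zeros across the matrix, which is the interpretability benefit the abstract alludes to.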
Affiliation(s)
- Ziwei Chen
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Liangzhe Zhang
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Jingyi Li
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Hang Chen
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
7
Zhu H, Hao H, Yu L. Identification of microbe-disease signed associations via multi-scale variational graph autoencoder based on signed message propagation. BMC Biol 2024; 22:172. [PMID: 39148051 PMCID: PMC11328394 DOI: 10.1186/s12915-024-01968-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Accepted: 08/01/2024] [Indexed: 08/17/2024] Open
Abstract
BACKGROUND A large body of clinical and biomedical research has highlighted the significance of the human microbiome for human health. Identifying microbes associated with diseases is crucial for early disease diagnosis and advancing precision medicine. RESULTS Considering that information about changes in microbial quantities under fine-grained disease states helps to build a comprehensive understanding of the overall data distribution, this study introduces MSignVGAE, a framework for predicting signed microbe-disease associations using signed message propagation. MSignVGAE employs a graph variational autoencoder to model noisy signed association data and extends the multi-scale concept to enhance representation capabilities. A novel strategy for propagating signed messages in signed networks addresses heterogeneity and consistency among nodes connected by signed edges. Additionally, we utilize the idea of a denoising autoencoder to handle the noise in similarity feature information, which helps overcome biases in the fused similarity data. MSignVGAE represents microbe-disease associations as a heterogeneous graph using similarity information as node features. The multi-class classifier XGBoost is utilized to predict signed associations between diseases and microbes. CONCLUSIONS MSignVGAE achieves AUROC and AUPR values of 0.9742 and 0.9601, respectively. Case studies on three diseases demonstrate that MSignVGAE can effectively capture a comprehensive distribution of associations by leveraging signed information.
Affiliation(s)
- Huan Zhu
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Hongxia Hao
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi'an, China
8
Jacob T, Sindhu S, Hasan A, Malik MZ, Arefanian H, Al-Rashed F, Nizam R, Kochumon S, Thomas R, Bahman F, Shenouda S, Wilson A, Akther N, Al-Roub A, Abukhalaf N, Albeloushi S, Abu-Farha M, Al Madhoun A, Alzaid F, Thanaraj TA, Koistinen HA, Tuomilehto J, Al-Mulla F, Ahmad R. Soybean oil-based HFD induces gut dysbiosis that leads to steatosis, hepatic inflammation and insulin resistance in mice. Front Microbiol 2024; 15:1407258. [PMID: 39165573 PMCID: PMC11334085 DOI: 10.3389/fmicb.2024.1407258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 07/09/2024] [Indexed: 08/22/2024] Open
Abstract
High-fat diets (HFDs) shape the gut microbiome and promote obesity, inflammation, and liver steatosis. Fish and soybean are part of a healthy diet; however, the impact of these fats, in the absence of sucrose, on gut microbial dysbiosis and its association with liver steatosis remains unclear. Here, we investigated the effect of sucrose-free soybean oil- and fish oil-based HFDs (SF-Soy-HFD and SF-Fish-HFD, respectively) on gut dysbiosis, obesity, steatosis, hepatic inflammation, and insulin resistance. C57BL/6 mice were fed these HFDs for 24 weeks. Both diets had comparable effects on liver and total body weights. However, 16S rRNA sequencing of the gut content revealed induction of gut dysbiosis at different taxonomic levels. The microbial communities were clearly separated, showing differential dysbiosis between the two HFDs. Compared with the SF-Fish-HFD control group, the SF-Soy-HFD group had an increased abundance of Bacteroidetes, Firmicutes, and Deferribacteres, but a lower abundance of Verrucomicrobia. The Clostridia/Bacteroidia (C/B) ratio was higher in the SF-Soy-HFD group (3.11) than in the SF-Fish-HFD group (2.5). Conversely, the Verrucomicrobiaceae/S24_7 ratio (S24_7 is also known as the Muribaculaceae family) was lower in the SF-Soy-HFD group (0.02) than in the SF-Fish-HFD group (0.75). The SF-Soy-HFD group had a positive association with S24_7, Clostridiales, Allobaculum, Coriobacteriaceae, Adlercreutzia, Christensenellaceae, Lactococcus, and Oscillospira, but was related to a lower abundance of Akkermansia, which maintains gut barrier integrity. The gut microbiota in the SF-Soy-HFD group had predicted associations with host genes related to fatty liver and inflammatory pathways. Mice fed the SF-Soy-HFD developed liver steatosis and showed increased transcript levels of genes associated with de novo lipogenesis (Acaca, Fasn, Scd1, Elovl6) and cholesterol synthesis (Hmgcr) pathways compared to those in the SF-Fish-HFD group. No differences were observed in the expression of fat uptake genes (Cd36 and Fabp1). The expression of the fat efflux gene (Mttp) was reduced in the SF-Soy-HFD group. Moreover, hepatic inflammation markers (Tnfa and Il1b) were notably expressed in SF-Soy-HFD-fed mice. In conclusion, SF-Soy-HFD feeding induced gut dysbiosis in mice, leading to steatosis, hepatic inflammation, and impaired glucose homeostasis.
Affiliation(s)
- Texy Jacob
  - Dasman Diabetes Institute, Dasman, Kuwait
- Amal Hasan
  - Dasman Diabetes Institute, Dasman, Kuwait
- Fawaz Alzaid
  - Dasman Diabetes Institute, Dasman, Kuwait
  - INSERM UMR-S1151, CNRS UMR-S8253, Institut Necker Enfants Malades, Université Paris Cité, Paris, France
- Heikki A Koistinen
  - Department of Medicine, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
  - Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  - Minerva Foundation Institute for Medical Research, Helsinki, Finland
- Jaakko Tuomilehto
  - Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  - Department of Public Health, University of Helsinki, Helsinki, Finland
9
Chen J, Zhu Y, Yuan Q. Predicting potential microbe-disease associations based on dual branch graph convolutional network. J Cell Mol Med 2024; 28:e18571. [PMID: 39086148 PMCID: PMC11291560 DOI: 10.1111/jcmm.18571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 06/15/2024] [Accepted: 06/27/2024] [Indexed: 08/02/2024] Open
Abstract
Studying the association between microbes and diseases not only aids in the prevention and diagnosis of diseases, but also provides crucial theoretical support for new drug development and personalized treatment. Due to the time-consuming and costly nature of laboratory-based biological tests to confirm the relationship between microbes and diseases, there is an urgent need for innovative computational frameworks to anticipate new associations between microbes and diseases. Here, we propose a novel computational approach based on a dual branch graph convolutional network (GCN) module, abbreviated as DBGCNMDA, for identifying microbe-disease associations. First, DBGCNMDA calculates the similarity matrix of diseases and microbes by integrating functional similarity and Gaussian association spectrum kernel (GAPK) similarity. Then, semantic information from different biological networks is extracted by two GCN modules from different perspectives. Finally, the scores of microbe-disease associations are predicted based on the extracted features. The main innovation of this method lies in the use of two types of information for microbe/disease similarity assessment. Additionally, we extend the disease nodes to address the issue of insufficient features due to low data dimensionality. We optimize the connectivity between the homogeneous entities using random walk with restart (RWR), and then use the optimized similarity matrix as the initial feature matrix. In terms of network understanding, we design a dual branch GCN module, namely GlobalGCN and LocalGCN, to fine-tune node representations by introducing side information, including homologous neighbour nodes. We evaluate the accuracy of the DBGCNMDA model using five-fold cross-validation (5-fold-CV) technique. 
The results show that the area under the receiver operating characteristic curve (AUC) and area under the precision versus recall curve (AUPR) of the DBGCNMDA model in the 5-fold-CV are 0.9559 and 0.9630, respectively. The results from the case studies using published experimental data confirm a significant number of predicted associations, indicating that DBGCNMDA is an effective tool for predicting potential microbe-disease associations.
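The random walk with restart (RWR) step mentioned in the abstract can be sketched as an iterative update on a similarity network. The abstract gives no parameters, so the restart probability, tolerance, column normalisation, and the toy adjacency matrix below are all assumptions made for illustration, and the function name is hypothetical.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at `seed`.

    adjacency: square list-of-lists of non-negative edge weights.
    restart:   probability of jumping back to the seed at each step
               (0.3 is an illustrative value, not taken from the paper).
    """
    n = len(adjacency)
    # Column-normalise so that transition[i][j] is the probability
    # of moving from node j to node i.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    start = [0.0] * n
    start[seed] = 1.0
    p = start[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2, walking from node 0.
graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print([round(x, 4) for x in random_walk_with_restart(graph, seed=0)])
```

Running the walk from every node in turn yields a dense, smoothed similarity matrix, which is one plausible way such output could serve as an initial feature matrix.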
Affiliation(s)
- Jing Chen
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Yongjun Zhu
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Qun Yuan
  - Department of Respiratory Medicine, The Affiliated Suzhou Hospital of Nanjing University Medical School, Suzhou, China
10
Zhang C, Zhang Z, Zhang F, Zeng B, Liu X, Wang L. A computational model for potential microbe-disease association detection based on improved graph convolutional networks and multi-channel autoencoders. Front Microbiol 2024; 15:1435408. [PMID: 39144226 PMCID: PMC11322764 DOI: 10.3389/fmicb.2024.1435408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2024] [Accepted: 07/05/2024] [Indexed: 08/16/2024] Open
Abstract
Introduction Accumulating evidence shows that human health and disease are closely related to the microbes in the human body. Methods In this manuscript, a new computational model based on graph attention networks and sparse autoencoders, called GCANCAE, was proposed for inferring possible microbe-disease associations. In GCANCAE, we first constructed a heterogeneous network by combining known microbe-disease relationships, disease similarity, and microbial similarity. Then, we adopted the improved GCN and the CSAE to extract neighbor relations in the adjacency matrix and novel feature representations in heterogeneous networks. After that, to estimate the likelihood that a microbe is associated with a disease, we integrated these two types of representations to create unique eigenmatrices for diseases and microbes, respectively, and obtained predicted scores for potential microbe-disease associations by calculating the inner product of these two types of eigenmatrices. Results and discussion Using baseline databases such as HMDAD and Disbiome, intensive experiments were conducted to evaluate the prediction ability of GCANCAE, and the experimental results demonstrated that GCANCAE achieved better performance than state-of-the-art competitive methods under both 2-fold and 5-fold CV. Furthermore, case studies of three common diseases, namely asthma, irritable bowel syndrome (IBS), and type 2 diabetes (T2D), confirmed the efficiency of GCANCAE.
Affiliation(s)
- Zhen Zhang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- Xin Liu
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- Lei Wang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
11
Kochumon S, Malik MZ, Sindhu S, Arefanian H, Jacob T, Bahman F, Nizam R, Hasan A, Thomas R, Al-Rashed F, Shenouda S, Wilson A, Albeloushi S, Almansour N, Alhamar G, Al Madhoun A, Alzaid F, Thanaraj TA, Koistinen HA, Tuomilehto J, Al-Mulla F, Ahmad R. Gut Dysbiosis Shaped by Cocoa Butter-Based Sucrose-Free HFD Leads to Steatohepatitis, and Insulin Resistance in Mice. Nutrients 2024; 16:1929. [PMID: 38931284 PMCID: PMC11207001 DOI: 10.3390/nu16121929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/05/2024] [Accepted: 06/07/2024] [Indexed: 06/28/2024] Open
Abstract
BACKGROUND High-fat diets cause gut dysbiosis and promote triglyceride accumulation, obesity, gut permeability changes, inflammation, and insulin resistance. Both cocoa butter and fish oil are considered to be a part of healthy diets. However, their differential effects on gut microbiome perturbations in mice fed high concentrations of these fats, in the absence of sucrose, remain to be elucidated. The aim of the study was to test whether sucrose-free cocoa butter-based high-fat diet (C-HFD) feeding in mice leads to gut dysbiosis that associates with a pathologic phenotype marked by hepatic steatosis, low-grade inflammation, perturbed glucose homeostasis, and insulin resistance, compared with control mice fed the fish oil-based high-fat diet (F-HFD). RESULTS C57BL/6 mice (5-6 mice/group) were fed two types of high-fat diets (C-HFD and F-HFD) for 24 weeks. No significant difference was found in the liver weight or total body weight between the two groups. The 16S rRNA sequencing of gut bacterial samples displayed gut dysbiosis in the C-HFD group, with differentially altered microbial diversity or relative abundances. Bacteroidetes, Firmicutes, and Proteobacteria were highly abundant in the C-HFD group, while Verrucomicrobia, Saccharibacteria (TM7), Actinobacteria, and Tenericutes were more abundant in the F-HFD group. Other taxa in the C-HFD group included Bacteroides, Odoribacter, Sutterella, Firmicutes bacterium (AF12), Anaeroplasma, Roseburia, and Parabacteroides distasonis. An increased Firmicutes/Bacteroidetes (F/B) ratio in the C-HFD group, compared with the F-HFD group, indicated gut dysbiosis. These gut bacterial changes in the C-HFD group had predicted associations with fatty liver disease and with lipogenic, inflammatory, glucose metabolic, and insulin signaling pathways. Consistent with its microbiome shift, the C-HFD group showed hepatic inflammation and steatosis, high fasting blood glucose, insulin resistance, increased hepatic de novo lipogenesis (Acetyl-CoA carboxylase 1 (Acaca), Fatty acid synthase (Fasn), Stearoyl-CoA desaturase-1 (Scd1), Elongation of long-chain fatty acids family member 6 (Elovl6), and Peroxisome proliferator-activated receptor-gamma (Pparg)), and cholesterol synthesis (β-hydroxy β-methylglutaryl-CoA reductase (Hmgcr)). Non-significant differences were observed regarding fatty acid uptake (Cluster of differentiation 36 (CD36) and Fatty acid binding protein-1 (Fabp1)) and efflux (ATP-binding cassette G1 (Abcg1) and Microsomal TG transfer protein (Mttp)) in the C-HFD group, compared with the F-HFD group. The C-HFD group also displayed increased gene expression of inflammatory markers including Tumor necrosis factor alpha (Tnfa), C-C motif chemokine ligand 2 (Ccl2), and Interleukin-12 (Il12), as well as a tendency for liver fibrosis. CONCLUSION These findings suggest that sucrose-free C-HFD feeding in mice induces gut dysbiosis which associates with liver inflammation, steatosis, glucose intolerance and insulin resistance.
Collapse
Affiliation(s)
- Shihab Kochumon
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Md. Zubbair Malik
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Sardar Sindhu
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Hossein Arefanian
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Texy Jacob
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Fatemah Bahman
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Rasheeba Nizam
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Amal Hasan
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Reeby Thomas
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Fatema Al-Rashed
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Steve Shenouda
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Ajit Wilson
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Shaima Albeloushi
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Nourah Almansour
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Ghadeer Alhamar
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Ashraf Al Madhoun
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Fawaz Alzaid
- Dasman Diabetes Institute, Dasman 15462, Kuwait
- Université Paris Cité, INSERM UMR-S1151, CNRS UMR-S8253, Institut Necker Enfants Malades, F-75015 Paris, France
| | - Thangavel Alphonse Thanaraj
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Heikki A. Koistinen
- Department of Medicine, University of Helsinki and Helsinki University Hospital, 00029 Helsinki, Finland
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare, P.O. Box 30, 00271 Helsinki, Finland
- Minerva Foundation Institute for Medical Research, 00290 Helsinki, Finland
| | - Jaakko Tuomilehto
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare, P.O. Box 30, 00271 Helsinki, Finland
- Department of Public Health, University of Helsinki, 00014 Helsinki, Finland
| | - Fahd Al-Mulla
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| | - Rasheed Ahmad
- Dasman Diabetes Institute, Dasman 15462, Kuwait
| |
|
12
|
Chen R, Xie G, Lin Z, Gu G, Yu Y, Yu J, Liu Z. Predicting Microbe-Disease Associations Based on a Linear Neighborhood Label Propagation Method with Multi-order Similarity Fusion Learning. Interdiscip Sci 2024; 16:345-360. [PMID: 38436840 DOI: 10.1007/s12539-024-00607-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/04/2024] [Accepted: 01/05/2024] [Indexed: 03/05/2024]
Abstract
Computational approaches employed for predicting potential microbe-disease associations often rely on similarity information between microbes and diseases. Therefore, it is important to obtain reliable similarity information by integrating multiple types of similarity information. However, existing similarity fusion methods do not consider multi-order fusion of similarity networks. To address this problem, a novel method of linear neighborhood label propagation with multi-order similarity fusion learning (MOSFL-LNP) is proposed to predict potential microbe-disease associations. Multi-order fusion learning comprises two parts: low-order global learning and high-order feature learning. Low-order global learning is used to obtain common latent features from multiple similarity sources. High-order feature learning relies on the interactions between neighboring nodes to identify high-order similarities and learn deeper interactive network structures. Coefficients are assigned to different high-order feature learning modules to balance the similarities learned from different orders and enhance the robustness of the fusion network. Overall, by combining low-order global learning with high-order feature learning, multi-order fusion learning can capture both the shared and unique features of different similarity networks, leading to more accurate predictions of microbe-disease associations. In comparison to six other advanced methods, MOSFL-LNP exhibits superior prediction performance in the leave-one-out cross-validation and 5-fold validation frameworks. In the case study, the predicted 10 microbes associated with asthma and type 1 diabetes have an accuracy rate of up to 90% and 100%, respectively.
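As a rough illustration of the label propagation step named in this abstract, the sketch below iterates the generic update F ← αWF + (1 − α)Y on a row-normalized similarity matrix. It is not the MOSFL-LNP implementation: the function name `propagate_labels`, the toy matrices, and the value of `alpha` are assumptions made for the example, and the multi-order similarity fusion that produces W in the paper is not reproduced here.

```python
def propagate_labels(W, Y, alpha=0.5, n_iter=200):
    """Generic label propagation: F <- alpha * W @ F + (1 - alpha) * Y.

    W: row-normalized similarity matrix (list of lists, n x n).
    Y: known association matrix (n x m), 1 = known association.
    alpha: weight given to neighbor information (assumed value).
    """
    n, m = len(Y), len(Y[0])
    F = [row[:] for row in Y]  # start from the known labels
    for _ in range(n_iter):
        # Each new score mixes neighbor scores with the original label.
        F = [
            [
                alpha * sum(W[i][k] * F[k][j] for k in range(n))
                + (1 - alpha) * Y[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return F


# Toy example (hypothetical data): two microbes, one disease.
W = [[0.0, 1.0], [1.0, 0.0]]
Y = [[1.0], [0.0]]
scores = propagate_labels(W, Y)
```

With these toy inputs the iteration settles at the closed-form fixed point (1 − α)(I − αW)⁻¹Y, so the unlabeled microbe receives a nonzero score purely from its similarity to the labeled one.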
Affiliation(s)
- Ruibin Chen
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Guobo Xie
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Zhiyi Lin
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Guosheng Gu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Yi Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Junrui Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Zhenguo Liu
- Department of Thoracic Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510080, China.
| |
|
13
|
Chen Z, Zhang L, Li J, Fu M. MLFLHMDA: predicting human microbe-disease association based on multi-view latent feature learning. Front Microbiol 2024; 15:1353278. [PMID: 38371933 PMCID: PMC10869561 DOI: 10.3389/fmicb.2024.1353278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 01/17/2024] [Indexed: 02/20/2024] Open
Abstract
Introduction A growing body of research indicates that microorganisms play a crucial role in human health. Imbalances in microbial communities are closely linked to human diseases, and identifying potential relationships between microbes and diseases can help elucidate disease pathogenesis. However, traditional methods based on biological or clinical experiments are costly, so computational models that predict potential microbe-disease associations are of great importance. Methods In this paper, we present a novel computational model called MLFLHMDA, which is based on a Multi-View Latent Feature Learning approach to predict Human potential Microbe-Disease Associations. Specifically, we compute Gaussian interaction profile kernel similarity for diseases and for microbes based on the known microbe-disease associations from the Human Microbe-Disease Association Database, and we preprocess the resulting microbe-disease association matrix with weighted K nearest known neighbors (WKNKN) to reduce its sparsity. To obtain unobserved associations in the microbe and disease views, we extract different latent features based on the geometrical structure of microbes and diseases, and project multi-modal latent features into a common subspace. Next, we introduce graph regularization to preserve the local manifold structure of Gaussian interaction profile kernel similarity and add $L_{p,q}$-norms to the projection matrix to ensure the interpretability and sparsity of the model. Results The AUC values achieved by MLFLHMDA under global leave-one-out cross-validation and 5-fold cross-validation are 0.9165 and 0.8942 ± 0.0041, respectively, which are better than those of other existing methods. In addition, case studies of different diseases have demonstrated the superiority of the predictive power of MLFLHMDA.
The source code of our model and the data are available on https://github.com/LiangzheZhang/MLFLHMDA_master.
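The Gaussian interaction profile (GIP) kernel similarity mentioned in this abstract can be sketched as follows. This uses the commonly cited definition K(i, j) = exp(−γ‖IP(i) − IP(j)‖²), with γ set to a bandwidth parameter divided by the mean squared norm of the profiles. The function name `gip_kernel`, the default `gamma_prime=1.0`, and the toy matrix are assumptions for illustration; the repository linked above remains the reference for what MLFLHMDA actually does.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile (GIP) kernel similarity between rows.

    profiles: binary association matrix, one row per entity
              (e.g. one row per microbe, one column per disease).
    gamma_prime: bandwidth parameter (1.0 is assumed here).
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    # Normalize the bandwidth by the mean squared profile norm.
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


# Toy matrix (hypothetical data): 3 microbes x 2 diseases.
microbe_profiles = [[1, 0], [1, 0], [0, 1]]
microbe_sim = gip_kernel(microbe_profiles)
```

Passing the transposed matrix gives the disease-side similarity in the same way. Microbes with identical association profiles get similarity 1, and similarity decays as profiles diverge.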
|
14
|
Zhu H, Hao H, Yu L. Identifying disease-related microbes based on multi-scale variational graph autoencoder embedding Wasserstein distance. BMC Biol 2023; 21:294. [PMID: 38115088 PMCID: PMC10731776 DOI: 10.1186/s12915-023-01796-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/07/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023] Open
Abstract
BACKGROUND Extensive clinical and biomedical research has demonstrated that microbes are crucial to human health. Identifying associations between microbes and diseases can not only reveal potential disease mechanisms, but also facilitate early diagnosis and promote precision medicine. Because of data perturbation and unsatisfactory latent representations, existing prediction methods leave significant room for improvement. RESULTS In this work, we proposed a novel framework, Multi-scale Variational Graph AutoEncoder embedding Wasserstein distance (MVGAEW), to predict disease-related microbes. It resists data perturbation and effectively generates latent representations for both microbes and diseases from the perspective of distribution. First, we calculated multiple similarities and integrated them through similarity network fusion. Subsequently, we obtained node latent representations with an improved variational graph autoencoder. Finally, an XGBoost classifier was employed to predict potential disease-related microbes. We also introduced multi-order node embedding reconstruction to enhance the representation capacity, and performed ablation studies to evaluate the contribution of each section of our model. Moreover, we conducted experiments on common drugs and case studies, including Alzheimer's disease, Crohn's disease, and colorectal neoplasms, to validate the effectiveness of our framework. CONCLUSIONS Our model outperformed current state-of-the-art methods, showing a substantial improvement on the HMDAD database.
Affiliation(s)
- Huan Zhu
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Hongxia Hao
- School of Computer Science and Technology, Xidian University, Xi'an, China.
| | - Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, China.
| |
|
15
|
Lu S, Liang Y, Li L, Miao R, Liao S, Zou Y, Yang C, Ouyang D. Predicting potential microbe-disease associations based on auto-encoder and graph convolution network. BMC Bioinformatics 2023; 24:476. [PMID: 38097930 PMCID: PMC10722760 DOI: 10.1186/s12859-023-05611-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 12/11/2023] [Indexed: 12/17/2023] Open
Abstract
A growing body of research has consistently demonstrated the intricate correlation between the human microbiome and human well-being. Microbes can impact the efficacy and toxicity of drugs through various pathways, as well as influence the occurrence and metastasis of tumors. In clinical practice, it is crucial to elucidate the association between microbes and diseases. Although traditional biological experiments accurately identify this association, they are time-consuming, expensive, and susceptible to experimental conditions. Consequently, conducting extensive biological experiments to screen potential microbe-disease associations becomes challenging. Computational methods can address these problems, but previous methods make poor use of node features and their prediction accuracy needs improvement. To address this issue, we propose the DAEGCNDF model to predict potential associations between microbes and diseases. Our model calculates four similarity features for each microbe and disease. These features are fused to obtain a comprehensive feature matrix representing microbes and diseases. Our model first uses the graph convolutional network module to extract low-rank features with graph information of microbes and diseases, and then uses a deep sparse Auto-Encoder to extract high-rank features of microbe-disease pairs, after which the low-rank and high-rank features are concatenated to improve the utilization of node features. Finally, Deep Forest is used to predict potential microbe-disease relationships. The experimental results show that combining low-rank and high-rank features helps to improve the model performance and that Deep Forest has better classification performance than the baseline model.
Affiliation(s)
- Shanghui Lu
- Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
- School of Mathematics and Physics, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
| | - Yong Liang
- Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China.
- Peng Cheng Laboratory, Shenzhen, 518055, Guangdong, China.
| | - Le Li
- Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
| | - Rui Miao
- Basic Teaching Department, Zhuhai Campus of Zunyi Medical University, Zhuhai, 519041, Guangdong, China
| | - Shuilin Liao
- Faculty of Innovation Enginee, Macau University of Science and Technology, Avenida Wai Long, Taipa, 999078, Macao, Macao Special Administrative Region of China, China
| | - Yongfu Zou
- School of Mathematics and Physics, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
| | - Chengjun Yang
- School of Artificial Intelligence and Manufacturing, Hechi University, No. 42, Longjiang, Hechi, 546300, Guangxi, China
| | - Dong Ouyang
- School of Biomedical Engineering, Guangdong Medical University, No. 1, Xincheng, Zhanjiang, 523808, Guangdong, China
| |
|
16
|
Santangelo B, Bada M, Hunter L, Lozupone C. Hypothesizing mechanistic links between microbes and disease using knowledge graphs. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.01.569645. [PMID: 38106100 PMCID: PMC10723325 DOI: 10.1101/2023.12.01.569645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Knowledge graphs have found broad biomedical applications, providing useful representations of complex knowledge. Although plentiful evidence exists linking the gut microbiome to disease, mechanistic understanding of those relationships remains generally elusive. Here we demonstrate the potential of knowledge graphs to hypothesize plausible mechanistic accounts of host-microbe interactions in disease. To do so, we constructed a knowledge graph of linked microbes, genes and metabolites called MGMLink. Using a semantically constrained shortest path search through the graph and a novel path prioritization methodology based on cosine similarity, we show that this knowledge supports inference of mechanistic hypotheses that explain observed relationships between microbes and disease phenotypes. We discuss specific applications of this methodology in inflammatory bowel disease and Parkinson's disease. This approach enables mechanistic hypotheses surrounding the complex interactions between gut microbes and disease to be generated in a scalable and comprehensive manner.
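The combination of shortest path search and cosine-similarity-based path prioritization described in this abstract could look roughly like the sketch below. It is a simplified stand-in under stated assumptions, not the MGMLink method: the graph, node names, and embeddings are hypothetical, the semantic constraints on the search are omitted, and scoring a path by the mean cosine similarity of its intermediate nodes to the target is one plausible choice rather than the authors' formula.

```python
import math
from collections import deque


def all_shortest_paths(graph, source, target):
    """Return every shortest path from source to target using BFS."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)  # another equally short route
    if target not in dist:
        return []

    def build(node):
        if node == source:
            return [[source]]
        return [path + [node] for p in parents[node] for path in build(p)]

    return build(target)


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_paths(paths, embeddings, target):
    """Sort paths by mean cosine similarity of intermediate nodes to the target."""
    def score(path):
        inner = path[1:-1]
        if not inner:
            return 0.0
        sims = [cosine(embeddings[n], embeddings[target]) for n in inner]
        return sum(sims) / len(sims)

    return sorted(paths, key=score, reverse=True)


# Hypothetical toy graph: microbe -> gene/metabolite -> disease.
graph = {
    "microbe": ["geneA", "metaboliteB"],
    "geneA": ["disease"],
    "metaboliteB": ["disease"],
}
embeddings = {
    "microbe": [1.0, 0.0],
    "geneA": [0.0, 1.0],
    "metaboliteB": [1.0, 1.0],
    "disease": [0.0, 1.0],
}
paths = all_shortest_paths(graph, "microbe", "disease")
ranked = rank_paths(paths, embeddings, "disease")
```

In this toy case both candidate paths have the same length, so the embedding-based score is what separates them: the path through `geneA` ranks first because its embedding points in the same direction as the disease node.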
|
17
|
Sánchez-Valle J, Valencia A. Molecular bases of comorbidities: present and future perspectives. Trends Genet 2023; 39:773-786. [PMID: 37482451 DOI: 10.1016/j.tig.2023.06.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2023] [Revised: 06/12/2023] [Accepted: 06/12/2023] [Indexed: 07/25/2023]
Abstract
Co-occurrence of diseases decreases patient quality of life, complicates treatment choices, and increases mortality. Analyses of electronic health records present a complex scenario of comorbidity relationships that vary by age, sex, and cohort under study. The study of similarities between diseases using 'omics data, such as genes altered in diseases, gene expression, proteome, and microbiome, is fundamental to uncovering the origin of, and potential treatment for, comorbidities. Recent studies have produced a first generation of genetic interpretations for as much as 46% of the comorbidities described in large cohorts. Integrating different sources of molecular information and using artificial intelligence (AI) methods are promising approaches for the study of comorbidities. They may help to improve the treatment of comorbidities, including the potential repositioning of drugs.
Affiliation(s)
- Jon Sánchez-Valle
- Life Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain.
| | - Alfonso Valencia
- Life Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain; ICREA, Barcelona, 08010, Spain.
| |
|
18
|
Shen Y, Gao Y, Shi J, Huang Z, Dai R, Fu Y, Zhou Y, Kong W, Cui Q. MicroRNA-disease Network Analysis Repurposes Methotrexate for the Treatment of Abdominal Aortic Aneurysm in Mice. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023; 21:1030-1042. [PMID: 36030000 PMCID: PMC10928436 DOI: 10.1016/j.gpb.2022.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 07/15/2022] [Accepted: 08/19/2022] [Indexed: 06/15/2023]
Abstract
Abdominal aortic aneurysm (AAA) is a permanent dilatation of the abdominal aorta and is highly lethal. The main purpose of the current study is to search for noninvasive medical therapies for AAA, for which there is currently no effective drug therapy. Network medicine represents a cutting-edge technology, as analysis and modeling of disease networks can provide critical clues regarding the etiology of specific diseases and therapeutics that may be effective. Here, we proposed a novel algorithm to quantify disease relations based on a large accumulated microRNA-disease association dataset and then built a disease network covering 15 disease classes and 304 diseases. Analysis revealed some patterns for these diseases. For instance, diseases tended to be clustered and coherent in the network. Surprisingly, we found that AAA showed the strongest similarity with rheumatoid arthritis and systemic lupus erythematosus, both of which are autoimmune diseases, suggesting that AAA could be etiologically a type of autoimmune disease. Based on this observation, we further hypothesized that drugs for autoimmune diseases could be repurposed for the prevention and therapy of AAA. Finally, animal experiments confirmed that methotrexate, a drug for autoimmune diseases, was able to alleviate the formation and development of AAA.
Affiliation(s)
- Yicong Shen
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China
| | - Yuanxu Gao
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China; State Key Laboratory of Lunar and Planetary Sciences, Macau University of Science and Technology, Macao Special Administrative Region 999078, China; Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China
| | - Jiangcheng Shi
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China; Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China
| | - Zhou Huang
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China; Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China
| | - Rongbo Dai
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China
| | - Yi Fu
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China
| | - Yuan Zhou
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China; Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China
| | - Wei Kong
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China.
| | - Qinghua Cui
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing 100191, China; Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China.
| |
|
19
|
Hu X, Liu D, Zhang J, Fan Y, Ouyang T, Luo Y, Zhang Y, Deng L. A comprehensive review and evaluation of graph neural networks for non-coding RNA and complex disease associations. Brief Bioinform 2023; 24:bbad410. [PMID: 37985451 DOI: 10.1093/bib/bbad410] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/01/2023] [Revised: 10/07/2023] [Accepted: 10/25/2023] [Indexed: 11/22/2023] Open
Abstract
Non-coding RNAs (ncRNAs) play a critical role in the occurrence and development of numerous human diseases. Consequently, studying the associations between ncRNAs and diseases has garnered significant attention from researchers in recent years. Various computational methods have been proposed to explore ncRNA-disease relationships, with Graph Neural Network (GNN) emerging as a state-of-the-art approach for ncRNA-disease association prediction. In this survey, we present a comprehensive review of GNN-based models for ncRNA-disease associations. Firstly, we provide a detailed introduction to ncRNAs and GNNs. Next, we delve into the motivations behind adopting GNNs for predicting ncRNA-disease associations, focusing on data structure, high-order connectivity in graphs and sparse supervision signals. Subsequently, we analyze the challenges associated with using GNNs in predicting ncRNA-disease associations, covering graph construction, feature propagation and aggregation, and model optimization. We then present a detailed summary and performance evaluation of existing GNN-based models in the context of ncRNA-disease associations. Lastly, we explore potential future research directions in this rapidly evolving field. This survey serves as a valuable resource for researchers interested in leveraging GNNs to uncover the complex relationships between ncRNAs and diseases.
Affiliation(s)
- Xiaowen Hu
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Dayun Liu
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Jiaxuan Zhang
- Department of Electrical and Computer Engineering, University of California, San Diego, 92093 CA, USA
| | - Yanhao Fan
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Tianxiang Ouyang
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Yue Luo
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Yuanpeng Zhang
- School of Software, Xinjiang University, 830046 Urumqi, China
| | - Lei Deng
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| |
|
20
|
Peng L, Huang L, Tian G, Wu Y, Li G, Cao J, Wang P, Li Z, Duan L. Predicting potential microbe-disease associations with graph attention autoencoder, positive-unlabeled learning, and deep neural network. Front Microbiol 2023; 14:1244527. [PMID: 37789848 PMCID: PMC10543759 DOI: 10.3389/fmicb.2023.1244527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 08/16/2023] [Indexed: 10/05/2023] Open
Abstract
Background Microbes have dense linkages with human diseases. Balanced microorganisms protect the human body against physiological disorders, while unbalanced ones may cause diseases. Thus, identification of potential associations between microbes and diseases can contribute to the diagnosis and therapy of various complex diseases. Biological experiments for microbe-disease association (MDA) prediction are expensive, time-consuming, and labor-intensive. Methods We developed a computational MDA prediction method called GPUDMDA by combining a graph attention autoencoder, positive-unlabeled learning, and a deep neural network. First, GPUDMDA computes disease similarity and microbe similarity matrices by integrating their functional similarity and Gaussian association profile kernel similarity, respectively. Next, it learns the feature representation of each microbe-disease pair using the graph attention autoencoder based on the obtained disease similarity and microbe similarity matrices. Third, it selects a few reliable negative MDAs based on positive-unlabeled learning. Finally, it takes the learned MDA features and the selected negative MDAs as inputs to a deep neural network that predicts potential MDAs. Results GPUDMDA was compared with four state-of-the-art MDA identification models (i.e., MNNMDA, GATMDA, LRLSHMDA, and NTSHMDA) on the HMDAD and Disbiome databases under five-fold cross validations on microbes, diseases, and microbe-disease pairs. Under the three five-fold cross validations, GPUDMDA achieved the best AUCs of 0.7121, 0.9454, and 0.9501 on the HMDAD database and 0.8372, 0.8908, and 0.8948 on the Disbiome database, respectively, outperforming the other four MDA prediction methods. Asthma is the most common chronic respiratory condition and affects ~339 million people worldwide. Inflammatory bowel disease is a class of chronic intestinal diseases that occur worldwide and affect the gut, gastrointestinal tract, and extraintestinal organs of patients. In particular, inflammatory bowel disease severely affects the growth and development of children. Using the proposed GPUDMDA method, we found that Enterobacter hormaechei had potential associations with both asthma and inflammatory bowel disease, which need further biological experimental validation. Conclusion The proposed GPUDMDA demonstrated powerful MDA prediction ability. We anticipate that GPUDMDA will help screen therapeutic clues for microbe-related diseases.
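The AUC values reported in this abstract summarize how well predicted scores rank known associations above unknown ones. A minimal sketch of that metric, assuming binary labels and real-valued scores, is given below. The function name `roc_auc` and the toy data are illustrative only, and the cross-validation splitting used in the paper is not reproduced.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical held-out fold: 1 = known association, 0 = unlabeled pair.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
```

This pairwise form is quadratic in the number of samples, which is acceptable for a small fold; a rank-based implementation would be the usual choice for larger evaluations.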
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, China
| | - Liangliang Huang
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Geng Tian
- Geneis (Beijing) Co. Ltd., Beijing, China
| | - Yan Wu
- Geneis (Beijing) Co. Ltd., Beijing, China
| | - Guang Li
- Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
- Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
- National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
- Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
| | - Jianying Cao
- Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
- Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
- National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
- Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
| | - Peng Wang
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
| | - Zejun Li
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
| | - Lian Duan
- Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
- Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
- National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
- Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
| |
|
21
|
Peng W, Liu M, Dai W, Chen T, Fu Y, Pan Y. Multi-View Feature Aggregation for Predicting Microbe-Disease Association. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:2748-2758. [PMID: 34871177 DOI: 10.1109/tcbb.2021.3132611] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/13/2023]
Abstract
Microbes play a crucial role in human health and disease. Figuring out the relationship between microbes and diseases has significant potential applications in disease treatment. There is an urgent need to devise robust and effective computational methods for identifying disease-related microbes. This work proposes a Multi-View Feature Aggregation (MVFA) scheme that integrates linear and nonlinear features to identify disease-related microbes. We introduce a non-negative matrix tri-factorization (NMTF) model to extract linear features for diseases and microbes. Then we learn another type of linear feature by utilizing a bi-random walk model. The nonlinear feature is obtained by inputting the two kinds of linear features into a capsule neural network. These three types of features describe the associations between diseases and microbes from different views. Finally, considering the complementarity of these features, we leverage a logistic regression model to combine the NMTF model predictions, the bi-random walk model predictions, and the capsule neural network predictions to obtain the final microbe-disease pair scores. We apply our method to predict human microbe-disease associations on two datasets. Experimental results show that our multi-view model outperforms the state-of-the-art models in recovering missing microbe-disease associations and predicting associations for new microbes. The ablation study shows that aggregating multi-view linear and nonlinear features can improve the prediction performance. Case studies on two diseases, i.e., type 1 diabetes and liver cirrhosis, further validate the effectiveness of our method.
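To make the bi-random walk step in the abstract above more concrete, here is a minimal pure-Python sketch of one common formulation. It is an illustration under stated assumptions, not the MVFA authors' implementation: the update rule, the restart weight `alpha`, the number of `steps`, and all function names are hypothetical choices.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_normalize(m):
    """Scale each row to sum to 1 (all-zero rows are left unchanged)."""
    out = []
    for row in m:
        s = sum(row)
        out.append([v / s for v in row] if s else list(row))
    return out


def bi_random_walk(assoc, microbe_sim, disease_sim, alpha=0.5, steps=3):
    """Propagate known associations over both similarity networks.

    assoc is a microbes x diseases 0/1 matrix; the two similarity
    matrices are square. alpha and steps are assumed hyperparameters.
    """
    ms = row_normalize(microbe_sim)
    ds = row_normalize(disease_sim)
    scores = [list(row) for row in assoc]
    for _ in range(steps):
        left = matmul(ms, scores)   # one step on the microbe network
        right = matmul(scores, ds)  # one step on the disease network
        # Average both walks, then restart toward the known associations.
        scores = [
            [alpha * (l + r) / 2 + (1 - alpha) * a for l, r, a in zip(lr, rr, ar)]
            for lr, rr, ar in zip(left, right, assoc)
        ]
    return scores


# Toy data: microbe 0 is known to be linked to disease 0, and
# microbe 1 is similar to microbe 0, so it inherits some score.
toy_scores = bi_random_walk(
    assoc=[[1, 0], [0, 0]],
    microbe_sim=[[1.0, 0.8], [0.8, 1.0]],
    disease_sim=[[1.0, 0.0], [0.0, 1.0]],
)
```

In this toy case the unlabeled pair (microbe 1, disease 0) receives a positive score purely through microbe similarity, which is the intuition behind using such scores as one feature view.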
|
22
|
Li J, Wei C, Zhou T, Mo C, Wang G, He F, Wang P, Qin L, Peng F. A display and analysis platform for gut microbiomes of minority people and phenotypic data in China. Sci Rep 2023; 13:14247. [PMID: 37648696 PMCID: PMC10469205 DOI: 10.1038/s41598-023-36754-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 06/09/2023] [Indexed: 09/01/2023] Open
Abstract
The minority people panmicrobial community database (MPPCD; website: http://mppmcdb.cloudna.cn/) is the first microbe-disease association database of Chinese ethnic minorities. To research the relationships between intestinal microbes and diseases/health in the ethnic minorities, we also collected the microbes of the Han people for comparison. Based on data such as age among the different ethnic groups in different regions of Sichuan Province, MPPCD not only provides the gut microbial composition but also presents the relative abundance at the phylum, class, order, family and genus levels in different groups. In addition, differential analysis of microbes between two groups is provided, which helps explore differences in intestinal microbial structure between the groups. Meanwhile, a series of related factors, including age, sex, body mass index, ethnicity, physical condition, and living altitude, are included in the MPPCD, with a special focus on living altitude. To date, this is the first intestinal microbe database to introduce altitude features. In conclusion, we hope that MPPCD will serve as fundamental research support for studying the relationship between human gut microbes and host health and disease, especially in ethnic minorities.
Affiliation(s)
- Jun Li
- Department of Gastroenterology, The First Affiliated Hospital of Chengdu Medical College, 278# Bao Guang Road, Xindu District, Chengdu, 610000, Sichuan, People's Republic of China.
| | - Chunxue Wei
- Department of Gastroenterology, The First Affiliated Hospital of Chengdu Medical College, 278# Bao Guang Road, Xindu District, Chengdu, 610000, Sichuan, People's Republic of China
| | - Ting Zhou
- Department of Gastroenterology, The Sixth People's Hospital of Chengdu, Chengdu, Sichuan, China
| | - Chunfen Mo
- Department of Immunology, School of Basic Medical Sciences, Chengdu Medical College, Chengdu, Sichuan, China
| | - Guanjun Wang
- Department of Gastroenterology, The First Affiliated Hospital of Chengdu Medical College, 278# Bao Guang Road, Xindu District, Chengdu, 610000, Sichuan, People's Republic of China
| | - Feng He
- Department of Gastroenterology, The First Affiliated Hospital of Chengdu Medical College, 278# Bao Guang Road, Xindu District, Chengdu, 610000, Sichuan, People's Republic of China
| | - Pengyu Wang
- College of Pharmacy, Chengdu Medical College, Chengdu, Sichuan, China
| | - Ling Qin
- Department of Gastroenterology, The First Affiliated Hospital of Chengdu Medical College, 278# Bao Guang Road, Xindu District, Chengdu, 610000, Sichuan, People's Republic of China
| | - Fujun Peng
- Institute of Basic Medicine, Weifang Medical University, 7166# Baotong West Road, Weifang, 261053, Shandong, People's Republic of China.
| |
|
23
|
Wang CY, Kuang X, Wang QQ, Zhang GQ, Cheng ZS, Deng ZX, Guo FB. GMMAD: a comprehensive database of human gut microbial metabolite associations with diseases. BMC Genomics 2023; 24:482. [PMID: 37620754 PMCID: PMC10464125 DOI: 10.1186/s12864-023-09599-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 08/17/2023] [Indexed: 08/26/2023] Open
Abstract
BACKGROUND Metabolites, the natural products of gut microbes, are crucial factors affecting disease. Comprehensive identification and annotation of relationships among diseases, metabolites, and microbes can provide efficient and targeted solutions for understanding the mechanisms of complex diseases and developing new markers and drugs. RESULTS We developed Gut Microbial Metabolite Association with Disease (GMMAD), a manually curated database of associations among human diseases, gut microbes, and metabolites of gut microbes. This initial release (i) contains 3,836 disease-microbe associations and 879,263 microbe-metabolite associations, which were extracted from the literature and available resources and then manually curated; (ii) defines an association strength score and a confidence score. With these two scores, GMMAD predicted 220,690 disease-metabolite associations, where the metabolites all belong to the gut microbes. We think that the positive effective associations (those with both scores higher than the suggested thresholds) will help identify disease markers and clarify pathogenic mechanisms from the perspective of gut microbes. The negative effective associations could be taken as biomarkers and have potential as drug candidates. Evidence from the literature supported our proposal, with consistent experimental results; (iii) provides a user-friendly web interface that allows users to browse, search, and download information on associations among diseases, metabolites, and microbes. The resource is freely available at http://guolab.whu.edu.cn/GMMAD. CONCLUSIONS As a unique online resource for gut microbial metabolite-disease associations, GMMAD helps researchers explore disease-metabolite-microbe mechanisms and screen drug and marker candidates for different diseases.
Affiliation(s)
- Cheng-Yu Wang
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Xia Kuang
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Qiao-Qiao Wang
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Gu-Qin Zhang
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Zhen-Shun Cheng
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Zi-Xin Deng
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Feng-Biao Guo
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China.
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China.
| |
|
24
|
Wang L, Wang Y, Xuan C, Zhang B, Wu H, Gao J. Predicting potential microbe-disease associations based on multi-source features and deep learning. Brief Bioinform 2023; 24:bbad255. [PMID: 37406190 DOI: 10.1093/bib/bbad255] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/03/2023] [Revised: 05/30/2023] [Accepted: 06/20/2023] [Indexed: 07/07/2023] Open
Abstract
Studies have confirmed that the occurrence of many complex diseases in the human body is closely related to the microbial community, and microbes can affect tumorigenesis and metastasis by regulating the tumor microenvironment. However, there are still large gaps in the clinical observation of the microbiota in disease. Although biological experiments are accurate in identifying disease-associated microbes, they are also time-consuming and expensive. Computational models for effective identification of disease-related microbes can shorten this process and reduce capital and time costs. Based on this, a model named DSAE_RF is presented in this paper to predict latent microbe-disease associations by combining multi-source features and deep learning. DSAE_RF calculates four similarities between microbes and diseases, which are then used as feature vectors for the disease-microbe pairs. Later, reliable negative samples are screened by k-means clustering, and a deep sparse autoencoder neural network is further used to extract effective features of the disease-microbe pairs. On this foundation, a random forest classifier is used to predict the associations between microbes and diseases. To assess the performance of the model, 10-fold cross-validation is implemented on the same dataset. As a result, the AUC and AUPR of the model are 0.9448 and 0.9431, respectively. Furthermore, we also conduct a variety of experiments, including comparison of negative sample selection methods, comparison with different models and classifiers, the Kolmogorov-Smirnov test and t-test, ablation experiments, robustness analysis, and case studies on COVID-19 and colorectal cancer. The results fully demonstrate the reliability and availability of our model.
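Since the abstract above reports performance as an AUC, the following minimal sketch shows how that metric can be computed from labels and predicted scores using the pairwise-ranking definition. It is a generic illustration, not code from the DSAE_RF paper; the function name and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking definition.

    AUC equals the probability that a randomly chosen positive sample
    is scored higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

This quadratic-time version is only meant to make the definition explicit; in k-fold cross-validation the same function would be applied to the held-out predictions of each fold and the results averaged.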
Affiliation(s)
- Liugen Wang
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
| | - Yan Wang
- School of Science, Jiangnan University, Wuxi, Jiangsu 214122, China
| | - Chenxu Xuan
- School of Science, Jiangnan University, Wuxi, Jiangsu 214122, China
| | - Bai Zhang
- School of Science, Jiangnan University, Wuxi, Jiangsu 214122, China
| | - Hanwen Wu
- School of Science, Jiangnan University, Wuxi, Jiangsu 214122, China
| | - Jie Gao
- School of Science, Jiangnan University, Wuxi, Jiangsu 214122, China
| |
|
25
|
Karkera N, Acharya S, Palaniappan SK. Leveraging pre-trained language models for mining microbiome-disease relationships. BMC Bioinformatics 2023; 24:290. [PMID: 37468830 PMCID: PMC10357883 DOI: 10.1186/s12859-023-05411-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 07/13/2023] [Indexed: 07/21/2023] Open
Abstract
BACKGROUND The growing recognition of the microbiome's impact on human health and well-being has prompted extensive research into discovering the links between microbiome dysbiosis and disease (healthy) states. However, this valuable information is scattered in unstructured form within the biomedical literature. The structured extraction and qualification of microbe-disease interactions are important. In parallel, recent advancements in deep-learning-based natural language processing algorithms have revolutionized language-related tasks such as ours. This study aims to leverage state-of-the-art deep-learning language models to extract microbe-disease relationships from the biomedical literature. RESULTS In this study, we first evaluate multiple pre-trained large language models within a zero-shot or few-shot learning context. In this setting, the models performed poorly out of the box, emphasizing the need for domain-specific fine-tuning of these language models. Subsequently, we fine-tune multiple language models (specifically, GPT-3, BioGPT, BioMedLM, BERT, BioMegatron, PubMedBERT, BioClinicalBERT, and BioLinkBERT) using labeled training data and evaluate their performance. Our experimental results demonstrate the state-of-the-art performance of these fine-tuned models (specifically GPT-3, BioMedLM, and BioLinkBERT), achieving an average F1 score, precision, and recall of over [Formula: see text] compared to the previous best of 0.74. CONCLUSION Overall, this study establishes that pre-trained language models excel as transfer learners when fine-tuned with domain- and problem-specific data, enabling them to achieve state-of-the-art results even with limited training data for extracting microbiome-disease interactions from scientific publications.
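The precision, recall, and F1 figures mentioned in the abstract above can be illustrated with a small sketch that scores extracted microbe-disease pairs against a gold standard. This is a generic, set-based illustration, not the paper's evaluation script; the function name and the example pairs are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Score a set of predicted relations against a gold-standard set."""
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical (microbe, disease) pairs, used only to exercise the function.
gold_pairs = {("m1", "d1"), ("m2", "d1"), ("m3", "d2"), ("m4", "d3")}
pred_pairs = {("m1", "d1"), ("m2", "d1"), ("m5", "d2")}
p, r, f1 = precision_recall_f1(gold_pairs, pred_pairs)
```

Here two of three predictions are correct and two of four gold pairs are recovered, so F1 is the harmonic mean of 2/3 and 1/2.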
Affiliation(s)
| | - Sathwik Acharya
- The Systems Biology Institute, Tokyo, Japan
- PES University, Bengaluru, India
| | - Sucheendra K Palaniappan
- The Systems Biology Institute, Tokyo, Japan.
- Iom Bioworks Pvt Ltd., Bengaluru, India.
- SBX Corporation, Tokyo, Japan.
| |
|
26
|
Wang F, Yang H, Wu Y, Peng L, Li X. SAELGMDA: Identifying human microbe-disease associations based on sparse autoencoder and LightGBM. Front Microbiol 2023; 14:1207209. [PMID: 37415823 PMCID: PMC10320730 DOI: 10.3389/fmicb.2023.1207209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 05/18/2023] [Indexed: 07/08/2023] Open
Abstract
Introduction Identification of complex associations between diseases and microbes is important to understand the pathogenesis of diseases and design therapeutic strategies. Biomedical experiment-based Microbe-Disease Association (MDA) detection methods are expensive, time-consuming, and laborious. Methods Here, we developed a computational method called SAELGMDA for potential MDA prediction. First, microbe similarity and disease similarity are computed by integrating their functional similarity and Gaussian interaction profile kernel similarity. Second, each microbe-disease pair is represented as a feature vector by combining the microbe and disease similarity matrices. Next, the obtained feature vectors are mapped to a low-dimensional space based on a Sparse AutoEncoder. Finally, unknown microbe-disease pairs are classified based on a Light Gradient Boosting Machine. Results The proposed SAELGMDA method was compared with four state-of-the-art MDA methods (MNNMDA, GATMDA, NTSHMDA, and LRLSHMDA) under five-fold cross validations on diseases, microbes, and microbe-disease pairs on the HMDAD and Disbiome databases. The results show that SAELGMDA achieved the best accuracy, Matthews correlation coefficient, AUC, and AUPR under the majority of conditions, outperforming the other four MDA prediction models. In particular, SAELGMDA obtained the best AUCs of 0.8358 and 0.9301 under cross validation on diseases, 0.9838 and 0.9293 under cross validation on microbes, and 0.9857 and 0.9358 under cross validation on microbe-disease pairs on the HMDAD and Disbiome databases. Colorectal cancer, inflammatory bowel disease, and lung cancer are diseases that severely threaten human health. We used the proposed SAELGMDA method to find possible microbes for the three diseases. The results demonstrate that there are potential associations between Clostridium coccoides and colorectal cancer and between Sphingomonadaceae and inflammatory bowel disease. In addition, Veillonella may associate with autism. The inferred MDAs need further validation. Conclusion We anticipate that the proposed SAELGMDA method contributes to the identification of new MDAs.
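The Gaussian interaction profile (GIP) kernel similarity named in the abstract above can be sketched as follows. This is a minimal illustration of the commonly used definition, K(i, j) = exp(-γ‖a_i − a_j‖²) with γ set to a base value divided by the mean squared norm of the profiles. It is not taken from the SAELGMDA code; the base bandwidth `gamma_prime = 1.0`, the function name, and the toy matrix are assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of a 0/1 matrix.

    Each row is one entity's interaction profile, e.g. a microbe's
    known associations across all diseases.
    """
    sq_norms = [sum(v * v for v in p) for p in profiles]
    mean_sq_norm = sum(sq_norms) / len(sq_norms)
    if mean_sq_norm == 0:
        raise ValueError("at least one known association is required")
    gamma = gamma_prime / mean_sq_norm  # bandwidth normalised by profile size
    n = len(profiles)
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy association matrix: 3 microbes (rows) x 3 diseases (columns).
toy_assoc = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
microbe_sim = gip_kernel(toy_assoc)
```

Microbes 0 and 1 share identical profiles and therefore get similarity 1, while microbe 2 shares no associations with them and gets a much lower value. Disease similarity would be obtained the same way from the transposed matrix.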
Affiliation(s)
- Feixiang Wang
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Huandong Yang
- Department of Gastrointestinal Surgery, Yidu Central Hospital of Weifang, Weifang, China
| | - Yan Wu
- Geneis (Beijing) Co., Ltd., Beijing, China
| | - Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Xiaoling Li
- The Second Department of Oncology, Beidahuang Industry Group General Hospital, Harbin, China
- The Second Department of Oncology, Heilongjiang Second Cancer Hospital, Harbin, China
| |
|
27
|
Shen K, Din AU, Sinha B, Zhou Y, Qian F, Shen B. Translational informatics for human microbiota: data resources, models and applications. Brief Bioinform 2023; 24:7152256. [PMID: 37141135 DOI: 10.1093/bib/bbad168] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/26/2022] [Revised: 04/07/2023] [Accepted: 04/11/2023] [Indexed: 05/05/2023] Open
Abstract
With the rapid development of human intestinal microbiology and diverse microbiome-related studies and investigations, a large amount of data has been generated and accumulated. Meanwhile, different computational and bioinformatics models have been developed for pattern recognition and knowledge discovery using these data. Given the heterogeneity of these resources and models, we aimed to provide a landscape of the data resources, a comparison of the computational models and a summary of the translational informatics applied to microbiota data. We first review the existing databases, knowledge bases, knowledge graphs and standardizations of microbiome data. Then, the high-throughput sequencing techniques for the microbiome and the informatics tools for their analyses are compared. Finally, translational informatics for the microbiome, including biomarker discovery, personalized treatment and smart healthcare for complex diseases, is discussed.
Affiliation(s)
- Ke Shen
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Ahmad Ud Din
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Baivab Sinha
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Yi Zhou
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Fuliang Qian
- Center for Systems Biology, Suzhou Medical College of Soochow University, Suzhou 215123, China
- Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Suzhou 215123, China
| | - Bairong Shen
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| |
|
28
|
Shokri Garjan H, Omidi Y, Poursheikhali Asghari M, Ferdousi R. In-silico computational approaches to study microbiota impacts on diseases and pharmacotherapy. Gut Pathog 2023; 15:10. [PMID: 36882861 PMCID: PMC9990230 DOI: 10.1186/s13099-023-00535-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/09/2023] [Accepted: 02/21/2023] [Indexed: 03/09/2023] Open
Abstract
Microorganisms have been linked to a variety of critical human diseases, thanks to advances in sequencing technology and microbiology. The growing recognition of human microbe-disease relationships provides crucial insights into the underlying disease process from the perspective of pathogens, which is extremely useful for pathogenesis research, early diagnosis, and precision medicine and therapy. Microbe-based analysis in terms of diseases and related drug discovery can predict new connections/mechanisms and provide new concepts. These phenomena have been studied via various in-silico computational approaches. This review aims to elaborate on the computational works conducted on the microbe-disease and microbe-drug topics, discuss the computational model approaches used for predicting associations and provide comprehensive information on the related databases. Finally, we discuss potential prospects and obstacles in this field of study, while also outlining some recommendations for further enhancing predictive capabilities.
Affiliation(s)
- Hassan Shokri Garjan
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Yadollah Omidi
- Department of Pharmaceutical Sciences, Nova Southeastern University, College of Pharmacy, Fort Lauderdale, FL, USA
| | | | - Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran.
| |
|
29
|
Jiang C, Tang M, Jin S, Huang W, Liu X. KGNMDA: A Knowledge Graph Neural Network Method for Predicting Microbe-Disease Associations. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1147-1155. [PMID: 35724280 DOI: 10.1109/tcbb.2022.3184362] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 05/04/2023]
Abstract
Accumulated studies have discovered that various microbes in the human body are closely related to complex human diseases and can provide new insight into drug development. Multiple computational methods have been constructed to predict microbes potentially associated with diseases. However, most previous methods were based on single characteristics of microbes or diseases and lacked important biological information related to microorganisms or diseases. Therefore, we constructed a knowledge graph centered on microorganisms and diseases from several existing databases to provide rich prior knowledge about microbes and diseases. Then, we adopted a graph neural network method to learn representations of microbes and diseases from the constructed knowledge graph. After that, we introduced the Gaussian kernel similarity features of microbes and diseases to generate the final representations of microbes and diseases. Finally, we proposed a score function on the final representations of microbes and diseases to predict scores of microbe-disease associations. Comprehensive experiments on the Human Microbe-Disease Association Database (HMDAD) dataset demonstrated that our approach outperformed baseline methods. Furthermore, we implemented case studies on two important diseases (asthma and inflammatory bowel disease); the results demonstrated that our proposed model was effective in revealing the relationship between diseases and microbes. The source code of our model and the data are available at https://github.com/ChangzhiJiang/KGNMDA_master.
|
30
|
Shi K, Li L, Wang Z, Chen H, Chen Z, Fang S. Identifying microbe-disease association based on graph convolutional attention network: Case study of liver cirrhosis and epilepsy. Front Neurosci 2023; 16:1124315. [PMID: 36741060 PMCID: PMC9892757 DOI: 10.3389/fnins.2022.1124315] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/15/2022] [Accepted: 12/31/2022] [Indexed: 01/20/2023] Open
Abstract
The interactions between the microbiota and the human host can affect the physiological functions of organs (such as the brain, liver, and gut). Accumulating investigations indicate that the imbalance of the microbial community is closely related to the occurrence and development of diseases. Thus, the identification of potential links between microbes and diseases can provide insight into the pathogenesis of diseases. In this study, we propose a deep learning framework (MDAGCAN) based on a graph convolutional attention network to identify potential microbe-disease associations. In MDAGCAN, we first construct a heterogeneous network consisting of the known microbe-disease associations and multi-similarity fusion networks of microbes and diseases. Then, the node embeddings considering the neighbor information of the heterogeneous network are learned by applying graph convolutional layers and graph attention layers. Finally, a bilinear decoder using node embedding representations reconstructs the unknown microbe-disease associations. Experiments show that our method achieves reliable performance with average AUCs of 0.9778 and 0.9454 ± 0.0038 in the frameworks of leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV), respectively. Furthermore, we apply MDAGCAN to predict latent microbes for two high-risk human diseases, i.e., liver cirrhosis and epilepsy, and the results illustrate that 16 and 17 out of the top 20 predicted microbes are verified by the published literature, respectively. In conclusion, our method displays effective and reliable prediction performance and can be expected to predict unknown microbe-disease associations, facilitating disease diagnosis and prevention.
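The bilinear decoder mentioned in the abstract above can be illustrated with a short sketch. It assumes the usual form, score = sigmoid(z_mᵀ W z_d), where z_m and z_d are learned node embeddings and W is a trainable matrix. This is an assumption about the general technique rather than the MDAGCAN implementation; the names and toy values are hypothetical.

```python
import math


def bilinear_score(z_microbe, weight, z_disease):
    """Association probability from two embeddings via a bilinear form."""
    # W @ z_disease
    wz = [sum(w * d for w, d in zip(row, z_disease)) for row in weight]
    # z_microbe . (W @ z_disease)
    logit = sum(m * v for m, v in zip(z_microbe, wz))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy 2-dimensional embeddings with an identity weight matrix.
identity = [[1.0, 0.0], [0.0, 1.0]]
aligned = bilinear_score([1.0, 0.0], identity, [1.0, 0.0])
orthogonal = bilinear_score([1.0, 0.0], identity, [0.0, 1.0])
```

With an identity weight matrix the decoder reduces to a sigmoid of the dot product, so aligned embeddings score above 0.5 and orthogonal ones score exactly 0.5. A learned W lets the model weight interactions between different embedding dimensions.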
Affiliation(s)
- Kai Shi
- College of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin, China
| | - Lin Li
- College of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Zhengfeng Wang
- College of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Huazhou Chen
- College of Science, Guilin University of Technology, Guilin, China
| | - Zilin Chen
- Department of Developmental and Behavioural Pediatric Department & Department of Child Primary Care, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Shuanfeng Fang
- Department of Children Health Care, Children’s Hospital Affiliated to Zhengzhou University, Zhengzhou, China
| |
|
31
|
Liu H, Bing P, Zhang M, Tian G, Ma J, Li H, Bao M, He K, He J, He B, Yang J. MNNMDA: Predicting human microbe-disease association via a method to minimize matrix nuclear norm. Comput Struct Biotechnol J 2023; 21:1414-1423. [PMID: 36824227 PMCID: PMC9941872 DOI: 10.1016/j.csbj.2022.12.053] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 05/26/2022] [Revised: 12/29/2022] [Accepted: 12/30/2022] [Indexed: 01/03/2023] Open
Abstract
Identifying the potential associations between microbes and diseases is the first step for revealing the pathological mechanisms of microbe-associated diseases. However, traditional culture-based microbial experiments are expensive and time-consuming. Thus, it is critical to prioritize disease-associated microbes by computational methods for further experimental validation. In this study, we proposed a novel method called MNNMDA to predict microbe-disease associations (MDAs) by applying a Matrix Nuclear Norm method to known microbe and disease data. Specifically, we first calculated Gaussian interaction profile kernel similarity and functional similarity for diseases and microbes. Then we constructed a heterogeneous information network by combining the integrated disease similarity network, the integrated microbe similarity network and the known microbe-disease bipartite network. Finally, we formulated the microbe-disease association prediction problem as a low-rank matrix completion problem, which was solved by minimizing the nuclear norm of a matrix with a few regularization terms. We tested the performance of MNNMDA on three datasets, HMDAD, Disbiome, and Combined Data, of small, medium and large size, respectively. We also compared MNNMDA with 5 state-of-the-art methods: KATZHMDA, LRLSHMDA, NTSHMDA, GATMDA, and KGNMDA. MNNMDA achieved areas under the ROC curve (AUROC) of 0.9536 and 0.9364 on HMDAD and Disbiome, respectively, better than the AUCs of the compared methods under 5-fold cross-validation for all microbe-disease associations. It also obtained a relatively good performance, with an AUROC of 0.8858, on the combined data. In addition, MNNMDA was also better than the other methods in area under the precision-recall curve (AUPR) under 5-fold cross-validation for all associations, and in both AUROC and AUPR under 5-fold cross-validation for diseases and 5-fold cross-validation for microbes. Finally, the case studies on colon cancer and inflammatory bowel disease (IBD) also validated the effectiveness of MNNMDA. In conclusion, MNNMDA is an effective method for predicting microbe-disease associations. Availability The code and data for this paper are freely available on GitHub at https://github.com/Haiyan-Liu666/MNNMDA.
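For readers unfamiliar with the low-rank matrix completion formulation mentioned in the abstract above, the generic textbook form is sketched below. This is not the exact MNNMDA objective, whose additional regularization terms are not given in the abstract. The symbols are assumptions for illustration: $T$ is the matrix holding known entries, $\Omega$ the set of observed positions, $P_\Omega$ the projection that keeps only those positions, and $\lambda$ a trade-off weight.

```latex
% Nuclear norm: sum of the singular values of X
\|X\|_* = \sum_{k} \sigma_k(X)

% Exact-fit matrix completion
\min_{X} \; \|X\|_* \quad \text{s.t.} \quad P_\Omega(X) = P_\Omega(T)

% Noise-tolerant relaxation with trade-off weight \lambda > 0
\min_{X} \; \|X\|_* + \frac{\lambda}{2} \, \bigl\| P_\Omega(X - T) \bigr\|_F^2
```

The nuclear norm is the standard convex surrogate for matrix rank, so minimizing it favors a completed association matrix that is explained by a small number of latent factors.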
Affiliation(s)
- Haiyan Liu
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,College of Information Engineering, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China
| | - Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China
| | - Meijun Zhang
- Geneis Beijing Co., Ltd., Beijing 100102, PR China
| | - Geng Tian
- Geneis Beijing Co., Ltd., Beijing 100102, PR China
| | - Jun Ma
- College of Information Engineering, Changsha Medical University, Changsha 410219, PR China
| | - Haigang Li
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China,Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China,School of pharmacy, Changsha Medical University, Changsha 410219, PR China
| | - Meihua Bao
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China; Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China; School of Pharmacy, Changsha Medical University, Changsha 410219, PR China
| | - Kunhui He
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China; Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China; School of Pharmacy, Changsha Medical University, Changsha 410219, PR China
| | - Jianjun He
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China; Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China; School of Pharmacy, Changsha Medical University, Changsha 410219, PR China; Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China.
| | - Binsheng He
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China; Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China; School of Pharmacy, Changsha Medical University, Changsha 410219, PR China; Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China.
| | - Jialiang Yang
- Academician Workstation, Changsha Medical University, Changsha 410219, PR China; Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha 410219, PR China; Geneis Beijing Co., Ltd., Beijing 100102, PR China; School of Pharmacy, Changsha Medical University, Changsha 410219, PR China; Corresponding authors at: Academician Workstation, Changsha Medical University, Changsha 410219, PR China.
| |
|
32
|
Hu W, Yang X, Wang L, Zhu X. MADGAN: A microbe-disease association prediction model based on generative adversarial networks. Front Microbiol 2023; 14:1159076. [PMID: 37032881 PMCID: PMC10076708 DOI: 10.3389/fmicb.2023.1159076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Accepted: 03/02/2023] [Indexed: 04/11/2023] Open
Abstract
Research has demonstrated that microorganisms are indispensable for nutrient transport, growth and development in the human body, and that disorder and imbalance of the microbiota may lead to disease. It is therefore crucial to study the relationships between microbes and diseases. In this manuscript, we proposed a novel prediction model named MADGAN to infer potential microbe-disease associations by combining biological information on microbes and diseases with generative adversarial networks. To our knowledge, it is the first attempt to use a generative adversarial network for this important task. In MADGAN, we first constructed different features for microbes and diseases based on multiple similarity metrics. We then adopted a graph convolutional network (GCN) to derive different features for microbes and diseases automatically. Finally, we trained MADGAN to identify latent microbe-disease associations through a game between the generation network and the decision network. In particular, to prevent over-smoothing during model training, we introduced a cross-level weight distribution structure that increases the depth of the network, based on the idea of the residual network. Moreover, to validate the performance of MADGAN, we conducted comprehensive experiments and case studies on the HMDAD and Disbiome databases, and the experimental results demonstrated that MADGAN not only achieved satisfactory prediction performance but also outperformed existing state-of-the-art prediction models.
Affiliation(s)
- Weixin Hu
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
| | - Xiaoyu Yang
- Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, China
| | - Lei Wang
- Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, China
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- *Correspondence: Lei Wang
| | - Xianyou Zhu
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
- *Correspondence: Xianyou Zhu
| |
|
33
|
Yang X, Xu W, Leng D, Wen Y, Wu L, Li R, Huang J, Bo X, He S. Exploring novel disease-disease associations based on multi-view fusion network. Comput Struct Biotechnol J 2023; 21:1807-1819. [PMID: 36923471 PMCID: PMC10009443 DOI: 10.1016/j.csbj.2023.02.038] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/02/2022] [Revised: 02/02/2023] [Accepted: 02/22/2023] [Indexed: 03/06/2023] Open
Abstract
Established taxonomy systems based on disease symptoms and tissue characteristics have provided an important basis for physicians to correctly identify diseases and treat them successfully. However, these classifications tend to be based on phenotypic observations and lack a molecular biological foundation. There is therefore an urgent need to integrate multi-dimensional molecular biological information or multi-omics data to redefine disease classification and provide a powerful perspective for understanding the molecular structure of diseases. We offer a flexible disease classification that integrates the biological process, gene expression and symptom phenotype of diseases, and propose a disease-disease association network based on multi-view fusion. We applied the fusion approach to 223 diseases and divided them into 24 disease clusters. The contributions of the internal and external edges of the disease clusters were analyzed. The results of the fusion model were compared with Medical Subject Headings, a traditional and commonly used disease taxonomy. A comparison of model performance shows that our approach performs better than other integration methods. The obtained clusters also provided more interesting and novel disease-disease associations. This multi-view human disease association network describes relationships between diseases at multiple molecular levels, thus breaking through the limitations of disease classification systems based on tissues and organs. This approach, which encourages clinicians and researchers to rethink their understanding of diseases and explore diagnosis and therapy strategies, extends the existing disease taxonomy. Availability of data and materials: The preprocessed dataset and source code supporting the conclusions of this article are available in the GitHub repository https://github.com/yangxiaoxi89/mvHDN.
Affiliation(s)
- Xiaoxi Yang
- Clinical Medicine Institute, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China; Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Wenjian Xu
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China; Rare Disease Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing 100045, China; MOE Key Laboratory of Major Diseases in Children, Beijing 100045, China; Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, Beijing 100045, China
| | - Dongjin Leng
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Yuqi Wen
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Lianlian Wu
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Ruijiang Li
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Jian Huang
- Clinical Medicine Institute, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China
| | - Xiaochen Bo
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Song He
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| |
|
34
|
Liu JX, Yin MM, Gao YL, Shang J, Zheng CH. MSF-LRR: Multi-Similarity Information Fusion Through Low-Rank Representation to Predict Disease-Associated Microbes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:534-543. [PMID: 35085090 DOI: 10.1109/tcbb.2022.3146176] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/04/2023]
Abstract
An increase in microbial activity has been shown to be intimately connected with the pathogenesis of diseases. Given the expense of traditional verification methods, researchers are working to develop efficient methods for detecting potential disease-related microbes. In this article, a new prediction method, MSF-LRR, is established, which uses Low-Rank Representation (LRR) to perform multi-similarity information fusion to predict disease-related microbes. Because most existing methods use only one class of similarity, three classes of microbe and disease similarity are added. LRR is then used to obtain low-rank structural similarity information. Additionally, the method adaptively extracts the local low-rank structure of the data from a global perspective, making the information used for prediction more effective. Finally, a neighbor-based prediction method that draws on the concept of collaborative filtering is applied to predict unknown microbe-disease pairs. The AUC value of MSF-LRR is superior to that of other existing algorithms under 5-fold cross-validation. Furthermore, in case studies, excluding originally known associations, 16 and 19 of the top 20 microbes associated with Bacterial Vaginosis and Irritable Bowel Syndrome, respectively, have been confirmed by the recent literature. In summary, MSF-LRR is a good predictor of potential microbe-disease associations and can contribute to drug discovery and biological research.
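The neighbor-based, collaborative-filtering-style prediction step named in this abstract can be pictured with a generic sketch: each microbe-disease pair is scored by a similarity-weighted average over the microbe's most similar neighbors. This is a textbook neighborhood scheme, not the MSF-LRR implementation; the function name `neighbor_scores`, the choice of `k` and the toy matrices are assumptions made for illustration.

```python
def neighbor_scores(assoc, sim, k=2):
    """Score every (microbe, disease) pair from the k most similar microbes.

    assoc: microbes x diseases 0/1 matrix of known associations.
    sim:   microbes x microbes similarity matrix (for example, a fused similarity).
    """
    n_microbes, n_diseases = len(assoc), len(assoc[0])
    scores = [[0.0] * n_diseases for _ in range(n_microbes)]
    for i in range(n_microbes):
        # Rank the other microbes by similarity to microbe i and keep the top k.
        neighbors = sorted(
            (j for j in range(n_microbes) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )[:k]
        weight = sum(sim[i][j] for j in neighbors)
        if weight == 0:
            continue  # no informative neighbors; leave the scores at 0
        for d in range(n_diseases):
            scores[i][d] = sum(sim[i][j] * assoc[j][d] for j in neighbors) / weight
    return scores

# Toy inputs (made up): 3 microbes, 2 diseases.
assoc = [[1, 0], [1, 1], [0, 1]]
sim = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.5], [0.2, 0.5, 1.0]]
scores = neighbor_scores(assoc, sim, k=2)
```

Pairs that are unknown in `assoc` but receive a high score are the candidates such a scheme would rank first.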
|
35
|
Gong H, You X, Jin M, Meng Y, Zhang H, Yang S, Xu J. Graph neural network and multi-data heterogeneous networks for microbe-disease prediction. Front Microbiol 2022; 13:1077111. [PMID: 36620040 PMCID: PMC9814480 DOI: 10.3389/fmicb.2022.1077111] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/22/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022] Open
Abstract
Research on microbe association networks is of great significance for understanding the pathogenic mechanisms of microbes and promoting the application of microbes in precision medicine. In this paper, we studied the prediction of microbe-disease associations based on a multi-data biological network and a graph neural network algorithm. The HMDAD database provided a dataset that included 39 diseases, 292 microbes, and 450 known microbe-disease associations. We proposed a Microbe-Disease Heterogeneous Network built from the microbe similarity network, the disease similarity network, and the known microbe-disease associations. Furthermore, we integrated the network into a graph convolutional neural network algorithm and developed the GCNN4Micro-Dis model to predict microbe-disease associations. Finally, the performance of the GCNN4Micro-Dis model was evaluated via 5-fold cross-validation, in which we randomly divided all known microbe-disease association data into five groups. The average AUC value and standard deviation were 0.8954 ± 0.0030. Our model had good predictive power and can help identify new microbe-disease associations. In addition, we compared GCNN4Micro-Dis with three advanced methods for predicting microbe-disease associations: KATZHMDA, BiRWHMDA, and LRLSHMDA. The results showed that our method had better prediction performance than the other three methods. Furthermore, we selected breast cancer as a case study and found the top 12 microbes related to breast cancer from the intestinal flora of patients, which further verified the model's accuracy.
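The 5-fold cross-validation protocol described in this abstract, in which the known associations are randomly divided into five groups, might be set up roughly as follows. This is a minimal sketch: the helper name `k_fold_split`, the fixed seed and the placeholder index pairs are assumptions, and the pairs are not real HMDAD records.

```python
import random

def k_fold_split(pairs, k=5, seed=0):
    """Shuffle known (microbe, disease) pairs and deal them into k folds."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    # Fold i takes every k-th pair starting at offset i, so fold sizes differ by at most 1.
    return [shuffled[i::k] for i in range(k)]

# Placeholder indices standing in for 450 known associations among 292 microbes and 39 diseases.
known = [(i % 292, i % 39) for i in range(450)]
folds = k_fold_split(known, k=5)

for held_out in range(5):
    test_pairs = folds[held_out]
    train_pairs = [p for f, fold in enumerate(folds) if f != held_out for p in fold]
    # A model would be fitted on train_pairs and scored on test_pairs at this point.
```

Each round holds out one fold of 90 associations and trains on the remaining 360; averaging the per-fold AUC gives a mean and standard deviation of the kind the abstract reports.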
Affiliation(s)
- Houwu Gong
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Academy of Military Sciences, Beijing, China
| | - Xiong You
- Center of Rehabilitation Diagnosis and Treatment, Hunan Provincial Rehabilitation Hospital, Changsha, China
| | - Min Jin
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; *Correspondence: Min Jin
| | - Yajie Meng
- School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, China
| | - Hanxue Zhang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | - Shuaishuai Yang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | - Junlin Xu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; *Correspondence: Junlin Xu
| |
|
36
|
Minadakis G, Tomazou M, Dietis N, Spyrou GM. Vir2Drug: a drug repurposing framework based on protein similarities between pathogens. Brief Bioinform 2022; 24:6895455. [PMID: 36513376 PMCID: PMC9851336 DOI: 10.1093/bib/bbac536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 10/25/2022] [Accepted: 11/08/2022] [Indexed: 12/15/2022] Open
Abstract
We start from the assumption that similarities between pathogens, at both the pathogen protein and host protein level, may provide an appropriate framework to identify and rank candidate drugs for use against a specific pathogen. Vir2Drug is a drug repurposing tool that uses network-based approaches to identify and rank candidate drugs for a specific pathogen, combining information obtained from: (a) ranked pathogen-to-pathogen networks based on protein similarities between pathogens, (b) taxonomic distance between pathogens and (c) drugs targeting specific pathogen and host proteins. The underlying pathogen networks are used to screen drugs by means of specific methodologies that account for either the host or the pathogen protein targets. Vir2Drug is a useful and informative tool for drug repurposing against known or unknown pathogens, especially in periods when the urgent need for repurposed drugs plays a significant role in handling viral outbreaks until a vaccine becomes available. The web tool is available at: https://bioinformatics.cing.ac.cy/vir2drug, https://vir2drug.cing-big.hpcf.cyi.ac.cy.
Affiliation(s)
- George Minadakis
- Corresponding author: George Minadakis, Bioinformatics Department, The Cyprus Institute of Neurology & Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, PO Box 23462, 1683 Nicosia, Cyprus. Tel.: +357-22-392852; Fax: +357-22-358238; E-mail:
| | - Marios Tomazou
- Bioinformatics Department, The Cyprus Institute of Neurology & Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus
- PO Box 23462, 1683 Nicosia, Cyprus; The Cyprus School of Molecular Medicine, 6 Iroon Avenue, 2371 Ayios Dometios, PO Box 23462, 1683 Nicosia, Cyprus
| | - Nikolas Dietis
- Medical School, University of Cyprus, Nicosia 1678, Cyprus
| | - George M Spyrou
- Bioinformatics Department, The Cyprus Institute of Neurology & Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus
- PO Box 23462, 1683 Nicosia, Cyprus; The Cyprus School of Molecular Medicine, 6 Iroon Avenue, 2371 Ayios Dometios, PO Box 23462, 1683 Nicosia, Cyprus
| |
|
37
|
Liu D, Liu J, Luo Y, He Q, Deng L. MGATMDA: Predicting Microbe-Disease Associations via Multi-Component Graph Attention Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3578-3585. [PMID: 34587092 DOI: 10.1109/tcbb.2021.3116318] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/13/2023]
Abstract
Microbes reside in various organs of the human body and play significant roles in a wide range of diseases. Identifying microbe-disease associations is conducive to the identification of potential drug targets. Given the high cost and risk of biological experiments, developing computational approaches to explore the relationship between microbes and diseases is an attractive alternative. However, most existing methods are based on unreliable or noisy similarity, which can reduce prediction accuracy. In addition, making predictions on large-scale datasets remains a great challenge for most previous methods. In this work, we develop a multi-component Graph Attention Network (GAT) based framework, termed MGATMDA, for predicting microbe-disease associations. MGATMDA is built on a bipartite graph of microbes and diseases. It contains three essential parts: a decomposer, a combiner, and a predictor. The decomposer first decomposes the edges in the bipartite graph to identify the latent components using a node-level attention mechanism. The combiner then recombines these latent components automatically, using a component-level attention mechanism, to obtain a unified embedding for prediction. Finally, a fully connected network is used to predict unknown microbe-disease associations. Experimental results showed that our proposed method outperformed eight state-of-the-art methods. Case studies for two common diseases further demonstrated the effectiveness of MGATMDA in predicting potential microbe-disease associations. The code is available on GitHub at https://github.com/dayunliu/MGATMDA.
|
38
|
Hua M, Yu S, Liu T, Yang X, Wang H. MVGCNMDA: Multi-view Graph Augmentation Convolutional Network for Uncovering Disease-Related Microbes. Interdiscip Sci 2022; 14:669-682. [PMID: 35428964 DOI: 10.1007/s12539-022-00514-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/23/2021] [Revised: 03/06/2022] [Accepted: 03/13/2022] [Indexed: 06/14/2023]
Abstract
MOTIVATION: Exploring the interrelationships between microbes and disease can help microbiologists make decisions and plan treatments. Predicting new microbe-disease associations currently relies on biological experiments and domain knowledge, which is time-consuming and inefficient. Automated algorithms are used to uncover the intrinsic links between microbes and disease. However, owing to data noise and an inadequate understanding of the relevant biology, efficient prediction of microbe-disease associations remains a crucial problem. This study develops a multi-view graph augmentation convolutional network (MVGCNMDA) to predict potential disease-associated microbes. METHODS: First, we use two data augmentation methods, edge perturbation and node dropping, to remove data noise in the preprocessing stage. Second, we calculate Gaussian interaction profile kernel similarity and cosine similarity so that the Graph Convolutional Network (GCN) can make full use of multi-view features. Then, the multi-view features are fed into a multi-attention block to learn the weights of the different features adaptively. Finally, the embedding results are obtained using a Convolutional Neural Network (CNN) combiner, and matrix completion is used to predict potential relationships between microbes and diseases. RESULTS: We test our model on the Human Microbe-Disease Association Database (HMDAD), Disbiome, and the Combined Dataset (Peryton and MicroPhenoDB). The area under the PR curve (AUPR), area under the ROC curve (AUC), F1 score and recall are calculated to evaluate the performance of MVGCNMDA. The AUPR is 0.9440, the AUC is 0.9428, the F1 score is 0.9383 and the recall is 0.8858. The experiments show that our model predicts potential microbe-disease associations accurately compared with state-of-the-art methods under global leave-one-out cross-validation (LOOCV) and fivefold cross-validation (fivefold CV). To further verify the effectiveness of the proposed graph data augmentation, we designed five different settings in an ablation study. Furthermore, we present two case studies that validate the potential microbe-disease associations predicted by MVGCNMDA.
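The cosine similarity named in the METHODS part of this abstract can be sketched as follows for two interaction profiles. The function name and the toy vectors are illustrative assumptions, and returning 0.0 for an all-zero profile is a convention chosen here, not something stated in the abstract.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Made-up association profiles of two microbes across four diseases.
profile_a = [1, 0, 1, 0]
profile_b = [1, 1, 0, 0]
similarity = cosine_similarity(profile_a, profile_b)
```

The two profiles share one of their two associated diseases each, so the similarity is one half.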
Affiliation(s)
- Meifang Hua
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Shengpeng Yu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Tianyu Liu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Xue Yang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Hong Wang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China.
| |
|
39
|
Yang M, Huang ZA, Gu W, Han K, Pan W, Yang X, Zhu Z. Prediction of biomarker-disease associations based on graph attention network and text representation. Brief Bioinform 2022; 23:6651308. [PMID: 35901464 DOI: 10.1093/bib/bbac298] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/03/2022] [Revised: 06/28/2022] [Accepted: 06/30/2022] [Indexed: 02/06/2023] Open
Abstract
MOTIVATION: The associations between biomarkers and human diseases play a key role in understanding complex pathology and developing targeted therapies. Wet lab experiments for biomarker discovery are costly, laborious and time-consuming. Computational prediction methods can greatly expedite the identification of candidate biomarkers. RESULTS: Here, we present a novel computational model named GTGenie for predicting biomarker-disease associations based on graph and text features. In GTGenie, a graph attention network is used to characterize diverse similarities of biomarkers and diseases from heterogeneous information resources. Meanwhile, a pretrained BERT-based model is applied to learn text-based representations of biomarker-disease relations from the biomedical literature. The captured graph and text features are then integrated in a bimodal fusion network to model the hybrid entity representation. Finally, inductive matrix completion is adopted to infer the missing entries and reconstruct the relation matrix, from which unknown biomarker-disease associations are predicted. Experimental results on the HMDD, HMDAD and LncRNADisease data sets showed that GTGenie achieves prediction performance competitive with other state-of-the-art methods. AVAILABILITY: The source code of GTGenie and the test data are available at: https://github.com/Wolverinerine/GTGenie.
Affiliation(s)
- Minghao Yang
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
| | - Zhi-An Huang
- Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China
| | - Wenhao Gu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China; GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
| | - Kun Han
- GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
| | - Wenying Pan
- GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
| | - Xiao Yang
- GeneGenieDx Corp, 160 E Tasman Dr, San Jose, CA 95134
| | - Zexuan Zhu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China
| |
|
40
|
Wang Y, Lei X, Lu C, Pan Y. Predicting Microbe-Disease Association Based on Multiple Similarities and LINE Algorithm. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2399-2408. [PMID: 34014827 DOI: 10.1109/tcbb.2021.3082183] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
Numerous microbes have been found to have vital impacts on human health by affecting biological processes. Exploring potential associations between microbes and diseases will therefore promote the understanding and diagnosis of diseases. In this study, we present a novel computational model, named MSLINE, to infer potential microbe-disease associations by integrating Multiple Similarities and Large-scale Information Network Embedding (LINE) based on known associations. Specifically, starting from the known microbe-disease associations in the Human Microbe-Disease Association Database, we first expand the set of known associations by collecting proven associations from the existing literature. We then construct a microbe-disease heterogeneous network (MDHN) by integrating known associations and multiple similarities (including Gaussian interaction profile kernel similarity, microbe function similarity, disease semantic similarity and disease-symptom similarity). After that, we apply random walk and the LINE algorithm to the MDHN to learn its structure information. Finally, we score the microbe-disease associations according to the structure information of every node. In leave-one-out cross-validation and 5-fold cross-validation, MSLINE performs better than other existing methods. Moreover, case studies of different diseases showed that MSLINE can predict potential microbe-disease associations efficiently.
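One common way to represent a microbe-disease heterogeneous network of the kind described in this abstract is a block adjacency matrix that places the microbe similarities and disease similarities on the diagonal blocks and the known associations on the off-diagonal blocks. The sketch below assumes that layout; the function name and toy values are illustrative and are not taken from the MSLINE paper.

```python
def heterogeneous_adjacency(microbe_sim, disease_sim, assoc):
    """Assemble the block matrix [[S_m, A], [A^T, S_d]] as a list of rows.

    microbe_sim: microbes x microbes similarity matrix S_m.
    disease_sim: diseases x diseases similarity matrix S_d.
    assoc:       microbes x diseases 0/1 association matrix A.
    """
    n_m, n_d = len(assoc), len(assoc[0])
    # Top block row: microbe similarities followed by each microbe's associations.
    top = [list(microbe_sim[i]) + list(assoc[i]) for i in range(n_m)]
    # Bottom block row: transposed associations followed by disease similarities.
    bottom = [
        [assoc[i][j] for i in range(n_m)] + list(disease_sim[j])
        for j in range(n_d)
    ]
    return top + bottom

# Toy inputs (made up): 2 microbes, 3 diseases.
S_m = [[1.0, 0.3], [0.3, 1.0]]
S_d = [[1.0, 0.2, 0.0], [0.2, 1.0, 0.6], [0.0, 0.6, 1.0]]
A = [[1, 0, 1], [0, 1, 0]]
H = heterogeneous_adjacency(S_m, S_d, A)
```

The resulting 5 x 5 matrix is symmetric whenever both similarity matrices are, so it can be used directly as a weighted undirected graph for random walks or embedding methods.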
|
41
|
Chen Y, Lei X. Metapath Aggregated Graph Neural Network and Tripartite Heterogeneous Networks for Microbe-Disease Prediction. Front Microbiol 2022; 13:919380. [PMID: 35711758 PMCID: PMC9194683 DOI: 10.3389/fmicb.2022.919380] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/13/2022] [Accepted: 04/29/2022] [Indexed: 11/25/2022] Open
Abstract
A growing number of studies have shown that understanding microbe-disease associations can not only reveal the pathogenesis of diseases but also promote their diagnosis and prognosis. Because traditional medical experiments are time-consuming and expensive, many computational methods have been proposed in recent years to identify potential microbe-disease associations. In this study, we propose a method based on a heterogeneous network and a metapath aggregated graph neural network (MAGNN) to predict microbe-disease associations, called MATHNMDA. First, we introduce microbe-drug interactions, drug-disease associations, and microbe-disease associations to construct a microbe-drug-disease heterogeneous network, which we take as input to MAGNN. Second, for each layer of MAGNN, we carry out intra-metapath aggregation with a multi-head attention mechanism to learn the structural and semantic information embedded in the target node context, the metapath-based neighbor nodes, and the context between them, by encoding the metapath instances under the metapath definition mode. We then use inter-metapath aggregation with an attention mechanism to combine the semantic information of all the different metapaths. Third, we obtain the final embeddings of the microbe nodes and disease nodes from the output of the last layer of the MAGNN. Finally, we predict potential microbe-disease associations by reconstructing the microbe-disease association matrix. In addition, we evaluated the performance of MATHNMDA by comparing it with its variants and with some state-of-the-art methods, and by testing it on different datasets. The results suggest that MATHNMDA is an effective prediction method. Case studies on asthma, inflammatory bowel disease (IBD), and coronavirus disease 2019 (COVID-19) further validate the effectiveness of MATHNMDA.
Affiliation(s)
- Yali Chen
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| |
|
42
|
Wang CY, Wen QF, Wang QQ, Kuang X, Dong C, Deng ZX, Guo FB. Discovery of Drug Candidates for Specific Human Disease Based on Natural Products of Gut Microbes. Front Microbiol 2022; 13:896740. [PMID: 35783383 PMCID: PMC9240467 DOI: 10.3389/fmicb.2022.896740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 05/19/2022] [Indexed: 11/24/2022] Open
Abstract
The beneficial metabolites of the microbiome could be used as a tool for screening drugs with therapeutic potential for various human diseases. Narrowing down the range of beneficial metabolite candidates for a specific disease is a key first step before further validation in model organisms. Here, we propose the hypothesis that metabolites commonly present in multiple beneficial (or negatively associated) bacteria have a high probability of being effective drug candidates for specific diseases. Following this hypothesis, we screened metabolites associated with seven human diseases. For type I diabetes, 45 out of 88 screened metabolites had been reported as potential drugs in the literature, and 18 of these metabolites were specific to type I diabetes. Additionally, metabolite correlation could to some extent reflect relationships between diseases. Our results demonstrate the potential of bioinformatic mining of gut microbial metabolites as drug candidates, based on the numerous reported microbe-disease associations and the Virtual Metabolic Human database. More refined methods will need to be developed to ensure more accurate predictions.
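The screening hypothesis in this abstract, that metabolites shared by several beneficial bacteria are promising candidates, amounts to counting how many beneficial microbes produce each metabolite. The sketch below assumes a simple threshold rule; the function name, the `min_microbes` threshold and all microbe and metabolite labels are hypothetical placeholders, not data from the study or from the Virtual Metabolic Human database.

```python
from collections import Counter

def shared_metabolites(microbe_metabolites, beneficial_microbes, min_microbes=2):
    """Return metabolites produced by at least `min_microbes` of the beneficial microbes.

    microbe_metabolites: dict mapping a microbe name to an iterable of its metabolites.
    beneficial_microbes: microbes negatively associated with the disease of interest.
    """
    counts = Counter()
    for microbe in beneficial_microbes:
        # Use a set so that a metabolite is counted once per microbe.
        counts.update(set(microbe_metabolites.get(microbe, ())))
    return sorted(m for m, c in counts.items() if c >= min_microbes)

# Hypothetical placeholder data.
microbe_metabolites = {
    "microbe_A": ["metabolite_1", "metabolite_2"],
    "microbe_B": ["metabolite_2", "metabolite_3"],
    "microbe_C": ["metabolite_2", "metabolite_3", "metabolite_4"],
    "microbe_D": ["metabolite_4"],
}
beneficial = ["microbe_A", "microbe_B", "microbe_C"]
candidates = shared_metabolites(microbe_metabolites, beneficial, min_microbes=2)
```

Raising `min_microbes` narrows the candidate list to metabolites with broader support among the beneficial microbes.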
Affiliation(s)
- Cheng-Yu Wang
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Qing-Feng Wen
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Qiao-Qiao Wang
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Xia Kuang
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Chuan Dong
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Zi-Xin Deng
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
| | - Feng-Biao Guo
- Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
- *Correspondence: Feng-Biao Guo
| |
|
43
|
Yin MM, Liu JX, Gao YL, Kong XZ, Zheng CH. NCPLP: A Novel Approach for Predicting Microbe-Associated Diseases With Network Consistency Projection and Label Propagation. IEEE Transactions on Cybernetics 2022; 52:5079-5087. [PMID: 33119529 DOI: 10.1109/tcyb.2020.3026652] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 06/11/2023]
Abstract
A growing number of clinical studies have provided substantial evidence of a close relationship between microbes and disease, so it is necessary to infer potential microbe-disease associations. Traditional approaches validate these associations experimentally, which often consumes considerable materials and time. Hence, more reliable computational methods are needed to predict disease-associated microbes. In this article, an innovative method for predicting microbe-disease associations is proposed, based on network consistency projection and label propagation (NCPLP). Whereas most existing algorithms use only the Gaussian interaction profile (GIP) kernel similarity as the similarity criterion between microbe pairs and disease pairs, this model additionally uses Medical Subject Headings descriptors to calculate disease semantic similarity and 16S rRNA gene sequences to calculate microbe functional similarity. Given the gene sequence information, we use two conventional tools (BLAST+ and MEGA7) to assess the similarity between each pair of microbes from different perspectives. In particular, network consistency projection is used to obtain network projection scores from the microbe space and the disease space. Finally, label propagation is used to predict disease-related microbes. NCPLP achieves better performance on various evaluation indicators and discovers a greater number of potential associations between microbes and diseases. Case studies further confirm the reliable prediction performance of NCPLP. In conclusion, NCPLP is able to discover underlying microbe-disease associations and can support biological studies.
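As an illustration of the label propagation step named in this abstract, the following is a minimal sketch of a generic propagation rule, F <- alpha * S * F + (1 - alpha) * Y, over a row-normalized similarity matrix S. This is a textbook formulation, not necessarily the exact NCPLP update; `alpha`, the iteration count, and the toy similarity values are assumptions.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def label_propagation(similarity, labels, alpha=0.5, iterations=50):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y for one disease.

    similarity: square microbe-microbe similarity matrix (list of lists)
    labels: known associations with the disease (1 = known, 0 = unknown)
    """
    s = row_normalize(similarity)
    scores = list(labels)
    for _ in range(iterations):
        scores = [
            alpha * sum(s_ij * f_j for s_ij, f_j in zip(row, scores))
            + (1 - alpha) * y
            for row, y in zip(s, labels)
        ]
    return scores


# Toy example: microbe 0 is known to be associated with the disease;
# microbe 1 is very similar to microbe 0, microbe 2 is not.
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
labels = [1, 0, 0]
scores = label_propagation(similarity, labels)
print(scores)
```

In this toy case the unlabeled microbe that is most similar to the known one receives the higher propagated score, which is the intuition behind using similarity networks for association prediction.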
|
44
|
Gu W, Yang X, Yang M, Han K, Pan W, Zhu Z. MarkerGenie: an NLP-enabled text-mining system for biomedical entity relation extraction. Bioinformatics Advances 2022; 2:vbac035. [PMID: 36699388 PMCID: PMC9710573 DOI: 10.1093/bioadv/vbac035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/21/2022] [Revised: 05/04/2022] [Accepted: 05/09/2022] [Indexed: 01/28/2023]
Abstract
Motivation: Natural language processing (NLP) tasks aim to convert unstructured text data (e.g. articles or dialogues) into structured information. Recent years have seen fundamental advances in NLP techniques, which are now widely used in applications such as financial text mining, news recommendation and machine translation. However, their application in the biomedical space remains challenging owing to a lack of labeled data and to ambiguities and inconsistencies in biological terminology. In biomedical marker discovery studies, tools that rely on NLP models to automatically and accurately extract relations between biomedical entities are valuable because they can survey the available literature more thoroughly, and hence give a less biased result than manual curation. In addition, the speed of machine reading helps to orient research and development quickly. Results: To address these needs, we integrated automatic training data labeling, rule-based biological terminology cleaning and a more accurate NLP model for binary associative and multi-relation prediction into the MarkerGenie program. We demonstrated the effectiveness of the proposed methods in identifying relations between biomedical entities on various benchmark datasets and case studies. Availability and implementation: MarkerGenie is available at https://www.genegeniedx.com/markergenie/. Data for model training and evaluation, term lists of biomedical entities, details of the case studies and all trained models are provided at https://drive.google.com/drive/folders/14RypiIfIr3W_K-mNIAx9BNtObHSZoAyn?usp=sharing. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Wenhao Gu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China; GeneGenieDx Corp, San Jose, CA 95134, USA
| | - Xiao Yang
- GeneGenieDx Corp, San Jose, CA 95134, USA
| | - Minhao Yang
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
| | - Kun Han
- GeneGenieDx Corp, San Jose, CA 95134, USA
| | | | - Zexuan Zhu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- To whom correspondence should be addressed.
| |
|
45
|
Toor R, Chana I. Exploring diet associations with Covid-19 and other diseases: a Network Analysis-based approach. Med Biol Eng Comput 2022; 60:991-1013. [PMID: 35171411 PMCID: PMC8852958 DOI: 10.1007/s11517-022-02505-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/30/2021] [Accepted: 01/10/2022] [Indexed: 02/07/2023]
Abstract
The current global pandemic, Covid-19, is a severe threat to human health and existence, especially because the virus mutates very frequently. Being a novel disease, Covid-19 affects patients with comorbidities and is predicted to have long-term consequences, even for those who have recovered from it. To clearly recognize its impact, it is important to comprehend the complex relationship between Covid-19 and other diseases. It has also been observed that people with a good immune system are less susceptible to the disease. If a correlation between Covid-19, other diseases, and diet is established, caregivers would be able to improve their further course of medical action and recommendations. Network Analysis is one technique that can bring forth such complex interdependencies and associations. In this paper, a Network Analysis-based approach is proposed for analyzing the interplay of diets/foods with Covid-19 and other diseases. Relationships between Covid-19, diabetes mellitus type 2 (T2DM), non-alcoholic fatty liver disease (NAFLD), and diets have been curated, visualized, and further analyzed in this study so as to predict unknown associations. Network algorithms including the Louvain graph algorithm (LA), K nearest neighbors (KNN), and PageRank (PR) have been employed to predict a total of 60 disease-diet associations, of which 46 were found to be significant either in disease risk prevention/mitigation or in disease progression, as validated using PubMed literature. A precision of 76.7% has been achieved, which is significant considering the involvement of a novel disease like Covid-19. The generated interdependencies can be further explored by medical professionals and caregivers in order to plan healthy eating patterns for Covid-19 patients. The proposed approach can also be utilized for finding beneficial diets for different combinations of comorbidities with Covid-19 as per the underlying health conditions of a patient.
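The reported precision follows directly from the two counts given in the abstract (46 validated associations out of 60 predicted). A short check, with variable names chosen here purely for illustration:

```python
predicted_associations = 60  # disease-diet associations predicted (from the abstract)
validated_associations = 46  # predictions supported by PubMed literature (from the abstract)

# Precision = validated predictions / all predictions
precision = validated_associations / predicted_associations
print(f"{precision:.1%}")  # 76.7%
```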
Affiliation(s)
- Rashmeet Toor
- Cloud and IoT Research Lab, Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
| | - Inderveer Chana
- Cloud and IoT Research Lab, Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
| |
|
46
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and the life sciences, a considerable quantity of biomedical data has accumulated, on the basis of which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in predicting potential associations between microbes, drugs and diseases, from biological data to computational models. First, we introduce in detail the widely used datasets relevant to identifying potential relationships between microbes, drugs and diseases. We then divide a series of representative computational models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improving predictive performance.
Affiliation(s)
- Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Yaqin Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Xiaoyu Yang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Linai Kuang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Pengyao Ping
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
| |
|
47
|
Zhao Y, Chen L, Chen L, Huang J, Chen S, Yu Z. Exploration of the Potential Relationship Between Gut Microbiota Remodeling Under the Influence of High-Protein Diet and Crohn's Disease. Front Microbiol 2022; 13:831176. [PMID: 35308389 PMCID: PMC8927681 DOI: 10.3389/fmicb.2022.831176] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/08/2021] [Accepted: 01/07/2022] [Indexed: 12/20/2022] Open
Abstract
Diet and gut microbiota are both important factors in the pathogenesis of Crohn’s disease, and changes in diet can alter the gut microbiome. However, interactions within the gut microbiota under high-protein diet (HPD) intervention remain insufficiently explored. We analyzed the gut microbial network and marker taxa of patients with Crohn’s disease from a public database (GMrepo, https://gmrepo.humangut.info) and combined this with metagenomic sequencing to investigate changes in the composition and function of the intestinal microbiome in mice fed an HPD. The results showed an indirect negative correlation between Escherichia coli and Lachnospiraceae in patients with Crohn’s disease, and Escherichia coli was a marker for both Crohn’s disease and HPD intervention. In addition, the enriched ortholog HH_1414 (one of the orthologs in eggNOG), which is related to tryptophan metabolism, came from Helicobacter, whereas the orthologs (OGs) that were reduced after HPD intervention were mainly contributed by Lachnospiraceae. Our research indicates that some compositional changes in the gut microbiota after HPD intervention are consistent with those in patients with Crohn’s disease, providing insights into the potential impact of HPD-altered gut microbes on Crohn’s disease.
Affiliation(s)
- Yiming Zhao
- Department of Gastroenterology, Xiangya Hospital, Central South University, Changsha, China; Department of Microbiology, School of Basic Medical Science, Central South University, Changsha, China
| | - Lulu Chen
- Department of Gastroenterology, Xiangya Hospital, Central South University, Changsha, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Liyu Chen
- Department of Microbiology, School of Basic Medical Science, Central South University, Changsha, China
| | - Jing Huang
- Department of Parasitology, School of Basic Medical Science, Central South University, Changsha, China
| | - Shuijiao Chen
- Department of Gastroenterology, Xiangya Hospital, Central South University, Changsha, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
| | - Zheng Yu
- Department of Microbiology, School of Basic Medical Science, Central South University, Changsha, China
| |
|
48
|
Huang YA, Huang ZA, Li JQ, You ZH, Wang L, Yi HC, Yu CQ. GBDR: a Bayesian model for precise prediction of pathogenic microorganisms using 16S rRNA gene sequences. BMC Genomics 2022; 22:916. [PMID: 35296232 PMCID: PMC8925046 DOI: 10.1186/s12864-022-08423-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2021] [Accepted: 02/25/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Recent evidence suggests that human microorganisms participate in important biological activities in the human body. Dysfunction of host-microbiota interactions can lead to complex human disorders, and knowledge of these interactions can provide valuable insights into the pathological mechanisms of diseases. However, it is time-consuming and costly to identify disorder-specific microbes from the biological "haystack" by routine wet-lab experiments alone. With the development of next-generation sequencing and omics-based trials, it is imperative to develop computational models for predicting microbe-disease associations on a large scale. RESULTS: Based on the known microbe-disease associations derived from the Human Microbe-Disease Association Database (HMDAD), the proposed model shows reliable performance, with areas under the ROC curve (AUC) of 0.9456 and 0.8866 in leave-one-out cross-validation and five-fold cross-validation, respectively. In case studies of colorectal carcinoma, 80% of the top 20 predicted microbes have been experimentally confirmed in the published literature. CONCLUSION: Based on the assumption that functionally similar microbes tend to share similar interaction patterns with human diseases, we propose a group-based computational model of Bayesian disease-oriented ranking to prioritize the microbes most likely to be associated with various human diseases. Based on gene sequence information, two computational tools (BLAST+ and MEGA 7) are used to measure microbe-microbe similarity from different perspectives. Disease-disease similarity is calculated by capturing the hierarchy information in the Medical Subject Headings (MeSH) data. The experimental results illustrate the accuracy and effectiveness of the proposed model. This work is expected to facilitate the characterization and identification of promising microbial biomarkers.
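The AUC values quoted in this abstract measure how well predicted scores rank true associations above non-associations. A minimal sketch of the pairwise definition of AUC is given below; the function name `auc` and the toy scores are illustrative and are not taken from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive receives a higher
    score than a randomly chosen negative; ties count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy scores: one positive (0.3) is ranked below one negative (0.8)
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))  # 0.75
```

In leave-one-out cross-validation, each known association would be hidden in turn, scored against the candidate pairs, and the resulting scores pooled into a computation of this kind.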
Affiliation(s)
- Yu-An Huang
- Department of Information Engineering, Xijing University, Xi'an, 710123, China.
| | - Zhi-An Huang
- Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China
| | - Jian-Qiang Li
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China.
| | - Zhu-Hong You
- Department of Information Engineering, Xijing University, Xi'an, 710123, China
| | - Lei Wang
- Guangxi Academy of Science, Nanning, 530000, China
| | - Hai-Cheng Yi
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830000, China
| | - Chang-Qing Yu
- Department of Information Engineering, Xijing University, Xi'an, 710123, China
| |
|
49
|
Xu D, Xu H, Zhang Y, Gao R. Novel Collaborative Weighted Non-negative Matrix Factorization Improves Prediction of Disease-Associated Human Microbes. Front Microbiol 2022; 13:834982. [PMID: 35369503 PMCID: PMC8965656 DOI: 10.3389/fmicb.2022.834982] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/14/2021] [Accepted: 01/19/2022] [Indexed: 12/14/2022] Open
Abstract
Extensive clinical and biomedical studies have shown that the microbiome plays a prominent role in human health. Identifying potential microbe–disease associations (MDAs) can help reveal the pathological mechanisms of human diseases and is useful for their prevention, diagnosis, and treatment. It is therefore necessary to develop effective computational models that reduce the cost and time of biological experiments. Here, we developed a novel machine learning-based joint framework called CWNMF-GLapRLS for human MDA prediction, using the proposed collaborative weighted non-negative matrix factorization (CWNMF) technique and graph Laplacian regularized least squares. In particular, to fuse more similarity information, we calculated the functional similarity of microbes. To deal with missing values and overcome the data sparsity problem, we proposed a collaborative weighted NMF technique to reconstruct the original association matrix. In addition, we developed a graph Laplacian regularized least-squares method for prediction. In fivefold and leave-one-out cross-validation, our method achieved the best performance compared with five state-of-the-art methods on the benchmark dataset. Case studies further showed that the proposed method is an effective tool for predicting potential MDAs and can support biomedical researchers.
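To make the idea of reconstructing a sparse association matrix with weighted NMF concrete, here is a minimal sketch of generic weighted NMF with multiplicative updates, minimizing the weighted squared error between A and U V. It is not the authors' collaborative CWNMF formulation; the rank, weights (a lower weight for unobserved zeros), iteration count, and toy matrix are all assumptions.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def hadamard(a, b):
    """Element-wise product of two equally shaped matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def weighted_error(a, w, u, v):
    """Weighted squared reconstruction error: sum of w_ij * (a_ij - (UV)_ij)^2."""
    approx = matmul(u, v)
    return sum(
        w[i][j] * (a[i][j] - approx[i][j]) ** 2
        for i in range(len(a))
        for j in range(len(a[0]))
    )


def weighted_nmf(a, w, rank=2, iterations=200, eps=1e-9, seed=0):
    """Factor A (n x m) into non-negative U (n x rank) and V (rank x m)."""
    rng = random.Random(seed)
    n, m = len(a), len(a[0])
    u = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    v = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    wa = hadamard(w, a)
    for _ in range(iterations):
        # U <- U * ((W∘A) V^T) / ((W∘(UV)) V^T)
        num = matmul(wa, transpose(v))
        den = matmul(hadamard(w, matmul(u, v)), transpose(v))
        u = [[u[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)] for i in range(n)]
        # V <- V * (U^T (W∘A)) / (U^T (W∘(UV)))
        num = matmul(transpose(u), wa)
        den = matmul(transpose(u), hadamard(w, matmul(u, v)))
        v = [[v[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)] for k in range(rank)]
    return u, v


# Toy microbe x disease matrix: 1 = known association, 0 = unknown
association = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
# Known associations get full weight; unknown entries get a lower weight
weights = [[1.0 if x else 0.2 for x in row] for row in association]

u, v = weighted_nmf(association, weights)
reconstructed = matmul(u, v)
print(weighted_error(association, weights, u, v))
```

Entries of `reconstructed` at positions that were 0 in the input can be read as scores for candidate associations; a method such as the one in the abstract would then feed a reconstruction of this kind into a separate prediction step.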
Affiliation(s)
- Da Xu
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Hanxiao Xu
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, China
- *Correspondence: Yusen Zhang,
| | - Rui Gao
- School of Control Science and Engineering, Shandong University, Jinan, China
- Rui Gao,
| |
|
50
|
Wang L, Li H, Wang Y, Tan Y, Chen Z, Pei T, Zou Q. MDADP: A webserver integrating database and prediction tools for microbe-disease associations. IEEE J Biomed Health Inform 2022; 26:3427-3434. [PMID: 35254998 DOI: 10.1109/jbhi.2022.3156166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
Growing evidence has demonstrated that microbiota play important roles in the life processes of the human body. In recent years, various computational methods have been proposed for identifying potentially disease-associated microbes in order to save the costs of traditional biological experiments. However, the prediction performance of these methods is generally limited by outdated and incomplete datasets. Moreover, few studies so far provide visual predictive tools for inferring possible microbe-disease associations (MDAs). Hence, in this manuscript, a novel webserver called MDADP is proposed to identify latent MDAs, comprising a new MDA database together with interactive prediction tools for MDA studies. In the newly constructed MDA database, a total of 2019 known MDAs between 58 diseases and 703 microbes were first collected manually. Then, eight representative computational models were integrated, using the average ranking method and the co-confidence method respectively, to identify potential disease-related microbes. As a result, MDADP provides not only interactive features for users to access and capture MDA entities, but also effective tools for users to identify candidate microbes for different diseases. To our knowledge, MDADP is the first online platform that incorporates a new MDA database with comprehensive MDA prediction tools. We therefore believe that it will be a valuable source of information for research in microbiology and disease-related fields. MDADP can be accessed at http://mdadp.leelab2997.cn.
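One plausible reading of the "average ranking method" mentioned in this abstract is to rank candidate microbes within each model and then average the ranks across models. The sketch below follows that reading; the function name, the model names, and the scores are hypothetical, and the webserver's actual integration rule may differ.

```python
def average_rank(model_scores):
    """Combine several models by averaging each microbe's rank (1 = best).

    model_scores: dict mapping model name -> dict mapping microbe -> score,
    where every model scores the same set of microbes.
    Returns a list of (mean_rank, microbe) tuples, best candidate first.
    """
    microbes = sorted(next(iter(model_scores.values())))
    rank_sum = {m: 0 for m in microbes}
    for scores in model_scores.values():
        # Higher score means better rank within this model
        ordered = sorted(microbes, key=lambda m: -scores[m])
        for rank, microbe in enumerate(ordered, start=1):
            rank_sum[microbe] += rank
    n_models = len(model_scores)
    return sorted((rank_sum[m] / n_models, m) for m in microbes)


# Toy scores from three hypothetical models for one disease
model_scores = {
    "model_1": {"microbe_a": 0.9, "microbe_b": 0.5, "microbe_c": 0.1},
    "model_2": {"microbe_a": 0.4, "microbe_b": 0.8, "microbe_c": 0.2},
    "model_3": {"microbe_a": 0.7, "microbe_b": 0.6, "microbe_c": 0.3},
}
ranking = average_rank(model_scores)
print([microbe for _, microbe in ranking])  # ['microbe_a', 'microbe_b', 'microbe_c']
```

Averaging ranks rather than raw scores avoids having to calibrate the score scales of different models against each other.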
|