1
Xuan P, Guan C, Chen S, Gu J, Wang X, Nakaguchi T, Zhang T. Gating-Enhanced Hierarchical Structure Learning in Hyperbolic Space and Multi-scale Neighbor Topology Learning in Euclidean Space for Prediction of Microbe-Drug Associations. J Chem Inf Model 2024. [PMID: 39324410 DOI: 10.1021/acs.jcim.4c01340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2024]
Abstract
Identifying drug-related microbes may help us explore how the microbes affect the functions of drugs by promoting or inhibiting their effects. Most previous methods for the prediction of microbe-drug associations focused on integrating the attributes and topologies of microbe and drug nodes in Euclidean space. The heterogeneous network composed of microbes and drugs has a hierarchical structure, and the hyperbolic space is helpful for reflecting the structure. However, the previous methods did not fully exploit the structure. We propose a multi-space feature learning enhanced microbe-drug association prediction method, MFLP, to fuse the hierarchical structure of microbe and drug nodes in hyperbolic space and the multiscale neighbor topologies in Euclidean space. First, we project the nodes of the microbe-drug heterogeneous network on the sphere in hyperbolic space and then construct a topology which implies hierarchical structure and forms a hierarchical attribute embedding. The node information from multiple types of neighbor nodes with the new topological structure in the tangent plane space of a sphere is aggregated by the designed gating-enhanced hyperbolic graph neural network. Second, the gate at the node feature level is constructed to adaptively fuse the hierarchical features of microbe and drug nodes from two adjacent graph neural encoding layers. Third, multiple neighbor topological embeddings for each microbe and drug node are formed by neighborhood random walks on the microbe-drug heterogeneous network, and they cover neighborhood topologies with multiple scales, respectively. Finally, as each scale of topological embedding contains its specific neighborhood topology, we establish an independent graph convolutional neural network for the topology and form the topological representations of microbe and drug nodes in Euclidean space. 
The comparison experiments based on cross-validation showed that MFLP outperformed several advanced prediction methods, and the ablation experiments verified the effectiveness of MFLP's major innovations. The case studies on three drugs further demonstrated MFLP's ability to discover potential candidate microbes for given drugs.
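The abstract does not state the exact form of the feature-level gate. As an illustration only, one common way to fuse the features of a node from two adjacent encoding layers is a sigmoid gate that interpolates between them per feature. The function name `gated_fusion`, the weight layout, and the toy values below are assumptions made for this sketch, not MFLP's implementation.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_prev, h_curr, weights, bias):
    """Fuse two feature vectors of length d with a per-feature gate.

    weights has d rows of length 2*d and is applied to the concatenation
    [h_prev; h_curr]; bias has length d. (Hypothetical parameter layout.)
    """
    concat = list(h_prev) + list(h_curr)
    fused = []
    for k, (row, b) in enumerate(zip(weights, bias)):
        gate = sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        # gate close to 1 keeps the current layer, close to 0 keeps the previous one
        fused.append(gate * h_curr[k] + (1.0 - gate) * h_prev[k])
    return fused


# Toy example: zero weights and bias give a gate of 0.5, i.e. a plain average.
h_prev = [0.0, 2.0]
h_curr = [1.0, 4.0]
zero_weights = [[0.0] * 4 for _ in range(2)]
fused = gated_fusion(h_prev, h_curr, zero_weights, [0.0, 0.0])
```

In a trained model the weights and bias would be learned, so the gate could favor either layer differently for each feature.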
Affiliation(s)
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Chunhong Guan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Sentao Chen
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Jing Gu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Xiuju Wang
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
2
Kuang H, Liu X, Tan H, Zhang Z, Zeng B, Wang L. GLNNMDA: a multimodal prediction model for microbe-drug associations based on global and local features. Sci Rep 2024; 14:20847. [PMID: 39242712 PMCID: PMC11379827 DOI: 10.1038/s41598-024-71837-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 08/30/2024] [Indexed: 09/09/2024] Open
Abstract
Microbes have been demonstrated to be closely linked to diseases that pose a major threat to human health. Computing technologies can help researchers find potential microbe-drug associations more quickly and precisely. In this study, we introduced a novel computational prediction model called GLNNMDA based on global and local features of microbes and drugs to infer possible microbe-drug correlations. In GLNNMDA, we first constructed a heterogeneous network based on known microbe-drug relationships by integrating multiple similarity metrics of drugs and microbes. Subsequently, low-dimensional features were extracted for nodes in the heterogeneous network using a graph attention encoder. Next, we combined these low-dimensional features with multiple properties of microbes and drugs to form a new comprehensive feature matrix, used the GLF module to extract global and local features for microbes and drugs respectively, and then fused these global and local features to predict possible microbe-drug associations. Moreover, to evaluate the prediction performance of GLNNMDA, intensive comparative experiments and case studies were conducted on well-known public databases under fivefold cross-validation. The results showed that GLNNMDA obtained the highest AUC values as well as AUPR values of 0.9802 ± 0.0011, 0.9773 ± 0.0021 and 0.8586 ± 0.0004, 0.8008 ± 0.0031 in the two databases, MDAD and aBiofilm, respectively, compared to the state-of-the-art competing prediction methods. In addition, case studies of well-known microorganisms and drugs demonstrated the effectiveness of GLNNMDA in inferring potential microbe-drug associations, which implies that GLNNMDA may be a useful tool for microbe-drug association prediction in the future. The source code is available at: https://github.com/KuangHaiYue/GLNNMDA.git.
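The abstract does not specify how the heterogeneous network is assembled. A frequently used construction in this literature, shown here only as an assumed sketch, places the drug similarity and microbe similarity matrices on the diagonal blocks and the known association matrix (and its transpose) on the off-diagonal blocks. The function name and the toy matrices are hypothetical.

```python
def build_heterogeneous_adjacency(drug_sim, microbe_sim, assoc):
    """Assemble the block matrix [[S_d, A], [A^T, S_m]].

    drug_sim:    n_d x n_d drug similarity matrix
    microbe_sim: n_m x n_m microbe similarity matrix
    assoc:       n_d x n_m binary matrix of known drug-microbe associations
    """
    n_d, n_m = len(drug_sim), len(microbe_sim)
    if len(assoc) != n_d or any(len(row) != n_m for row in assoc):
        raise ValueError("assoc must have shape (n_drugs, n_microbes)")
    # Top rows: drug-drug similarities followed by drug-microbe associations.
    top = [list(drug_sim[i]) + list(assoc[i]) for i in range(n_d)]
    # Bottom rows: transposed associations followed by microbe-microbe similarities.
    bottom = [
        [assoc[i][j] for i in range(n_d)] + list(microbe_sim[j])
        for j in range(n_m)
    ]
    return top + bottom


# Toy example with 2 drugs and 3 microbes (values are made up).
drug_sim = [[1.0, 0.2], [0.2, 1.0]]
microbe_sim = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.0]]
assoc = [[1, 0, 0], [0, 1, 1]]
hetero = build_heterogeneous_adjacency(drug_sim, microbe_sim, assoc)
```

The resulting square matrix has one row per node (drugs first, then microbes) and can serve as the adjacency input of a graph encoder.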
Affiliation(s)
- Haiyue Kuang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Xin Liu
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Huilin Tan
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Zhen Zhang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Bin Zeng
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Lei Wang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
3
Xuan P, Xu Z, Cui H, Gu J, Liu C, Zhang T, Wu P. Dynamic category-sensitive hypergraph inferring and homo-heterogeneous neighbor feature learning for drug-related microbe prediction. Bioinformatics 2024; 40:btae562. [PMID: 39292557 PMCID: PMC11441325 DOI: 10.1093/bioinformatics/btae562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Revised: 08/29/2024] [Accepted: 09/17/2024] [Indexed: 09/20/2024]
Abstract
MOTIVATION Microbes in the human body play a crucial role in influencing the functions of drugs, as they can regulate the activities and toxicities of drugs. Most recent methods for predicting drug-microbe associations are based on graph learning. However, the relationships among multiple drugs and microbes are complex, diverse, and heterogeneous. Existing methods often fail to fully model the relationships. In addition, the attributes of drug-microbe pairs exhibit long-distance spatial correlations, which previous methods have not integrated effectively. RESULTS We propose a new prediction method named DHDMP which is designed to encode the relationships among multiple drugs and microbes and integrate the attributes of various neighbor nodes along with the pairwise long-distance correlations. First, we construct a hypergraph with dynamic topology, where each hyperedge represents a specific relationship among multiple drug nodes and microbe nodes. Considering the heterogeneity of node attributes across different categories, we developed a node category-sensitive hypergraph convolution network to encode these diverse relationships. Second, we construct homogeneous graphs for drugs and microbes respectively, as well as a drug-microbe heterogeneous graph, facilitating the integration of features from both homogeneous and heterogeneous neighbors of each target node. Third, we introduce a graph convolutional network with cross-graph feature propagation ability to transfer node features from homogeneous to heterogeneous graphs for enhanced neighbor feature representation learning. The propagation strategy aids in the deep fusion of features from both types of neighbors. Finally, we design spatial cross-attention to encode the attributes of drug-microbe pairs, revealing long-distance correlations among multiple pairwise attribute patches. The comprehensive comparison experiments showed that our method outperformed state-of-the-art methods for drug-microbe association prediction.
The ablation studies demonstrated the effectiveness of the node category-sensitive hypergraph convolution network, the graph convolutional network with cross-graph feature propagation, and the spatial cross-attention. Case studies on three drugs further showed DHDMP's potential application in discovering reliable candidate microbes for drugs of interest. AVAILABILITY AND IMPLEMENTATION Source codes and supplementary materials are available at https://github.com/pingxuan-hlju/DHDMP.
Affiliation(s)
- Ping Xuan
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Zelong Xu
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, VIC 3083, Australia
  - Australian Centre for AI in Medical Innovation, La Trobe University, Melbourne 3083, Australia
- Jing Gu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Cheng Liu
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Peiliang Wu
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
4
Zhu H, Hao H, Yu L. Identification of microbe-disease signed associations via multi-scale variational graph autoencoder based on signed message propagation. BMC Biol 2024; 22:172. [PMID: 39148051 PMCID: PMC11328394 DOI: 10.1186/s12915-024-01968-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Accepted: 08/01/2024] [Indexed: 08/17/2024] Open
Abstract
BACKGROUND Plenty of clinical and biomedical research has unequivocally highlighted the tremendous significance of the human microbiome in relation to human health. Identifying microbes associated with diseases is crucial for early disease diagnosis and advancing precision medicine. RESULTS Considering that the information about changes in microbial quantities under fine-grained disease states helps to enhance a comprehensive understanding of the overall data distribution, this study introduces MSignVGAE, a framework for predicting microbe-disease sign associations using signed message propagation. MSignVGAE employs a graph variational autoencoder to model noisy signed association data and extends the multi-scale concept to enhance representation capabilities. A novel strategy for propagating signed message in signed networks addresses heterogeneity and consistency among nodes connected by signed edges. Additionally, we utilize the idea of denoising autoencoder to handle the noise in similarity feature information, which helps overcome biases in the fused similarity data. MSignVGAE represents microbe-disease associations as a heterogeneous graph using similarity information as node features. The multi-class classifier XGBoost is utilized to predict sign associations between diseases and microbes. CONCLUSIONS MSignVGAE achieves AUROC and AUPR values of 0.9742 and 0.9601, respectively. Case studies on three diseases demonstrate that MSignVGAE can effectively capture a comprehensive distribution of associations by leveraging signed information.
Affiliation(s)
- Huan Zhu
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Hongxia Hao
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi'an, China
5
Su S, Liu M, Zhou J, Zhang J. GCGACNN: A Graph Neural Network and Random Forest for Predicting Microbe-Drug Associations. Biomolecules 2024; 14:946. [PMID: 39199334 PMCID: PMC11353181 DOI: 10.3390/biom14080946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Revised: 07/22/2024] [Accepted: 08/01/2024] [Indexed: 09/01/2024] Open
Abstract
The interaction between microbes and drugs encompasses the sourcing of pharmaceutical compounds, microbial drug degradation, the development of drug resistance genes, and the impact of microbial communities on host drug metabolism and immune modulation. These interactions significantly impact drug efficacy and the evolution of drug resistance. In this study, we propose a novel predictive model, termed GCGACNN. We first collected microbe, disease, and drug association data from multiple databases and the relevant literature to construct three association matrices and generate similarity feature matrices using Gaussian similarity functions. These association and similarity feature matrices were then input into a multi-layer Graph Neural Network for feature extraction, followed by a two-dimensional Convolutional Neural Network for feature fusion, ultimately establishing an effective predictive framework. Experimental results demonstrate that GCGACNN outperforms existing methods in predictive performance.
Affiliation(s)
- Shujuan Su
  - College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China
- Meiling Liu
  - College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China
- Jiyun Zhou
  - Lieber Institute, Johns Hopkins University, Baltimore, MD 21218, USA
- Jingfeng Zhang
  - School of Computer Science, The University of Auckland, Auckland 1142, New Zealand
6
Xie W, Yu J, Huang L, For LS, Zheng Z, Chen X, Wang Y, Liu Z, Peng C, Wong KC. DeepSeq2Drug: An expandable ensemble end-to-end anti-viral drug repurposing benchmark framework by multi-modal embeddings and transfer learning. Comput Biol Med 2024; 175:108487. [PMID: 38653064 DOI: 10.1016/j.compbiomed.2024.108487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 03/26/2024] [Accepted: 04/15/2024] [Indexed: 04/25/2024]
Abstract
Drug repurposing is promising in multiple scenarios, such as emerging viral outbreak controls and cost reductions of drug discovery. Traditional graph-based drug repurposing methods are limited to fast, large-scale virtual screens, as they constrain the counts for drugs and targets and fail to predict novel viruses or drugs. Moreover, though deep learning has been proposed for drug repurposing, only a few methods have been used, including a group of pre-trained deep learning models for embedding generation and transfer learning. Hence, we propose DeepSeq2Drug to tackle the shortcomings of previous methods. We leverage multi-modal embeddings and an ensemble strategy to complement the numbers of drugs and viruses and to guarantee the novel prediction. This framework (including the expanded version) involves four modal types: six NLP models, four CV models, four graph models, and two sequence models. In detail, we first make a pipeline and calculate the predictive performance of each pair of viral and drug embeddings. Then, we select the best embedding pairs and apply an ensemble strategy to conduct anti-viral drug repurposing. To validate the effect of the proposed ensemble model, a monkeypox virus (MPV) case study is conducted to reflect the potential predictive capability. This framework could be a benchmark method for further pre-trained deep learning optimization and anti-viral drug repurposing tasks. We also build software further to make the proposed model easier to reuse. The code and software are freely available at http://deepseq2drug.cs.cityu.edu.hk.
Affiliation(s)
- Weidun Xie
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
- Jixiang Yu
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
- Lei Huang
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
- Lek Shyuen For
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
- Zetian Zheng
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
- Xingjian Chen
  - Cutaneous Biology Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Yuchen Wang
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
- Zhichao Liu
  - Sir William Dunn School of Pathology, University of Oxford, UK
- Chengbin Peng
  - College of Information Science and Engineering, Ningbo University, Ningbo, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
  - Shenzhen Research Institute, City University of Hong Kong, Shenzhen, China
  - Hong Kong Institute for Data Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
7
Wu Z, Li S, Luo L, Ding P. HKFGCN: A novel multiple kernel fusion framework on graph convolutional network to predict microbe-drug associations. Comput Biol Chem 2024; 110:108041. [PMID: 38471354 DOI: 10.1016/j.compbiolchem.2024.108041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 12/29/2023] [Accepted: 02/27/2024] [Indexed: 03/14/2024]
Abstract
Accumulating clinical studies have consistently demonstrated that the microbes in the human body closely interact with the human host, actively participating in the regulation of drug effectiveness. Identifying the associations between microbes and drugs can facilitate drug discovery, and microbes have become a new target in antimicrobial drug development. However, the discovery of microbe-drug associations relies on clinical or biological experiments, which are not only time-consuming but also financially burdensome. Thus, the utilization of computational methods to predict microbe-drug associations holds promise for reducing costs and enhancing the efficiency of biological experiments. Here, we introduce a new computational method, called HKFGCN (Heterogeneous information Kernel Fusion Graph Convolution Network), to predict microbe-drug associations. Instead of extracting features from a single network as in previous studies, HKFGCN separately extracts topological information features from different networks, and further refines them by generating Gaussian kernel features. HKFGCN consists of three main steps. First, we constructed two similarity networks and a microbe-drug association network based on numerous biological data. Second, we employed two types of encoders to extract features from these networks. Next, Gaussian kernel features were obtained from the drug and microbe features at each layer. Finally, we reconstructed the bipartite microbe-drug graph based on the learned representations. Experimental results demonstrate the excellent performance of the HKFGCN model across different datasets using the cross-validation scheme. Additionally, we conducted case studies on human immunodeficiency virus, and the results were corroborated by existing literature. The prediction model's code is available at https://github.com/roll-of-bubble/HKFGCN.
Affiliation(s)
- Ziyu Wu
  - School of Computer Science, University of South China, Hengyang, Hunan 421001, China
- Shasha Li
  - Department of Electrical and Electronic Engineering, University of Hong Kong, 999077, Hong Kong, China
- Lingyun Luo
  - School of Computer Science, University of South China, Hengyang, Hunan 421001, China
  - Hunan Medical Big Data International Sci.&Tech. Innovation Cooperation Base, Hengyang, Hunan 421000, China
- Pingjian Ding
  - School of Computer Science, University of South China, Hengyang, Hunan 421001, China
8
Yang Z, Wang L, Zhang X, Zeng B, Zhang Z, Liu X. LCASPMDA: a computational model for predicting potential microbe-drug associations based on learnable graph convolutional attention networks and self-paced iterative sampling ensemble. Front Microbiol 2024; 15:1366272. [PMID: 38846568 PMCID: PMC11153849 DOI: 10.3389/fmicb.2024.1366272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Accepted: 05/06/2024] [Indexed: 06/09/2024] Open
Abstract
Introduction Numerous studies show that microbes in the human body are very closely linked to the human host and can affect the host by modulating the efficacy and toxicity of drugs. However, discovering potential microbe-drug associations through traditional wet-lab experiments is expensive and time-consuming; hence, it is important to develop effective computational models to detect possible microbe-drug associations. Methods In this manuscript, we proposed a new prediction model named LCASPMDA by combining the learnable graph convolutional attention network and the self-paced iterative sampling ensemble strategy to infer latent microbe-drug associations. In LCASPMDA, we first constructed a heterogeneous network based on newly downloaded known microbe-drug associations. Then, we adopted the learnable graph convolutional attention network to learn the hidden features of nodes in the heterogeneous network. After that, we utilized the self-paced iterative sampling ensemble strategy to select the most informative negative samples to train the Multi-Layer Perceptron (MLP) classifier and put the newly extracted hidden features into the trained MLP classifier to infer possible microbe-drug associations. Results and discussion Intensive experimental results on two public databases, MDAD and aBiofilm, showed that LCASPMDA could achieve better performance than state-of-the-art baseline methods in microbe-drug association prediction.
Affiliation(s)
- Lei Wang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- Zhen Zhang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- Xin Liu
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
9
Zhao J, Kuang L, Hu A, Zhang Q, Yang D, Wang C. OGNNMDA: a computational model for microbe-drug association prediction based on ordered message-passing graph neural networks. Front Genet 2024; 15:1370013. [PMID: 38689654 PMCID: PMC11058190 DOI: 10.3389/fgene.2024.1370013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2024] [Accepted: 03/14/2024] [Indexed: 05/02/2024] Open
Abstract
In recent years, many excellent computational models have emerged in microbe-drug association prediction, but their performance still has room for improvement. This paper proposed the OGNNMDA framework, which applied an ordered message-passing mechanism to distinguish the different neighbor information in each message propagation layer, and it achieved a better embedding ability through deeper network layers. First, the method calculates four similarity matrices based on microbe functional similarity, drug chemical structure similarity, and their respective Gaussian interaction profile kernel similarity. After integrating these similarity matrices, it concatenates the integrated similarity matrix with the known association matrix to obtain the microbe-drug heterogeneous matrix. Second, it uses a multi-layer ordered message-passing graph neural network encoder to encode the heterogeneous network and the known association information adjacency matrix, thereby obtaining the final embedding features of microbes and drugs. Finally, it inputs the embedding features into the bilinear decoder to get the final prediction results. OGNNMDA was evaluated through comparative experiments, ablation experiments, and case studies on the aBiofilm, MDAD and DrugVirus datasets using 5-fold cross-validation. The experimental results showed that OGNNMDA achieved the strongest prediction performance on aBiofilm and MDAD and obtained sub-optimal results on DrugVirus. In addition, the case studies on well-known drugs and microbes also support the effectiveness of the OGNNMDA method. Source codes and data are available at: https://github.com/yyzg/OGNNMDA.
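The Gaussian interaction profile (GIP) kernel similarity mentioned in this abstract is commonly computed from the rows (or columns) of the binary association matrix. The sketch below follows the widely used formulation in which the bandwidth is normalized by the mean squared norm of the profiles; whether OGNNMDA uses exactly this normalization is an assumption, as are the function name `gip_kernel` and the toy matrix.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel between interaction profiles (rows of a binary matrix).

    K[i][j] = exp(-gamma * ||p_i - p_j||^2), with
    gamma = gamma_prime / mean_i ||p_i||^2  (assumed normalization).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel


# Toy association matrix: 3 microbes (rows) x 4 drugs (columns).
assoc = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
microbe_gip = gip_kernel(assoc)
# Drug profiles are the columns of the same matrix.
drug_gip = gip_kernel([list(col) for col in zip(*assoc)])
```

Microbes that share associated drugs (rows 0 and 1 here) receive a higher similarity than microbes with disjoint profiles (rows 0 and 2).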
Affiliation(s)
- Jiabao Zhao
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- Linai Kuang
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- An Hu
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- Qi Zhang
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- Dinghai Yang
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- Chunxiang Wang
  - Hunan Institute of Engineering College of textile and clothing, Xiangtan, China
10
Kuang H, Zhang Z, Zeng B, Liu X, Zuo H, Xu X, Wang L. A novel microbe-drug association prediction model based on graph attention networks and bilayer random forest. BMC Bioinformatics 2024; 25:78. [PMID: 38378437 PMCID: PMC10877932 DOI: 10.1186/s12859-024-05687-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Accepted: 01/31/2024] [Indexed: 02/22/2024] Open
Abstract
BACKGROUND In recent years, the extensive use of drugs and antibiotics has led to increasing microbial resistance. Therefore, it has become crucial to explore deep connections between drugs and microbes. However, traditional biological experiments are very expensive and time-consuming, so it is meaningful to develop efficient computational models to forecast potential microbe-drug associations. RESULTS In this manuscript, we proposed a novel prediction model called GARFMDA by combining graph attention networks and bilayer random forest to infer probable microbe-drug correlations. In GARFMDA, we first constructed two different microbe-drug networks by integrating different microbe-drug-disease correlation indices. Then, based on multiple measures of similarity, we constructed a unique feature matrix for drugs and microbes respectively. Next, we fed these newly obtained microbe-drug networks together with the feature matrices into the graph attention network to extract low-dimensional feature representations for drugs and microbes separately. Thereafter, these low-dimensional feature representations, along with the feature matrices, were input into the first layer of the bilayer random forest model to obtain the contribution values of all features. Finally, after removing features with low contribution values, these contribution values were fed into the second layer of the bilayer random forest to detect potential links between microbes and drugs. CONCLUSIONS Experimental results and case studies show that GARFMDA can achieve better prediction performance than state-of-the-art approaches, which means that GARFMDA may be a useful tool in the field of microbe-drug association prediction in the future. The source code of GARFMDA is available at https://github.com/KuangHaiYue/GARFMDA.git.
Affiliation(s)
- Haiyue Kuang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Zhen Zhang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Bin Zeng
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Xin Liu
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Hao Zuo
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Xingye Xu
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Lei Wang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
11
Zhu B, Yu HY, Du BX, Shi JY. DMGL-MDA: A dual-modal graph learning method for microbe-drug association prediction. Methods 2024; 222:51-56. [PMID: 38184219 DOI: 10.1016/j.ymeth.2023.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2023] [Revised: 12/26/2023] [Accepted: 12/28/2023] [Indexed: 01/08/2024] Open
Abstract
The interaction between human microbes and drugs can significantly impact human physiological functions. It is crucial to identify potential microbe-drug associations (MDAs) before drug administration. However, conventional biological experiments to identify MDAs are plagued by drawbacks such as long duration, high cost, and potential risks. In contrast, computational approaches can speed up the screening of MDAs at a low cost. Most computational models use a drug similarity matrix as the initial feature representation of drugs and stack graph neural network layers to extract the features of network nodes. However, different calculation methods result in distinct similarity matrices, and message passing in graph neural networks (GNNs) induces over-smoothing and over-squashing, thereby impacting the performance of the model. To address these issues, we proposed a novel graph representation learning model, dual-modal graph learning for microbe-drug association prediction (DMGL-MDA). It comprises a dual-modal embedding module, a bipartite graph network embedding module, and a predictor module. To assess the performance of DMGL-MDA, we compared it against state-of-the-art methods using two benchmark datasets. Through cross-validation, we illustrated the superiority of DMGL-MDA. Furthermore, we conducted ablation experiments and case studies to validate the effectiveness of the model.
Affiliation(s)
- Bei Zhu
- School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- Hao-Yang Yu
- School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- Bing-Xue Du
- School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- Jian-Yu Shi
- School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China.

12
Tan H, Zhang Z, Liu X, Chen Y, Yang Z, Wang L. MDSVDNV: predicting microbe-drug associations by singular value decomposition and Node2vec. Front Microbiol 2024; 14:1303585. [PMID: 38260900 PMCID: PMC10800927 DOI: 10.3389/fmicb.2023.1303585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 12/19/2023] [Indexed: 01/24/2024]
Abstract
Introduction: Recent research has demonstrated that microbes are crucial for the growth and development of the human body, the movement of nutrients, and human health. Diseases may arise as a result of disruptions and imbalances in the microbiome. The pathological investigation of associated diseases and the advancement of clinical medicine can both benefit from the identification of drug-associated microbes. Methods: In this article, we proposed a new prediction model called MDSVDNV to infer potential microbe-drug associations, in which the Node2vec network embedding approach and the singular value decomposition (SVD) matrix decomposition method were first adopted to produce linear and non-linear representations of microbe interactions. Results and discussion: Compared with state-of-the-art competitive methods, extensive experimental results demonstrated that MDSVDNV could achieve the best AUC value of 98.51% under 5-fold CV, which indicated that MDSVDNV outperformed existing competing models and may be an effective method for discovering latent microbe-drug associations in the future.
Affiliation(s)
- Zhen Zhang
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
- Lei Wang
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China

13
Zhu H, Hao H, Yu L. Identifying disease-related microbes based on multi-scale variational graph autoencoder embedding Wasserstein distance. BMC Biol 2023; 21:294. [PMID: 38115088 PMCID: PMC10731776 DOI: 10.1186/s12915-023-01796-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/07/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023]
Abstract
BACKGROUND Extensive clinical and biomedical research has demonstrated that microbes are crucial to human health. Identifying associations between microbes and diseases can not only reveal potential disease mechanisms, but also facilitate early diagnosis and promote precision medicine. However, owing to data perturbation and unsatisfactory latent representations, there is significant room for improvement. RESULTS In this work, we proposed a novel framework, Multi-scale Variational Graph AutoEncoder embedding Wasserstein distance (MVGAEW), to predict disease-related microbes, which had the ability to resist data perturbation and effectively generate latent representations for both microbes and diseases from the perspective of distribution. First, we calculated multiple similarities and integrated them through similarity network confusion. Subsequently, we obtained node latent representations by an improved variational graph autoencoder. Ultimately, an XGBoost classifier was employed to predict potential disease-related microbes. We also introduced multi-order node embedding reconstruction to enhance the representation capacity and performed ablation studies to evaluate the contribution of each section of our model. Moreover, we conducted experiments on common drugs and case studies, including Alzheimer's disease, Crohn's disease, and colorectal neoplasms, to validate the effectiveness of our framework. CONCLUSIONS Notably, our model exceeded other current state-of-the-art methods, exhibiting a great improvement on the HMDAD database.
Affiliation(s)
- Huan Zhu
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Hongxia Hao
- School of Computer Science and Technology, Xidian University, Xi'an, China.
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, China.

14
Zhang L, Ouyang C, Liu Y, Liao Y, Gao Z. Multimodal contrastive representation learning for drug-target binding affinity prediction. Methods 2023; 220:126-133. [PMID: 37952703 DOI: 10.1016/j.ymeth.2023.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 10/28/2023] [Accepted: 11/06/2023] [Indexed: 11/14/2023]
Abstract
In the biomedical field, the efficacy of most drugs is demonstrated by their interactions with targets, and accurate prediction of the strength of drug-target binding is extremely important for drug development efforts. Traditional bioassay-based drug-target binding affinity (DTA) prediction methods cannot meet the needs of drug R&D in the era of big data. Recent years have witnessed significant success of deep learning-based models on the DTA prediction task. However, these models only considered a single modality of drug and target information, and some valuable information was not fully utilized. In fact, the information of different modalities of drug and target can complement each other, and more valuable information can be obtained by fusing the information of different modalities. In this paper, we introduce a multimodal information fusion model for DTA prediction called FMDTA, which fully considers drug/target information in both string and graph modalities and balances the feature representations of different modalities by a contrastive learning approach. In addition, we exploited the alignment information of drug atoms and target residues to capture the positional information of string patterns, which can extract more useful feature information from SMILES and target sequences. Experimental results on two benchmark datasets show that FMDTA outperforms the state-of-the-art model, demonstrating the feasibility and excellent feature capture capability of FMDTA. The code of FMDTA and the data are available at: https://github.com/bestdoubleLin/FMDTA.
Affiliation(s)
- Linlin Zhang
- School of Computer, University of South China, Hengyang, China
- Chunping Ouyang
- School of Computer, University of South China, Hengyang, China.
- Yongbin Liu
- School of Computer, University of South China, Hengyang, China
- Yiming Liao
- The Second Affiliated Hospital, Hengyang Medical School, University of South China, Hengyang, China
- Zheng Gao
- Department of Information and Library Science, Indiana University Bloomington, Bloomington, United States

15
Zhou Z, Zhuo L, Fu X, Zou Q. Joint deep autoencoder and subgraph augmentation for inferring microbial responses to drugs. Brief Bioinform 2023; 25:bbad483. [PMID: 38171927 PMCID: PMC10764208 DOI: 10.1093/bib/bbad483] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 09/22/2023] [Revised: 10/25/2023] [Accepted: 11/30/2023] [Indexed: 01/05/2024]
Abstract
Exploring microbial stress responses to drugs is crucial for the advancement of new therapeutic methods. While current artificial intelligence methodologies have expedited our understanding of potential microbial responses to drugs, the models are constrained by the imprecise representation of microbes and drugs. To this end, we combine deep autoencoder and subgraph augmentation technology for the first time to propose a model called JDASA-MRD, which can identify the potential indistinguishable responses of microbes to drugs. In the JDASA-MRD model, we begin by feeding the established similarity matrices of microbes and drugs into the deep autoencoder, which enables the extraction of robust initial features of both microbes and drugs. Subsequently, we employ the MinHash and HyperLogLog algorithms to account for intersection and cardinality data between microbe and drug subgraphs, thus deeply extracting the multi-hop neighborhood information of nodes. Finally, by integrating the initial node features with subgraph topological information, we leverage graph neural network technology to predict the microbes' responses to drugs, offering a more effective solution to the 'over-smoothing' challenge. Comparative analyses on multiple public datasets confirm that the JDASA-MRD model's performance surpasses that of current state-of-the-art models. This research aims to offer a more profound insight into the adaptability of microbes to drugs and to furnish pivotal guidance for drug treatment strategies. Our data and code are publicly available at: https://github.com/ZZCrazy00/JDASA-MRD.
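Illustrative note (not taken from the JDASA-MRD repository): the abstract above mentions MinHash for estimating how strongly two subgraph neighborhoods overlap. The sketch below shows the general technique only. The helper names, the node identifiers, the choice of 128 hash functions, and the use of exact set sizes in place of HyperLogLog cardinality estimates are all assumptions made for illustration.

```python
import hashlib


def _hash(value: str, seed: int) -> int:
    """Deterministic 64-bit hash of a string; the seed selects the hash function."""
    digest = hashlib.blake2b(
        value.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash value over the set."""
    return [min(_hash(item, seed) for item in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching slots approximates |A ∩ B| / |A ∪ B|."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def estimate_intersection(sig_a, sig_b, size_a, size_b):
    """Recover |A ∩ B| from the Jaccard estimate J via J * (|A| + |B|) / (1 + J)."""
    j = estimate_jaccard(sig_a, sig_b)
    return j * (size_a + size_b) / (1.0 + j)


# Hypothetical multi-hop neighborhoods of one microbe node and one drug node.
microbe_hood = {"m1", "d1", "d2", "d3", "m4"}
drug_hood = {"d9", "d2", "d3", "m4", "m7"}

sig_m = minhash_signature(microbe_hood)
sig_d = minhash_signature(drug_hood)
print("estimated Jaccard:", estimate_jaccard(sig_m, sig_d))
print("estimated overlap:", estimate_intersection(sig_m, sig_d, len(microbe_hood), len(drug_hood)))
```

The true Jaccard similarity of the two example sets is 3/7; the estimate fluctuates around that value and tightens as `num_hashes` grows. Signatures have a fixed length, so comparing two neighborhoods costs the same regardless of how large the neighborhoods are.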
Affiliation(s)
- Zhecheng Zhou
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325000, Wenzhou, China
- Linlin Zhuo
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325000, Wenzhou, China
- Xiangzheng Fu
- College of Computer Science and Electronic Engineering, Hunan University, 410012, Changsha, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 611730, Chengdu, China

16
Fan L, Wang L, Zhu X. A novel microbe-drug association prediction model based on stacked autoencoder with multi-head attention mechanism. Sci Rep 2023; 13:7396. [PMID: 37149692 PMCID: PMC10164153 DOI: 10.1038/s41598-023-34438-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/03/2023] [Accepted: 04/29/2023] [Indexed: 05/08/2023]
Abstract
Microbes are intimately tied to the occurrence of various diseases that cause serious hazards to human health, and play an essential role in drug discovery, clinical application, and drug quality control. In this manuscript, we put forward a novel prediction model named MDASAE based on a stacked autoencoder (SAE) with multi-head attention mechanism to infer potential microbe-drug associations. In MDASAE, we first constructed three kinds of microbe-related and drug-related similarity matrices based on known microbe-disease-drug associations. Then, we fed two kinds of microbe-related and drug-related similarity matrices into the SAE to learn node attribute features, and introduced a multi-head attention mechanism into the output layer of the SAE to enhance feature extraction. Thereafter, we adopted the remaining microbe and drug similarity matrices to derive inter-node features by using the Restart Random Walk algorithm. After that, the node attribute features and inter-node features of microbes and drugs were fused to predict scores of possible associations between microbes and drugs. Finally, intensive comparison experiments and case studies based on different well-known public databases under 5-fold and 10-fold cross-validation proved that MDASAE can effectively predict potential microbe-drug associations.
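Illustrative note (not the MDASAE implementation): the abstract above derives inter-node features from similarity matrices with a random walk with restart. The sketch below shows the standard iteration p ← (1 − r)·W·p + r·e on a small matrix. The similarity values, the restart probability of 0.3, and the function name are assumptions chosen only to make the example concrete.

```python
def random_walk_with_restart(similarity, seed_index, restart_prob=0.3,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W @ p + r * e until the L1 change drops below tol.

    W is the column-normalised similarity matrix and e is the one-hot
    restart vector for the seed node.
    """
    n = len(similarity)
    # Column-normalise so each column is a probability distribution over next steps.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [similarity[i][j] / col_sums[j] if col_sums[j] > 0 else 0.0 for j in range(n)]
        for i in range(n)
    ]
    restart = [1.0 if i == seed_index else 0.0 for i in range(n)]
    p = restart[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart_prob) * sum(transition[i][j] * p[j] for j in range(n))
            + restart_prob * restart[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical 4 x 4 microbe similarity matrix (symmetric, values in [0, 1]).
similarity = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.0, 0.1, 0.7, 1.0],
]

# The steady-state vector for seed node 0 can serve as that node's feature row.
features_node0 = random_walk_with_restart(similarity, seed_index=0)
print(features_node0)
```

Running the walk once per seed node and stacking the resulting vectors yields one feature row per node. A larger restart probability keeps the mass closer to the seed, while a smaller one lets it spread further through the network.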
Affiliation(s)
- Liu Fan
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421010, China
- Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, 410022, China
- Lei Wang
- Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, 410022, China.
- Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China.
- Xianyou Zhu
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421010, China.

17
Qu Q, Chen X, Ning B, Zhang X, Nie H, Zeng L, Chen H, Fu X. Prediction of miRNA-disease associations by neural network-based deep matrix factorization. Methods 2023; 212:1-9. [PMID: 36813017 DOI: 10.1016/j.ymeth.2023.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/06/2022] [Revised: 01/17/2023] [Accepted: 02/10/2023] [Indexed: 02/23/2023]
Abstract
MicroRNAs (miRNAs) are a class of short non-coding RNAs with a length of about 22 nucleotides, which participate in various biological processes of cells. A number of studies have shown that miRNAs are closely related to the occurrence of cancer and various human diseases. Therefore, studying miRNA-disease associations is helpful to understand the pathogenesis of diseases as well as the prevention, diagnosis, treatment and prognosis of diseases. Traditional biological experimental methods for studying miRNA-disease associations have disadvantages: they require expensive equipment and are time-consuming and labor-intensive. With the rapid development of bioinformatics, more and more researchers are committed to developing effective computational methods to predict miRNA-disease associations in order to reduce the time and money cost of experiments. In this study, we proposed a neural network-based deep matrix factorization method named NNDMF to predict miRNA-disease associations. To address the problem that traditional matrix factorization methods can only extract linear features, NNDMF used a neural network to perform deep matrix factorization to extract nonlinear features, which makes up for the shortcomings of traditional matrix factorization methods. We compared NNDMF with four previous classical prediction models (IMCMDA, GRMDA, SACMDA and ICFMDA) in global LOOCV and local LOOCV, respectively. The AUCs achieved by NNDMF in the two cross-validation methods were 0.9340 and 0.8763, respectively. Furthermore, we conducted case studies on three important human diseases (lymphoma, colorectal cancer and lung cancer) to validate the effectiveness of NNDMF. In conclusion, NNDMF could effectively predict the potential miRNA-disease associations.
Affiliation(s)
- Qiang Qu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xia Chen
- School of Basic Education, Changsha Aeronautical Vocational and Technical College, Changsha, China
- Bin Ning
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiang Zhang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Hao Nie
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Li Zeng
- College of Life and Environmental Science, Hunan University of Art and Science, Changde, China
- Haowen Chen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
- Xiangzheng Fu
- Research Institute of Hunan University in Chongqing, Chongqing, China.

18
Li H, Hou ZJ, Zhang WG, Qu J, Yao HB, Chen Y. Prediction of potential drug-microbe associations based on matrix factorization and a three-layer heterogeneous network. Comput Biol Chem 2023; 104:107857. [PMID: 37018909 DOI: 10.1016/j.compbiolchem.2023.107857] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/20/2022] [Revised: 02/27/2023] [Accepted: 03/28/2023] [Indexed: 04/03/2023]
Abstract
Microbes in the human body are closely linked to many complex human diseases and are emerging as new drug targets. These microbes play a crucial role in drug development and disease treatment. Traditional methods of biological experiments are not only time-consuming but also costly. Using computational methods to predict microbe-drug associations can effectively complement biological experiments. In this study, we constructed heterogeneous networks for drugs, microbes, and diseases using multiple biomedical data sources. Then, we developed a model with matrix factorization and a three-layer heterogeneous network (MFTLHNMDA) to predict potential drug-microbe associations. The probability of microbe-drug association was obtained by a global network-based update algorithm. Finally, the performance of MFTLHNMDA was evaluated in the framework of leave-one-out cross-validation (LOOCV) and 5-fold cross-validation (5-fold CV). The results showed that our model performed better than six state-of-the-art methods, with AUCs of 0.9396 and 0.9385 ± 0.0000, respectively. A case study further confirms the effectiveness of MFTLHNMDA in identifying potential drug-microbe associations and new drug-microbe associations.

19
Tian Z, Yu Y, Fang H, Xie W, Guo M. Predicting microbe-drug associations with structure-enhanced contrastive learning and self-paced negative sampling strategy. Brief Bioinform 2023; 24:7009077. [PMID: 36715986 DOI: 10.1093/bib/bbac634] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/08/2022] [Revised: 12/19/2022] [Accepted: 12/29/2022] [Indexed: 01/31/2023]
Abstract
MOTIVATION Predicting the associations between human microbes and drugs (MDAs) is one critical step in drug development and precision medicine. Since discovering these associations through wet experiments is time-consuming and labor-intensive, computational methods have become an effective way to tackle this problem. Recently, graph contrastive learning (GCL) approaches have shown great advantages in learning the embeddings of nodes from heterogeneous biological graphs (HBGs). However, most GCL-based approaches don't fully capture the rich structure information in HBGs. Besides, few MDA prediction methods can screen out the most informative negative samples for effectively training the classifier. Therefore, the accuracy of MDA predictions still needs to be improved. RESULTS In this study, we propose a novel approach that employs the Structure-enhanced Contrastive learning and Self-paced negative sampling strategy for Microbe-Drug Association predictions (SCSMDA). Firstly, SCSMDA constructs the similarity networks of microbes and drugs, as well as their different meta-path-induced networks. Then SCSMDA employs the representations of microbes and drugs learned from meta-path-induced networks to enhance their embeddings learned from the similarity networks by the contrastive learning strategy. After that, we adopt the self-paced negative sampling strategy to select the most informative negative samples to train the MLP classifier. Lastly, SCSMDA predicts the potential microbe-drug associations with the trained MLP classifier. The embeddings of microbes and drugs learned from the similarity networks are enhanced with the contrastive learning strategy, which yields discriminative representations. Extensive results on three public datasets indicate that SCSMDA significantly outperforms other baseline methods on the MDA prediction task. Case studies for two common drugs further demonstrate the effectiveness of SCSMDA in finding novel MDAs. AVAILABILITY The source code is publicly available on GitHub https://github.com/Yue-Yuu/SCSMDA-master.
Affiliation(s)
- Zhen Tian
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Yue Yu
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Haichuan Fang
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Weixin Xie
- Institute of Intelligent System and Bioinformatics, College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, 150000, China
- Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, 100044, Beijing, China

20
Shokri Garjan H, Omidi Y, Poursheikhali Asghari M, Ferdousi R. In-silico computational approaches to study microbiota impacts on diseases and pharmacotherapy. Gut Pathog 2023; 15:10. [PMID: 36882861 PMCID: PMC9990230 DOI: 10.1186/s13099-023-00535-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Accepted: 02/21/2023] [Indexed: 03/09/2023]
Abstract
Microorganisms have been linked to a variety of critical human diseases, thanks to advances in sequencing technology and microbiology. The growing recognition of human microbe-disease relationships provides crucial insights into the underlying disease process from the perspective of pathogens, which is extremely useful for pathogenesis research, early diagnosis, and precision medicine and therapy. Microbe-based analysis in terms of diseases and related drug discovery can predict new connections/mechanisms and provide new concepts. These phenomena have been studied via various in-silico computational approaches. This review aims to elaborate on the computational works conducted on the microbe-disease and microbe-drug topics, discuss the computational model approaches used for predicting associations and provide comprehensive information on the related databases. Finally, we discuss potential prospects and obstacles in this field of study, while also outlining some recommendations for further enhancing predictive capabilities.
Affiliation(s)
- Hassan Shokri Garjan
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
- Yadollah Omidi
- Department of Pharmaceutical Sciences, Nova Southeastern University, College of Pharmacy, Fort Lauderdale, FL, USA
- Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran.

21
Zhu Y, Zhang F, Zhang S, Yi M. Predicting latent lncRNA and cancer metastatic event associations via variational graph auto-encoder. Methods 2023; 211:1-9. [PMID: 36709790 DOI: 10.1016/j.ymeth.2023.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 12/05/2022] [Accepted: 01/20/2023] [Indexed: 01/27/2023]
Abstract
Long non-coding RNAs (lncRNAs) have been shown to be closely associated with cancer metastatic events (CMEs, e.g., cancer cell invasion, intravasation, extravasation, proliferation) that collaboratively accelerate malignant cancer spread and cause a high mortality rate in patients. Clinical trials may accurately uncover the relationships between lncRNAs and CMEs; however, they are time-consuming and expensive. With the accumulation of data, there is an urgent need to find efficient ways to identify these relationships. Herein, a graph embedding representation-based predictor (VGEA-LCME) for exploring latent lncRNA-CME associations is introduced. In VGEA-LCME, a heterogeneous combined network is constructed by integrating similarity and linkage matrices that can maintain internal and external characteristics of networks, and a variational graph auto-encoder serves as a feature generator to represent an arbitrary lncRNA-CME pair. The final robust prediction is obtained by an ensemble classifier strategy via cross-validation. Experimental comparisons and literature verification show the remarkable performance of VGEA-LCME, although the similarities between CMEs are challenging to calculate. In addition, VGEA-LCME can further identify organ-specific CMEs. To the best of our knowledge, this is the first computational attempt to discover the potential relationships between lncRNAs and CMEs. It may provide support and new insight for guiding experimental research of metastatic cancers. The source code and data are available at https://github.com/zhuyuan-cug/VGAE-LCME.
Affiliation(s)
- Yuan Zhu
- School of Automation, China University of Geosciences, 388 Lumo Road, Hongshan District, 430074, Wuhan, Hubei, China; Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, 388 Lumo Road, Hongshan District, 430074, Wuhan, Hubei, China; Engineering Research Center of Intelligent Technology for Geo-Exploration, 388 Lumo Road, Hongshan District, 430074, Wuhan, Hubei, China
- Feng Zhang
- School of Mathematics and Physics, China University of Geosciences, 388 Lumo Road, Hongshan District, 430074, Wuhan, Hubei, China
- Shihua Zhang
- College of Life Science and Health, Wuhan University of Science and Technology, 974 Heping Avenue, Qingshan District, 430081, Wuhan, Hubei, China.
- Ming Yi
- School of Mathematics and Physics, China University of Geosciences, 388 Lumo Road, Hongshan District, 430074, Wuhan, Hubei, China.

22
Liu J, Kuang Z, Deng L. GCNPCA: miRNA-Disease Associations Prediction Algorithm Based on Graph Convolutional Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1041-1052. [PMID: 36049014 DOI: 10.1109/tcbb.2022.3203564] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 05/04/2023]
Abstract
A growing number of studies have confirmed the important role of microRNAs (miRNAs) in human diseases and that the aberrant expression of miRNAs affects the onset and progression of human diseases. The discovery of disease-associated miRNAs as new biomarkers promotes progress in disease pathology and clinical medicine. However, only a small proportion of miRNA-disease correlations have been validated by biological experiments, and identifying miRNA-disease associations through biological experiments is both expensive and inefficient. Therefore, it is important to develop efficient and highly accurate computational methods to predict miRNA-disease associations. A miRNA-disease associations prediction algorithm based on Graph Convolutional neural Networks and Principal Component Analysis (GCNPCA) is proposed in this paper. Specifically, the deep topological structure information is extracted from the heterogeneous network composed of miRNA and disease nodes by a Graph Convolutional neural Network (GCN) with an additional attention mechanism. The internal attribute information of the nodes is obtained by Principal Component Analysis (PCA). Then, the topological structure information and the node attribute information are combined to construct comprehensive feature descriptors. Finally, a Random Forest (RF) is used to train and classify these feature descriptors. In the five-fold cross-validation experiment, the AUC and AUPR for the GCNPCA algorithm are 0.983 and 0.988 respectively.
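Illustrative note: several abstracts in this list report AUC and AUPR values. The sketch below shows one common way to compute both from predicted association scores. It is not taken from any of the cited papers; the labels and scores are made-up values, and average precision is used here as one standard estimator of the area under the precision-recall curve (it assumes no tied scores).

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical held-out pairs: 1 = known association, 0 = unobserved pair.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = roc_auc(labels, scores)           # 7 of 9 positive-negative pairs ranked correctly
aupr = average_precision(labels, scores)
print(f"AUC = {auc:.4f}, AUPR (average precision) = {aupr:.4f}")
```

In a k-fold setting, these two functions would be applied to the held-out fold of each split and the per-fold values averaged.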

23
GACNNMDA: a computational model for predicting potential human microbe-drug associations based on graph attention network and CNN-based classifier. BMC Bioinformatics 2023; 24:35. [PMID: 36732704 PMCID: PMC9893988 DOI: 10.1186/s12859-023-05158-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/18/2022] [Accepted: 01/24/2023] [Indexed: 02/04/2023]
Abstract
As new drug targets, human microbes are proven to be closely related to human health. Effective computational methods for inferring potential microbe-drug associations can provide a useful complement to conventional experimental methods and will facilitate drug research and development. However, it is still challenging to predict potential interactions for new microbes or new drugs, since the number of known microbe-drug associations is very limited at present. In this manuscript, we first constructed two heterogeneous microbe-drug networks based on multiple measures of similarity of microbes and drugs, and known microbe-drug associations or known microbe-disease-drug associations, respectively. Then, we established two feature matrices for microbes and drugs by concatenating various attributes of microbes and drugs. Thereafter, taking these two feature matrices and two heterogeneous microbe-drug networks as inputs of a two-layer graph attention network, we obtained low dimensional feature representations for microbes and drugs separately. Finally, by integrating the low dimensional feature representations with the two feature matrices to form the inputs of a convolutional neural network, a novel computational model named GACNNMDA was designed to predict possible scores of microbe-drug pairs. Experimental results show that the predictive performance of GACNNMDA is superior to existing advanced methods. Furthermore, case studies on well-known microbes and drugs demonstrate the effectiveness of GACNNMDA as well. Source codes and supplementary materials are available at: https://github.com/tyqGitHub/TYQ/tree/master/GACNNMDA.

24
Su W, Deng S, Gu Z, Yang K, Ding H, Chen H, Zhang Z. Prediction of apoptosis protein subcellular location based on amphiphilic pseudo amino acid composition. Front Genet 2023; 14:1157021. [PMID: 36926588 PMCID: PMC10011625 DOI: 10.3389/fgene.2023.1157021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 02/20/2023] [Indexed: 03/08/2023]
Abstract
Introduction: Apoptosis proteins play an important role in the process of cell apoptosis, which makes the rate of cell proliferation and death reach a relative balance. Because the function of an apoptosis protein is closely related to its subcellular location, it is of great significance to study the subcellular locations of apoptosis proteins. Many efforts in bioinformatics research have been aimed at predicting their subcellular location. However, the subcellular localization of apoptotic proteins needs to be carefully studied. Methods: In this paper, based on amphiphilic pseudo amino acid composition and the support vector machine algorithm, a new method was proposed for the prediction of apoptosis proteins' subcellular location. Results and Discussion: The method achieved good performance on three data sets. The Jackknife test accuracy of the three data sets reached 90.5%, 93.9% and 84.0%, respectively. Compared with previous methods, the prediction accuracies of APACC_SVM were improved.
Affiliation(s)
- Wenxia Su
- College of Science, Inner Mongolia Agriculture University, Hohhot, China
- Shuyi Deng
- School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zhifeng Gu
- School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
- Keli Yang
- Nonlinear Research Institute, Baoji University of Arts and Sciences, Baoji, China
- Hui Ding
- School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hui Chen
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China
- Zhaoyue Zhang
- School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China.
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China

25
Tan Y, Zou J, Kuang L, Wang X, Zeng B, Zhang Z, Wang L. GSAMDA: a computational model for predicting potential microbe-drug associations based on graph attention network and sparse autoencoder. BMC Bioinformatics 2022; 23:492. [PMID: 36401174 PMCID: PMC9673879 DOI: 10.1186/s12859-022-05053-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/12/2022] [Accepted: 11/14/2022] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Clinical studies show that microorganisms are closely related to human health, and the discovery of potential associations between microbes and drugs will facilitate drug research and development. However, few computational methods for predicting microbe-drug associations have been proposed so far. RESULTS: In this work, we proposed a novel computational model named GSAMDA, based on the graph attention network and sparse autoencoder, to infer latent microbe-drug associations. In GSAMDA, we first built a heterogeneous network by integrating known microbe-drug associations, microbe similarities and drug similarities. We then adopted a GAT-based autoencoder and a sparse autoencoder module to learn topological representations and attribute representations, respectively, for nodes in the newly constructed heterogeneous network. Finally, based on these two kinds of node representations, we constructed two kinds of feature matrices for microbes and drugs separately, and used them to calculate possible association scores for microbe-drug pairs. CONCLUSION: A novel computational model is proposed for predicting potential microbe-drug associations based on graph attention network and sparse autoencoder. Experimental results showed that our model achieves better performance than five other state-of-the-art competing methods. Moreover, case studies on two categories of representative drugs and microbes further demonstrated the effectiveness of our model.
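The heterogeneous network construction described in this abstract can be pictured as a block adjacency matrix that places microbe similarities and drug similarities on the diagonal and the known associations off the diagonal. The sketch below assumes that common layout; the abstract does not specify it, and the function name, argument names, and toy matrices are hypothetical.

```python
def build_heterogeneous_adjacency(microbe_sim, drug_sim, assoc):
    """Assemble the block matrix [[Sm, A], [A^T, Sd]] as a list of rows.

    microbe_sim: n_m x n_m microbe similarity matrix (Sm)
    drug_sim:    n_d x n_d drug similarity matrix (Sd)
    assoc:       n_m x n_d known association matrix (A), 1 = known association
    Microbes are indexed first, then drugs (an arbitrary ordering choice).
    """
    n_m, n_d = len(microbe_sim), len(drug_sim)
    if len(assoc) != n_m or any(len(row) != n_d for row in assoc):
        raise ValueError("assoc must have shape n_microbes x n_drugs")
    # Top block row: [Sm | A]
    top = [list(microbe_sim[i]) + list(assoc[i]) for i in range(n_m)]
    # Bottom block row: [A^T | Sd]
    bottom = [
        [assoc[i][j] for i in range(n_m)] + list(drug_sim[j])
        for j in range(n_d)
    ]
    return top + bottom


# Hypothetical toy data: 2 microbes, 3 drugs.
toy_microbe_sim = [[1.0, 0.5], [0.5, 1.0]]
toy_drug_sim = [[1.0, 0.2, 0.0], [0.2, 1.0, 0.3], [0.0, 0.3, 1.0]]
toy_assoc = [[1, 0, 1], [0, 1, 0]]
network = build_heterogeneous_adjacency(toy_microbe_sim, toy_drug_sim, toy_assoc)
```

Keeping both node types in one matrix lets a single graph encoder propagate information across microbe-microbe, drug-drug, and microbe-drug edges at once.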
Affiliation(s)
- Yaqin Tan
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, China
  - Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, 410022, China
- Juan Zou
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, China
- Linai Kuang
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, China
- Xiangyi Wang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Bin Zeng
  - Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, 410022, China
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Zhen Zhang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Lei Wang
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, China
  - Institute of Bioinformatics Complex Network Big Data, Changsha University, Changsha, 410022, China
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
|
26
|
He J, Xiao P, Chen C, Zhu Z, Zhang J, Deng L. GCNCMI: A Graph Convolutional Neural Network Approach for Predicting circRNA-miRNA Interactions. Front Genet 2022; 13:959701. [PMID: 35991563 PMCID: PMC9389118 DOI: 10.3389/fgene.2022.959701] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/02/2022] [Accepted: 06/23/2022] [Indexed: 11/18/2022] Open
Abstract
The interactions between circular RNAs (circRNAs) and microRNAs (miRNAs) have been shown to alter gene expression and regulate disease-related genes. Since traditional experimental methods are time-consuming and labor-intensive, circRNA-miRNA interactions remain largely unknown. Developing computational approaches to explore the interactions between circRNAs and miRNAs on a large scale can help bridge this gap. In this paper, we proposed a graph convolutional neural network-based approach named GCNCMI to predict the potential interactions between circRNAs and miRNAs. GCNCMI first mines the potential interactions of adjacent nodes in the graph convolutional neural network and then recursively propagates interaction information on the graph convolutional layers. Finally, it unites the embedded representations generated by each layer to make the final prediction. In five-fold cross-validation, GCNCMI achieved the highest AUC of 0.9312 and the highest AUPR of 0.9412. In addition, case studies of two miRNAs, hsa-miR-622 and hsa-miR-149-5p, showed that our model performs well in predicting circRNA-miRNA interactions. The code and data are available at https://github.com/csuhjhjhj/GCNCMI.
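The propagate-then-unite idea in this abstract can be sketched with a weight-free propagation over a symmetrically normalized adjacency matrix, followed by concatenation of the embeddings from every layer. This is a simplification under stated assumptions, not the GCNCMI implementation: learned weights and nonlinearities are left out, "unites" is read as concatenation, and all names and the toy graph are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of a non-negative adjacency matrix."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def propagate(norm_adj, emb):
    """One propagation step: each node aggregates its neighbors' embeddings."""
    dim = len(emb[0])
    return [
        [sum(norm_adj[i][j] * emb[j][d] for j in range(len(emb))) for d in range(dim)]
        for i in range(len(emb))
    ]


def layerwise_embeddings(adj, emb0, num_layers=2):
    """Propagate num_layers times and concatenate the embeddings of all layers."""
    norm_adj = normalize_adjacency(adj)
    layers = [emb0]
    for _ in range(num_layers):
        layers.append(propagate(norm_adj, layers[-1]))
    # Concatenate layer 0 .. num_layers for every node
    return [sum((layer[i] for layer in layers), []) for i in range(len(emb0))]


# Hypothetical toy bipartite graph: nodes 0-1 are circRNAs, nodes 2-3 are miRNAs.
toy_adj = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
toy_emb0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
final_embeddings = layerwise_embeddings(toy_adj, toy_emb0, num_layers=2)
```

Concatenating the layers keeps both the initial features and the information gathered from one-hop and two-hop neighbors, which is one plausible reason for combining per-layer outputs before scoring a pair.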
Affiliation(s)
- Jie He
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Pei Xiao
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Chunyu Chen
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Zeqin Zhu
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Jiaxuan Zhang
  - Department of Electrical Engineering, University of California, San Diego, San Diego, CA, United States
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - *Correspondence: Lei Deng
|
27
|
Zheng J, Qian Y, He J, Kang Z, Deng L. Graph Neural Network with Self-Supervised Learning for Noncoding RNA-Drug Resistance Association Prediction. J Chem Inf Model 2022; 62:3676-3684. [PMID: 35838124 DOI: 10.1021/acs.jcim.2c00367] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
Abstract
Noncoding RNA (ncRNA) is closely related to drug resistance. Identifying the association between ncRNA and drug resistance is of great significance for drug development. Methods based on biological experiments are often time-consuming and small-scale. Therefore, developing computational methods to identify the association between ncRNA and drug resistance is urgent. In this work, we develop a computational framework called GSLRDA to predict the association between ncRNA and drug resistance. First, the known ncRNA-drug resistance associations are modeled as a bipartite graph of ncRNA and drug. Then, GSLRDA uses the light graph convolutional network (lightGCN) to learn the vector representations of ncRNA and drug from the ncRNA-drug bipartite graph. In addition, GSLRDA uses different data augmentation methods to generate different views for ncRNA and drug nodes and performs self-supervised learning, further improving the quality of the learned ncRNA and drug vector representations through contrastive learning between nodes. Finally, GSLRDA uses the inner product to predict the association between ncRNA and drug resistance. To the best of our knowledge, GSLRDA is the first to apply self-supervised learning to association prediction tasks in the field of bioinformatics. The experimental results show that GSLRDA achieves an AUC value of 0.9101, higher than the other eight state-of-the-art models. In addition, case studies including two drugs further illustrate the effectiveness of GSLRDA in predicting the association between ncRNA and drug resistance. The code and data sets of GSLRDA are available at https://github.com/JJZ-code/GSLRDA.
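Two pieces of this abstract lend themselves to a short illustration: contrastive learning between two augmented views of the same nodes, and inner-product scoring of an ncRNA-drug pair. The sketch below uses an InfoNCE-style loss with cosine similarity, which is a common choice for node-level contrastive learning; the abstract does not name the exact loss, so the loss form, the `temperature` value, and all function names are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.2):
    """Mean InfoNCE loss: node i in view_a should match node i in view_b.

    view_a, view_b: lists of embeddings for the same nodes under two augmentations.
    All other nodes in view_b act as negatives for node i.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Stable log-sum-exp over positive and negative logits
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


def association_score(ncrna_vec, drug_vec):
    """Inner product of an ncRNA embedding and a drug embedding."""
    return sum(a * b for a, b in zip(ncrna_vec, drug_vec))


# Hypothetical embeddings of two nodes under two augmented views.
toy_view_a = [[1.0, 0.1], [0.1, 1.0]]
toy_view_b = [[0.9, 0.2], [0.2, 0.9]]
contrastive_loss = info_nce(toy_view_a, toy_view_b)
```

The loss is small when each node's two views agree with each other more than with other nodes, so minimizing it alongside the main prediction objective pushes the embeddings to be robust to the augmentation.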
Affiliation(s)
- Jingjing Zheng
  - School of Software, Xinjiang University, Urumqi 830091, China
- Yurong Qian
  - School of Software, Xinjiang University, Urumqi 830091, China
- Jie He
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zerui Kang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Lei Deng
  - School of Software, Xinjiang University, Urumqi 830091, China
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
28
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has been accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in predicting potential correlations between microbes, drugs and diseases, from biological data to computational models. First, we introduced in detail the widely used datasets relevant to identifying potential relationships between microbes, drugs and diseases. We then divided a series of representative computational models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analysed possible challenges and opportunities in this research area, and outlined some suggestions for further improving predictive performance.
Affiliation(s)
- Lei Wang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Yaqin Tan
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Xiaoyu Yang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Linai Kuang
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Pengyao Ping
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
|
29
|
Wan H, Zhang J, Ding Y, Wang H, Tian G. Immunoglobulin Classification Based on FC* and GC* Features. Front Genet 2022; 12:827161. [PMID: 35140745 PMCID: PMC8819591 DOI: 10.3389/fgene.2021.827161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Accepted: 12/22/2021] [Indexed: 11/13/2022] Open
Abstract
Immunoglobulins have a pivotal role in disease regulation. Therefore, accurately identifying immunoglobulins is vital for developing new drugs and researching related diseases. Rather than using high-dimensional features to identify immunoglobulins, this research examined a method to classify immunoglobulins and non-immunoglobulins using two features, FC* and GC*. Classification of 228 samples (109 immunoglobulin samples and 119 non-immunoglobulin samples) yielded an overall accuracy of 80.7% in 10-fold cross-validation using the J48 classifier implemented in Weka software. The FC* feature identified in this study was found in the immunoglobulin subtype domain, which demonstrated that this extracted feature could represent functional and structural properties of immunoglobulins for forecasting.
Affiliation(s)
- Hao Wan
  - Institute of Advanced Cross-field Science, College of Life Science, Qingdao University, Qingdao, China
- Jina Zhang
  - Geneis (Beijing) Co., Ltd., Beijing, China
- Yijie Ding
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Hetian Wang
  - Beidahuang Industry Group General Hospital, Harbin, China
  - *Correspondence: Hetian Wang; Geng Tian
- Geng Tian
  - Geneis (Beijing) Co., Ltd., Beijing, China
  - *Correspondence: Hetian Wang; Geng Tian
|