1. Lu X, Xie L, Xu L, Mao R, Xu X, Chang S. Multimodal fused deep learning for drug property prediction: Integrating chemical language and molecular graph. Comput Struct Biotechnol J 2024; 23:1666-1679. [PMID: 38680871 PMCID: PMC11046066 DOI: 10.1016/j.csbj.2024.04.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 04/01/2024] [Accepted: 04/10/2024] [Indexed: 05/01/2024] Open
Abstract
Accurately predicting molecular properties is a challenging but essential task in drug discovery. Recently, many mono-modal deep learning methods have been successfully applied to molecular property prediction. However, mono-modal learning is inherently limited as it relies solely on a single modality of molecular representation, which restricts a comprehensive understanding of drug molecules. To overcome the limitations, we propose a multimodal fused deep learning (MMFDL) model to leverage information from different molecular representations. Specifically, we construct a triple-modal learning model by employing Transformer-Encoder, Bidirectional Gated Recurrent Unit (BiGRU), and graph convolutional network (GCN) to process three modalities of information from chemical language and molecular graph: SMILES-encoded vectors, ECFP fingerprints, and molecular graphs, respectively. We evaluate the proposed triple-modal model using five fusion approaches on six molecule datasets, including Delaney, Llinas2020, Lipophilicity, SAMPL, BACE, and pKa from DataWarrior. The results show that the MMFDL model achieves the highest Pearson coefficients, and stable distribution of Pearson coefficients in the random splitting test, outperforming mono-modal models in accuracy and reliability. Furthermore, we validate the generalization ability of our model in the prediction of binding constants for protein-ligand complex molecules, and assess the resilience capability against noise. Through analysis of feature distributions in chemical space and the assigned contribution of each modal model, we demonstrate that the MMFDL model shows the ability to acquire complementary information by using proper models and suitable fusion approaches. By leveraging diverse sources of bioinformatics information, multimodal deep learning models hold the potential for successful drug discovery.
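The abstract reports model quality as Pearson coefficients between predicted and measured property values. As a minimal sketch of that metric (the function name and the toy numbers are illustrative assumptions, not values from the paper), it can be computed as follows:

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("Pearson r is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy numbers, made up for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.9]
r = pearson_r(measured, predicted)
```

A value close to 1 means the predictions rank and scale with the measurements almost linearly; it says nothing about a constant offset, which is why error metrics such as RMSE are usually reported alongside it.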
Affiliation(s)
- Xiaohua Lu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Liangxu Xie
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Rongzhi Mao
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Shan Chang
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China

2. Tang T, Zhang X, Li W, Wang Q, Liu Y, Cao X. Co-training based prediction of multi-label protein-protein interactions. Comput Biol Med 2024; 177:108623. [PMID: 38788374 DOI: 10.1016/j.compbiomed.2024.108623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 05/01/2024] [Accepted: 05/16/2024] [Indexed: 05/26/2024]
Abstract
Prediction of protein-protein interaction (PPI) types enhances the comprehension of the underlying structural characteristics and functions of proteins, which gives rise to a multi-label classification problem. The nominal features describe the physicochemical characteristics of proteins directly, establishing a more robust correlation with the interaction types between proteins than ordered features. Motivated by this, we propose a multi-label PPI prediction model referred to as CoMPPI (Co-training based Multi-Label prediction of Protein-Protein Interaction). This approach aims to maximize the utility of both ordered and nominal features extracted from protein sequences. Specifically, CoMPPI incorporates a graph convolutional network (GCN) and a 1D convolution operation to process the complementary subsets of features individually, leveraging both local and contextualized information more efficiently. In addition, two multi-type PPI datasets were constructed to eliminate the duplication in previous datasets. We compare the performance of CoMPPI with three state-of-the-art methods on three datasets partitioned using distinct schemes (Breadth-first search, Depth-first search, and Random). CoMPPI consistently outperforms the other methods across all cases, with Micro-F1 improvements ranging from 3.81% to 32.40%. The subsequent ablation experiment confirms the efficacy of employing the co-training framework for multi-label PPI prediction, indicating promising avenues for future advancements in this domain.
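The improvements above are quoted in Micro-F1, which pools true positives, false positives, and false negatives over all labels and samples before computing F1. A minimal sketch follows; the interaction-type label names and the toy predictions are hypothetical and are not taken from the paper's datasets.

```python
from typing import Sequence, Set


def micro_f1(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 for multi-label predictions given as label sets."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical interaction-type labels for three protein pairs.
truth = [{"binding", "activation"}, {"inhibition"}, {"catalysis", "binding"}]
pred = [{"binding"}, {"inhibition", "binding"}, {"catalysis", "binding"}]
score = micro_f1(truth, pred)  # tp=4, fp=1, fn=1 -> 8 / 10 = 0.8
```

Because counts are pooled before averaging, frequent interaction types dominate Micro-F1; a macro average would weight every type equally instead.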
Affiliation(s)
- Tao Tang
  - School of Modern Posts, Nanjing University of Posts and Telecommunications, 9 Wenyuan Rd, Nanjing, 210023, Jiangsu, China
- Xiaocai Zhang
  - Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), 1 Fusionopolis Way, Singapore, 138632, Singapore
- Weizhuo Li
  - School of Modern Posts, Nanjing University of Posts and Telecommunications, 9 Wenyuan Rd, Nanjing, 210023, Jiangsu, China
- Qing Wang
  - School of Management, Nanjing University of Posts and Telecommunications, 9 Wenyuan Rd, Nanjing, 210023, Jiangsu, China
- Yuansheng Liu
  - College of Computer Science and Electronic Engineering, Hunan University, 2 Lushan Rd, Changsha, 410086, Hunan, China; Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China.
- Xiaofeng Cao
  - School of Artificial Intelligence, Jilin University, 2699 Qianjin St, Jilin, 130012, Changchun, China

3. Lin J, Hong B, Cai Z, Lu P, Lin K. MASMDDI: multi-layer adaptive soft-mask graph neural network for drug-drug interaction prediction. Front Pharmacol 2024; 15:1369403. [PMID: 38831885 PMCID: PMC11144894 DOI: 10.3389/fphar.2024.1369403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 04/23/2024] [Indexed: 06/05/2024] Open
Abstract
Accurately predicting Drug-Drug Interaction (DDI) is a critical and challenging aspect of the drug discovery process, particularly in preventing adverse reactions in patients undergoing combination therapy. However, current DDI prediction methods often overlook the interaction information between chemical substructures of drugs, focusing solely on the interaction information between drugs and failing to capture sufficient chemical substructure details. To address this limitation, we introduce a novel DDI prediction method: Multi-layer Adaptive Soft Mask Graph Neural Network (MASMDDI). Specifically, we first design a multi-layer adaptive soft mask graph neural network to extract substructures from molecular graphs. Second, we employ an attention mechanism to mine substructure feature information and update latent features. In this process, to optimize the final feature representation, we decompose drug-drug interactions into pairwise interaction correlations between the core substructures of each drug. Third, we use these features to predict the interaction probabilities of DDI tuples and evaluate the model using real-world datasets. Experimental results demonstrate that the proposed model outperforms state-of-the-art methods in DDI prediction. Furthermore, MASMDDI exhibits excellent performance in predicting DDIs of unknown drugs in two tasks that are more aligned with real-world scenarios. In particular, in the transductive scenario using the DrugBank dataset, the ACC, AUROC, and AUPRC scores of MASMDDI are 0.9596, 0.9903, and 0.9894, respectively, which are 2% higher than the best performing baseline.
Affiliation(s)
- Junpeng Lin
  - School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
- Binsheng Hong
  - School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
- Zhongqi Cai
  - School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
- Ping Lu
  - School of Economics and Management, Xiamen University of Technology, Xiamen, China
- Kaibiao Lin
  - School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China

4. Chen W, Zhang Y, Wu W, Yang H, Huang W. Machine learning-based predictive model for abdominal diseases using physical examination datasets. Comput Biol Med 2024; 173:108249. [PMID: 38531251 DOI: 10.1016/j.compbiomed.2024.108249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 02/21/2024] [Accepted: 03/06/2024] [Indexed: 03/28/2024]
Abstract
Abdominal ultrasound is a key non-invasive imaging method for diagnosing liver, kidney, and gallbladder diseases. Despite its clinical significance, not all individuals can undergo abdominal ultrasonography during routine health check-ups due to limitations in equipment, cost, and time. This study aims to use basic physical examination data to predict the risk of diseases of the liver, kidney, and gallbladder that can be diagnosed via abdominal ultrasound. Basic physical examination data contain gender, age, height, weight, BMI, pulse, systolic blood pressure (SBP), diastolic blood pressure (DBP), high-density lipoprotein (HDL), low-density lipoprotein (LDL), total cholesterol, triglycerides, fasting blood glucose (FBG), and uric acid. Using these data, we established seven single-label predictive models and one multi-label predictive model. These models were specifically designed to predict a range of abdominal diseases. The single-label models, utilizing the XGBoost algorithm, targeted diseases such as fatty liver (with an Area Under the Curve (AUC) of 0.9344), liver deposits (AUC: 0.8221), liver cysts (AUC: 0.7928), gallbladder polyps (AUC: 0.7508), kidney stones (AUC: 0.7853), kidney cysts (AUC: 0.8241), and kidney crystals (AUC: 0.7536). Furthermore, a comprehensive multi-label model, capable of predicting multiple conditions simultaneously, was established using FCN and achieved an AUC of 0.6344. We conducted interpretability analysis on these models to enhance their understanding and applicability in clinical settings. The insights gained from this analysis are crucial for the development of targeted disease prevention strategies. This study represents a significant advancement in utilizing physical examination data to predict ultrasound results, offering a novel approach to early diagnosis and prevention of abdominal diseases.
Affiliation(s)
- Wei Chen
  - Zhejiang Academy of Traditional Chinese Medicine Culture, Zhejiang Chinese Medical University, Hangzhou, China; Four Provincial Marginal Traditional Chinese Medicine Hospitals (Quzhou Traditional Chinese Medicine Hospital) Affiliated to Zhejiang University of Traditional Chinese Medicine, Quzhou, China
- YuJie Zhang
  - Zhejiang Academy of Traditional Chinese Medicine Culture, Zhejiang Chinese Medical University, Hangzhou, China
- Weili Wu
  - Four Provincial Marginal Traditional Chinese Medicine Hospitals (Quzhou Traditional Chinese Medicine Hospital) Affiliated to Zhejiang University of Traditional Chinese Medicine, Quzhou, China
- Hui Yang
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China.
- Wenxiu Huang
  - Zhejiang Academy of Traditional Chinese Medicine Culture, Zhejiang Chinese Medical University, Hangzhou, China.

5. Zhong KY, Wen ML, Meng FF, Li X, Jiang B, Zeng X, Li Y. MMDTA: A Multimodal Deep Model for Drug-Target Affinity with a Hybrid Fusion Strategy. J Chem Inf Model 2024; 64:2878-2888. [PMID: 37610162 DOI: 10.1021/acs.jcim.3c00866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
The prediction of the drug-target affinity (DTA) plays an important role in evaluating molecular druggability. Although deep learning-based models for DTA prediction have been extensively attempted, there are few reports on multimodal models that leverage various fusion strategies to exploit heterogeneous information from multiple different modalities of drugs and targets. In this study, we proposed a multimodal deep model named MMDTA, which integrated the heterogeneous information from various modalities of drugs and targets using a hybrid fusion strategy to enhance DTA prediction. To achieve this, MMDTA first employed convolutional neural networks (CNNs) and graph convolutional networks (GCNs) to extract diverse heterogeneous information from the sequences and structures of drugs and targets. It then utilized a hybrid fusion strategy to combine and complement the extracted heterogeneous information, resulting in the fused modal information for predicting drug-target affinity through the fully connected (FC) layers. Experimental results demonstrated that MMDTA outperformed the competitive state-of-the-art deep learning models on the widely used benchmark data sets, particularly with a significantly improved key evaluation metric, Root Mean Square Error (RMSE). Furthermore, MMDTA exhibited excellent generalization and practical application performance on multiple different data sets. These findings highlighted MMDTA's accuracy and reliability in predicting the drug-target binding affinity. The source data and code are accessible at http://github.com/dldxzx/MMDTA.
Affiliation(s)
- Kai-Yang Zhong
  - College of Mathematics and Computer Science, Dali University, Dali 671003, China
- Meng-Liang Wen
  - State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Yunnan University, Kunming 650000, China
- Fan-Fang Meng
  - College of Mathematics and Computer Science, Dali University, Dali 671003, China
- Xin Li
  - College of Mathematics and Computer Science, Dali University, Dali 671003, China
- Bei Jiang
  - Yunnan Key Laboratory of Screening and Research on Anti-pathogenic Plant Resources from Western Yunnan, Dali University, Dali 671000, China
- Xin Zeng
  - College of Mathematics and Computer Science, Dali University, Dali 671003, China
- Yi Li
  - College of Mathematics and Computer Science, Dali University, Dali 671003, China

6. Iliadis D, De Baets B, Pahikkala T, Waegeman W. A comparison of embedding aggregation strategies in drug-target interaction prediction. BMC Bioinformatics 2024; 25:59. [PMID: 38321386 PMCID: PMC10845509 DOI: 10.1186/s12859-024-05684-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024] Open
Abstract
The prediction of interactions between novel drugs and biological targets is a vital step in the early stage of the drug discovery pipeline. Many deep learning approaches have been proposed over the last decade, with a substantial fraction of them sharing the same underlying two-branch architecture. Their distinction is limited to the use of different types of feature representations and branches (multi-layer perceptrons, convolutional neural networks, graph neural networks and transformers). In contrast, the strategy used to combine the outputs (embeddings) of the branches has remained mostly the same. The same general architecture has also been used extensively in the area of recommender systems, where the choice of an aggregation strategy is still an open question. In this work, we investigate the effectiveness of three different embedding aggregation strategies in the area of drug-target interaction (DTI) prediction. We formally define these strategies and prove their universal approximator capabilities. We then present experiments that compare the different strategies on benchmark datasets from the area of DTI prediction, showcasing conditions under which specific strategies could be the obvious choice.
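The abstract does not name the three aggregation strategies it compares. Purely as an illustration of what "combining the branch embeddings" means in a two-branch model, the sketch below contrasts two choices that are common in such architectures: a dot product, and concatenation followed by a small MLP. All function names, weights, and embedding values are hypothetical.

```python
from typing import List, Sequence


def dot_product_score(drug: Sequence[float], target: Sequence[float]) -> float:
    """Aggregate two branch embeddings with an inner product."""
    if len(drug) != len(target):
        raise ValueError("dot-product aggregation needs equal embedding sizes")
    return sum(d * t for d, t in zip(drug, target))


def concat_mlp_score(
    drug: Sequence[float],
    target: Sequence[float],
    w_hidden: List[List[float]],
    b_hidden: List[float],
    w_out: List[float],
    b_out: float,
) -> float:
    """Concatenate the embeddings and pass them through a one-hidden-layer MLP."""
    x = list(drug) + list(target)
    # Hidden layer with ReLU activation.
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Linear output layer producing a single interaction score.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical 2-dimensional embeddings produced by the two branches.
drug_emb = [1.0, 2.0]
target_emb = [0.5, -1.0]

dot_score = dot_product_score(drug_emb, target_emb)
mlp_score = concat_mlp_score(
    drug_emb,
    target_emb,
    w_hidden=[[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0]],
    b_hidden=[0.0, 0.0],
    w_out=[0.5, 2.0],
    b_out=0.1,
)
```

The dot product has no trainable parameters of its own and requires both branches to emit embeddings of the same size, whereas the concatenation variant adds parameters but places no such constraint on the branches.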
Affiliation(s)
- Dimitrios Iliadis
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000, Ghent, Belgium.
- Bernard De Baets
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
- Tapio Pahikkala
  - Department of Computing, University of Turku, 20500, Turku, Finland
- Willem Waegeman
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000, Ghent, Belgium

7. Lin CX, Guan Y, Li HD. Artificial intelligence approaches for molecular representation in drug response prediction. Curr Opin Struct Biol 2024; 84:102747. [PMID: 38091924 DOI: 10.1016/j.sbi.2023.102747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/26/2023] [Accepted: 11/26/2023] [Indexed: 02/09/2024]
Abstract
Drug response prediction is essential for drug development and disease treatment. One key question in predicting drug response is the representation of molecules, which has been greatly advanced by artificial intelligence (AI) techniques in recent years. In this review, we first describe different types of representation methods, pinpointing their key principles and discussing their limitations. Thereafter, we discuss potential ways in which these methods could be further developed. We expect that this review will provide useful guidance for researchers in the community.
Affiliation(s)
- Cui-Xiang Lin
  - School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, Hunan Province, PR China
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
- Hong-Dong Li
  - School of Computer Science and Engineering, Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, PR China.

8. Dong S, Liu Y, Gong Y, Dong X, Zeng X. scCAN: Clustering With Adaptive Neighbor-Based Imputation Method for Single-Cell RNA-Seq Data. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:95-105. [PMID: 38285569 DOI: 10.1109/tcbb.2023.3337231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2024]
Abstract
Single-cell RNA sequencing (scRNA-seq) is widely used to study cellular heterogeneity in different samples. However, due to technical deficiencies, dropout events often result in zero gene expression values in the gene expression matrix. In this paper, we propose a new imputation method called scCAN, based on adaptive neighborhood clustering, to estimate the zero values caused by dropouts. Our method continuously updates cell-cell similarity information by simultaneously learning similarity relationships, clustering structures, and imposing new rank constraints on the Laplacian matrix of the similarity matrix, improving the imputation of dropout zero values. To evaluate the performance of this method, we used four simulated and eight real scRNA-seq datasets for downstream analyses, including cell clustering, recovered gene expression, and reconstructed cell trajectories. Our method improves the performance of the downstream analysis and outperforms other imputation methods.

9. Sun H, Wang J, Wu H, Lin S, Chen J, Wei J, Lv S, Xiong Y, Wei DQ. A Multimodal Deep Learning Framework for Predicting PPI-Modulator Interactions. J Chem Inf Model 2023; 63:7363-7372. [PMID: 38037990 DOI: 10.1021/acs.jcim.3c01527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Abstract
Protein-protein interactions (PPIs) are essential for various biological processes and diseases. However, most existing computational methods for identifying PPI modulators require either target structure or reference modulators, which restricts their applicability to novel PPI targets. To address this challenge, we propose MultiPPIMI, a sequence-based deep learning framework that predicts the interaction between any given PPI target and modulator. MultiPPIMI integrates multimodal representations of PPI targets and modulators and uses a bilinear attention network to capture intermolecular interactions. Experimental results on our curated benchmark data set show that MultiPPIMI achieves an average AUROC of 0.837 in three cold-start scenarios and an AUROC of 0.994 in the random-split scenario. Furthermore, the case study shows that MultiPPIMI can assist molecular docking simulations in screening inhibitors of Keap1/Nrf2 PPI interactions. We believe that the proposed method provides a promising way to screen PPI-targeted modulators.
Affiliation(s)
- Heqi Sun
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Jianmin Wang
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Hongyan Wu
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Shenggeng Lin
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Junwei Chen
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Jinghua Wei
  - Department of Chemistry, University of Toronto, Toronto M5R 0A3, Canada
- Shuai Lv
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
  - Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
  - Peng Cheng National Laboratory, Shenzhen 518055, China
  - Zhongjing Research and Industrialization Institute of Chinese Medicine, Nanyang 473006, China

10. Liyaqat T, Ahmad T, Saxena C. TeM-DTBA: time-efficient drug target binding affinity prediction using multiple modalities with Lasso feature selection. J Comput Aided Mol Des 2023; 37:573-584. [PMID: 37777631 DOI: 10.1007/s10822-023-00533-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 09/07/2023] [Indexed: 10/02/2023]
Abstract
Drug discovery, especially virtual screening and drug repositioning, can be accelerated through deeper understanding and prediction of Drug Target Interactions (DTIs). The advancement of deep learning as well as the time and financial costs associated with conventional wet-lab experiments have made computational methods for DTI prediction more popular. However, the majority of these computational methods handle the DTI problem as a binary classification task, ignoring the quantitative binding affinity that determines the efficacy of drugs against their target proteins. Moreover, the computational space and execution time of the model are often neglected in favor of accuracy. To address these challenges, we introduce a novel method, called Time-efficient Multimodal Drug Target Binding Affinity (TeM-DTBA), which predicts the binding affinity between drugs and targets by fusing different modalities based on compound structures and target sequences. We employ the Lasso feature selection method, which lowers the dimensionality of feature vectors and speeds up training of the proposed model by more than 50%. The results from two benchmark datasets demonstrate that our method outperforms state-of-the-art methods in terms of performance. The mean squared errors of 18.8% and 23.19%, achieved on the KIBA and Davis datasets, respectively, suggest that our method is more accurate in predicting drug-target binding affinity.
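The abstract does not say how the Lasso step is implemented. As a rough sketch of the general idea only, the code below fits an L1-penalised linear model by coordinate descent and keeps the features whose coefficients stay non-zero. The objective scaling, the `alpha` value, and the toy matrix are assumptions chosen for illustration; a real pipeline would more likely rely on an established library.

```python
from typing import List


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value towards zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(
    X: List[List[float]], y: List[float], alpha: float, n_iter: int = 100
) -> List[float]:
    """Minimise (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


# Toy design with three mutually orthogonal feature columns (illustrative only).
X = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
# The response depends strongly on feature 0 and only weakly on feature 1.
y = [2.0 * row[0] + 0.05 * row[1] for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

With this penalty the weak and the irrelevant feature are shrunk exactly to zero, so only feature 0 survives. Dropping such columns before training is what reduces the input dimensionality and, in turn, the training cost.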
Affiliation(s)
- Tanya Liyaqat
  - Department of Computer Engineering, Jamia Millia Islamia, New Delhi, India.
- Tanvir Ahmad
  - Department of Computer Engineering, Jamia Millia Islamia, New Delhi, India
- Chandni Saxena
  - The Chinese University of Hong Kong, Sha Tin, SAR, China

11. Tao W, Liu Y, Lin X, Song B, Zeng X. Prediction of multi-relational drug-gene interaction via Dynamic hyperGraph Contrastive Learning. Brief Bioinform 2023; 24:bbad371. [PMID: 37864294 DOI: 10.1093/bib/bbad371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 09/11/2023] [Accepted: 09/29/2023] [Indexed: 10/22/2023] Open
Abstract
Drug-gene interaction prediction occupies a crucial position in various areas of drug discovery, such as drug repurposing, lead discovery and off-target detection. Previous studies show good performance, but they are limited to binding interactions and ignore other interaction relationships. Graph neural networks have emerged as promising approaches owing to their powerful capability of modeling correlations under drug-gene bipartite graphs. Despite the widespread adoption of graph neural network-based methods, many of them experience performance degradation in situations where high-quality and sufficient training data are unavailable. Unfortunately, in practical drug discovery scenarios, interaction data are often sparse and noisy, which may lead to unsatisfactory results. To address these challenges, we propose a novel Dynamic hyperGraph Contrastive Learning (DGCL) framework that exploits local and global relationships between drugs and genes. Specifically, graph convolutions are adopted to extract explicit local relations among drugs and genes. Meanwhile, the cooperation of dynamic hypergraph structure learning and hypergraph message passing enables the model to aggregate information in a global region. With flexible global-level messages, a self-augmented contrastive learning component is designed to constrain hypergraph structure learning and enhance the discrimination of drug/gene representations. Experiments conducted on three datasets show that DGCL is superior to eight state-of-the-art methods and notably gains a 7.6% performance improvement on the DGIdb dataset. Further analyses verify the robustness of DGCL for alleviating data sparsity and over-smoothing issues.
Affiliation(s)
- Wen Tao
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082 Hunan, China
- Yuansheng Liu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082 Hunan, China
- Xuan Lin
  - School of Computer Science, Xiangtan University, Xiangtan, 411105 Hunan, China
  - Key Laboratory of Intelligent Computing and Information Processing, Ministry of Education (Xiangtan University), Xiangtan, 411105 Hunan, China
- Bosheng Song
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082 Hunan, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082 Hunan, China

12. Han Y, Liu M, Wang Z. Key protein identification by integrating protein complex information and multi-biological features. Math Biosci Eng 2023; 20:18191-18206. [PMID: 38052554 DOI: 10.3934/mbe.2023808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Identifying key proteins based on protein-protein interaction networks has emerged as a prominent area of research in bioinformatics. However, current methods exhibit certain limitations, such as the omission of subcellular localization information and the disregard for the impact of topological structure noise on the reliability of key protein identification. Moreover, the influence of proteins outside a complex but interacting with proteins inside the complex on complex participation tends to be overlooked. Addressing these shortcomings, this paper presents a novel method for key protein identification that integrates protein complex information with multiple biological features. This approach offers a comprehensive evaluation of protein importance by considering subcellular localization centrality, topological centrality weighted by gene ontology (GO) similarity and complex participation centrality. Experimental results, including traditional statistical metrics, jackknife methodology metric and key protein overlap or difference, demonstrate that the proposed method not only achieves higher accuracy in identifying key proteins compared to nine classical methods but also exhibits robustness across diverse protein-protein interaction networks.
Affiliation(s)
- Yongyin Han
  - School of Computer Science and Technology, China University of Mining and Technology, China
  - Xuzhou College of Industrial Technology, China
- Maolin Liu
  - School of Computer Science and Technology, China University of Mining and Technology, China
- Zhixiao Wang
  - School of Computer Science and Technology, China University of Mining and Technology, China

13. Lin X, Dai L, Zhou Y, Yu ZG, Zhang W, Shi JY, Cao DS, Zeng L, Chen H, Song B, Yu PS, Zeng X. Comprehensive evaluation of deep and graph learning on drug-drug interactions prediction. Brief Bioinform 2023:bbad235. [PMID: 37401373 DOI: 10.1093/bib/bbad235] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/27/2023] [Revised: 05/30/2023] [Accepted: 06/05/2023] [Indexed: 07/05/2023] Open
Abstract
Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). DDIs refer to a change in the effect of one drug due to the presence of another drug in the human body, which plays an essential role in drug discovery and clinical research. DDIs prediction through traditional clinical trials and experiments is an expensive and time-consuming process. To correctly apply advanced AI and deep learning, developers and users face various challenges, such as the availability and encoding of data resources and the design of computational methods. This review summarizes chemical structure based, network based, natural language processing based and hybrid methods, providing an updated and accessible guide for the broad research and development community with different domain knowledge. We introduce widely used molecular representations and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDIs prediction.
Affiliation(s)
- Xuan Lin
  - College of Computer Science, Xiangtan University, Xiangtan, China
- Lichang Dai
  - College of Computer Science, Xiangtan University, Xiangtan, China
- Yafang Zhou
  - College of Computer Science, Xiangtan University, Xiangtan, China
- Zu-Guo Yu
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, China
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, China
- Jian-Yu Shi
  - Northwestern Polytechnical University, Xian, China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, China
- Li Zeng
  - AIDD department of Yuyao Biotech, Shanghai, China
- Haowen Chen
  - College of Computer Science and Electronic Engineering, Hunan University, 410013 Changsha, P. R. China
- Bosheng Song
  - College of Information Science and Engineering, Hunan University, Changsha, China
- Philip S Yu
  - University of Illinois at Chicago and also holds the Wexler Chair in Information Technology
- Xiangxiang Zeng
  - College of Information Science and Engineering, Hunan University, Changsha, China