1
Lu Q, Zhou Z, Wang Q. Multi-layer graph attention neural networks for accurate drug-target interaction mapping. Sci Rep 2024; 14:26119. [PMID: 39478027 PMCID: PMC11525987 DOI: 10.1038/s41598-024-75742-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 10/08/2024] [Indexed: 11/02/2024]
Abstract
In drug discovery and repurposing, precise prediction of drug-target interactions (DTIs) is paramount. This study introduces a novel DTI prediction approach, the Multi-Layer Graph Attention Neural Network (MLGANN), a computational framework that harnesses multi-source information to enhance prediction accuracy. MLGANN constructs a multi-layer DTI network that captures both the direct interactions between drugs and targets and their multi-level information, and it combines Graph Convolutional Networks (GCN) with a self-attention mechanism to integrate diverse data sources. In comparative experiments, the method significantly outperformed existing approaches, underscoring its potential to improve the efficiency and accuracy of DTI prediction. More importantly, this study highlights the significance of considering multi-source data and network heterogeneity in the drug discovery process, offering new perspectives and tools for future pharmaceutical research.
Affiliation(s)
- Qianwen Lu
  - SDU-ANU Joint Science College, Shandong University, Weihai, 264209, Shandong, China
- Zhiheng Zhou
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100190, China
- Qi Wang
  - College of Science, China Agricultural University, Beijing, 100083, China
2
Luo Y, Duan G, Zhao Q, Bi X, Wang J. DTKGIN: Predicting drug-target interactions based on knowledge graph and intent graph. Methods 2024; 226:21-27. [PMID: 38608849 DOI: 10.1016/j.ymeth.2024.04.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 01/16/2024] [Accepted: 04/09/2024] [Indexed: 04/14/2024]
Abstract
Key Words: Knowledge graph; Intent graph; Attention mechanism
Predicting drug-target interactions (DTIs) plays a crucial role in drug discovery and drug development. Considering the high cost and risk of biological experiments, developing computational approaches to explore the interactions between drugs and targets can effectively reduce the time and cost of drug development. Recently, many methods have made significant progress in predicting DTIs. However, existing approaches still suffer from the high sparsity of DTI datasets and the cold start problem. In this paper, we develop a new model, named DTKGIN, to predict drug-target interactions via a knowledge graph and an intent graph. Our method can effectively capture biological environment information for targets and drugs by mining their associated relations in the knowledge graph and considering drug-target interactions at a fine-grained level in the intent graph. DTKGIN learns the representation of drugs and targets from the knowledge graph and the intent graph. Then the probabilities of interactions between drugs and targets are obtained through the inner product of the representation of drugs and targets. Experimental results show that our proposed method outperforms other state-of-the-art methods in 10-fold cross-validation, especially in cold-start experimental settings. Furthermore, the case studies demonstrate the effectiveness of DTKGIN in predicting potential drug-target interactions. The code is available on GitHub: https://github.com/Royluoyi123/DTKGIN.
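The final scoring step described in this abstract, an inner product between drug and target representations turned into a probability, can be sketched as follows. This is only a minimal illustration: the embedding values, the function names, and the use of a sigmoid to map the inner product into (0, 1) are assumptions, not details taken from the DTKGIN code.

```python
import math


def inner_product(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def interaction_probability(drug_emb, target_emb):
    """Map the inner product to (0, 1) with a sigmoid (assumed link function)."""
    score = inner_product(drug_emb, target_emb)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical 4-dimensional embeddings, made up for illustration.
drug = [0.5, -1.0, 0.25, 2.0]
target = [1.0, 0.5, -2.0, 0.5]
prob = interaction_probability(drug, target)
```

In a trained model the two vectors would come from the learned knowledge-graph and intent-graph representations; here they are placeholders.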
Affiliation(s)
- Yi Luo
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Guihua Duan
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Qichang Zhao
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Xuehua Bi
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
3
Liu Y, Xing L, Zhang L, Cai H, Guo M. GEFormerDTA: drug target affinity prediction based on transformer graph for early fusion. Sci Rep 2024; 14:7416. [PMID: 38548825 PMCID: PMC10979032 DOI: 10.1038/s41598-024-57879-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 03/22/2024] [Indexed: 04/01/2024]
Abstract
Predicting the interaction affinity between drugs and target proteins is crucial for rapid and accurate drug discovery and repositioning. Therefore, more accurate prediction of drug-target affinity (DTA) has become a key area of research in drug discovery and drug repositioning. However, traditional experimental methods have disadvantages such as long operation cycles, high manpower requirements, and high economic costs, making it difficult to determine specific interactions between drugs and target proteins quickly and accurately. Some computational methods mainly use the SMILES sequence of drugs and the primary structure of proteins as inputs, ignoring graph information such as bond encoding, degree centrality encoding, and spatial encoding of drug molecule graphs, as well as protein structural information such as secondary structure and accessible surface area. Moreover, previous methods learned feature representations from protein sequences, neglecting the completeness of information. To address the completeness of drug and protein structure information, we propose a Transformer graph-based early fusion approach for drug-target affinity prediction (GEFormerDTA). Our method reduces prediction errors caused by insufficient feature learning. Experimental results on the Davis and KIBA datasets showed better prediction of drug-target affinity than existing affinity prediction methods.
Affiliation(s)
- Youzhi Liu
  - Department of Computer Science and Technology, Shandong University of Technology, Zibo, 255000, China
- Linlin Xing
  - Department of Computer Science and Technology, Shandong University of Technology, Zibo, 255000, China
- Longbo Zhang
  - Department of Computer Science and Technology, Shandong University of Technology, Zibo, 255000, China
- Hongzhen Cai
  - Department of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo, 255000, China
- Maozu Guo
  - Department of Electrical and Information Engineering, Beijing University of Architecture, Beijing, 102616, China
4
Ren ZH, You ZH, Zou Q, Yu CQ, Ma YF, Guan YJ, You HR, Wang XF, Pan J. DeepMPF: deep learning framework for predicting drug-target interactions based on multi-modal representation with meta-path semantic analysis. J Transl Med 2023; 21:48. [PMID: 36698208 PMCID: PMC9876420 DOI: 10.1186/s12967-023-03876-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/16/2022] [Accepted: 01/05/2023] [Indexed: 01/26/2023]
Abstract
BACKGROUND: Drug-target interaction (DTI) prediction has become a crucial prerequisite in drug design and drug discovery. However, traditional biological experiments are time-consuming and expensive, as there are abundant complex interactions in the large genomic and chemical spaces. To alleviate this problem, many computational methods have been developed to complement biological experiments and narrow the search space to a preferred candidate domain. However, most previous approaches cannot fully consider association behavior semantic information based on several schemas to represent the complex structure of heterogeneous biological networks. Additionally, DTI prediction based on a single modality cannot satisfy the demand for prediction accuracy.
METHODS: We propose a multi-modal representation framework, 'DeepMPF', based on meta-path semantic analysis, which effectively utilizes heterogeneous information to predict DTIs. Specifically, we first construct protein-drug-disease heterogeneous networks composed of three entities. Then the feature information is obtained under three views: sequence modality, heterogeneous structure modality and similarity modality. We propose six representative meta-path schemas to preserve the high-order nonlinear structure and capture hidden structural information of the heterogeneous network. Finally, DeepMPF generates highly representative comprehensive feature descriptors and calculates the probability of interaction through joint learning.
RESULTS: To evaluate the predictive performance of DeepMPF, comparison experiments are conducted on four gold standard datasets. Our method obtains competitive performance on all datasets. We also explore the influence of different feature embedding dimensions, learning strategies and classification methods. Notably, the drug repositioning experiments on COVID-19 and HIV demonstrate that DeepMPF can be applied to real-world problems and help drug discovery. Further analysis with molecular docking experiments enhances the credibility of the drug candidates predicted by DeepMPF.
CONCLUSIONS: All the results demonstrate the effective predictive capability of DeepMPF for drug-target interactions. It can be utilized as a useful tool to prescreen the most promising drug candidates for a given protein. The web server of the DeepMPF predictor is freely available at http://120.77.11.78/DeepMPF/, which can help relevant researchers in further study.
Affiliation(s)
- Zhong-Hao Ren
  - School of Information Engineering, Xijing University, Xi’an, 710100 China (GRID: grid.460132.2; ISNI: 0000 0004 1758 0275)
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710129 China (GRID: grid.440588.5; ISNI: 0000 0001 0307 1240)
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054 China (GRID: grid.54549.39; ISNI: 0000 0004 0369 4060)
- Chang-Qing Yu
  - School of Information Engineering, Xijing University, Xi’an, 710100 China (GRID: grid.460132.2; ISNI: 0000 0004 1758 0275)
- Yan-Fang Ma
  - Department of Galactophore, The Third People’s Hospital of Gansu Province, Lanzhou, 730020 China (GRID: grid.417234.7; ISNI: 0000 0004 1808 3203)
- Yong-Jian Guan
  - School of Information Engineering, Xijing University, Xi’an, 710100 China (GRID: grid.460132.2; ISNI: 0000 0004 1758 0275)
- Hai-Ru You
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710129 China (GRID: grid.440588.5; ISNI: 0000 0001 0307 1240)
- Xin-Fei Wang
  - School of Information Engineering, Xijing University, Xi’an, 710100 China (GRID: grid.460132.2; ISNI: 0000 0004 1758 0275)
- Jie Pan
  - School of Information Engineering, Xijing University, Xi’an, 710100 China (GRID: grid.460132.2; ISNI: 0000 0004 1758 0275)
5
Peng Y, Zhao S, Zeng Z, Hu X, Yin Z. LGBMDF: A cascade forest framework with LightGBM for predicting drug-target interactions. Front Microbiol 2023; 13:1092467. [PMID: 36687573 PMCID: PMC9849804 DOI: 10.3389/fmicb.2022.1092467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 12/07/2022] [Indexed: 01/07/2023]
Abstract
Prediction of drug-target interactions (DTIs) plays an important role in drug development. However, traditional laboratory methods to determine DTIs require a lot of time and capital. In recent years, many studies have shown that using machine learning methods to predict DTIs can speed up the drug development process and reduce capital costs. An excellent DTI prediction method should have both high prediction accuracy and low computational cost. In this study, we noticed that previous research based on deep forests used XGBoost as the estimator in the cascade. We applied LightGBM instead of XGBoost as the estimator in the cascade forest, and the estimator group was determined experimentally as three LightGBMs and three ExtraTrees. This new model is called LGBMDF. We conducted 5-fold cross-validation on LGBMDF and other state-of-the-art methods using the same dataset, and compared their sensitivity (Sn), specificity (Sp), Matthews correlation coefficient (MCC), AUC and AUPR. We found that our method has better performance and faster calculation speed.
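Three of the metrics named in this abstract (Sn, Sp and MCC) follow directly from a binary confusion matrix. The sketch below uses their standard definitions; the function name and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and MCC from confusion-matrix counts."""
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "MCC": mcc}


# Made-up counts for illustration only.
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

AUC and AUPR are omitted here because they require ranked prediction scores rather than a single confusion matrix.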
6
Liu B, Papadopoulos D, Malliaros FD, Tsoumakas G, Papadopoulos AN. Multiple similarity drug-target interaction prediction with random walks and matrix factorization. Brief Bioinform 2022; 23:6692553. [PMID: 36070659 DOI: 10.1093/bib/bbac353] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/12/2022] [Revised: 07/11/2022] [Accepted: 07/27/2022] [Indexed: 11/14/2022]
Abstract
The discovery of drug-target interactions (DTIs) is a very promising area of research with great potential. The accurate identification of reliable interactions among drugs and proteins via computational methods, which typically leverage heterogeneous information retrieved from diverse data sources, can boost the development of effective pharmaceuticals. Although random walk and matrix factorization techniques are widely used in DTI prediction, they have several limitations. Random walk-based embedding generation is usually conducted in an unsupervised manner, while the linear similarity combination in matrix factorization distorts individual insights offered by different views. To tackle these issues, we take a multi-layered network approach to handle diverse drug and target similarities, and propose a novel optimization framework, called Multiple similarity DeepWalk-based Matrix Factorization (MDMF), for DTI prediction. The framework unifies embedding generation and interaction prediction, learning vector representations of drugs and targets that not only retain higher order proximity across all hyper-layers and layer-specific local invariance, but also approximate the interactions with their inner product. Furthermore, we develop an ensemble method (MDMF2A) that integrates two instantiations of the MDMF model, optimizing the area under the precision-recall curve (AUPR) and the area under the receiver operating characteristic curve (AUC), respectively. The empirical study on real-world DTI datasets shows that our method achieves statistically significant improvement over current state-of-the-art approaches in four different settings. Moreover, the validation of highly ranked non-interacting pairs also demonstrates the potential of MDMF2A to discover novel DTIs.
Affiliation(s)
- Bin Liu
  - Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  - School of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
- Fragkiskos D Malliaros
  - Paris-Saclay University, CentraleSupélec, Inria, Centre for Visual Computing (CVN), 91190 Gif-Sur-Yvette, France
- Grigorios Tsoumakas
  - School of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
7
Kusuma WA, Habibi ZI, Amir MF, Fadli A, Khotimah H, Dewanto V, Heryanto R. Bipartite graph search optimization for type II diabetes mellitus Jamu formulation using branch and bound algorithm. Front Pharmacol 2022; 13:978741. [PMID: 36034833 PMCID: PMC9403330 DOI: 10.3389/fphar.2022.978741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Accepted: 07/14/2022] [Indexed: 11/26/2022]
Abstract
Jamu is an Indonesian traditional herbal medicine that has been practiced for generations. Jamu is made from various medicinal plants. Each plant has several compounds directly related to target proteins that are directly associated with a disease. A pharmacological graph can represent the relationships between plants, compounds, and target proteins. Research on predicting Jamu formulas for some diseases has been carried out, but finding combinations or compositions of Jamu formulas is difficult because of the growth in search space size. Some studies adopted drug–target interaction (DTI) prediction, implemented using machine learning or deep learning, to discover the Jamu formula. However, this approach raises important issues, such as imbalanced and high-dimensional datasets, overfitting, and the need for more procedures to trace compounds to their plants. This study proposes an alternative approach: bipartite graph search optimization using the branch and bound algorithm to discover the combination or composition of Jamu formulas by optimizing the search on a plant–protein bipartite graph. The branch and bound technique is implemented using three search strategies: breadth first search (BrFS), depth first search, and best first search. To show the performance of the proposed method, we compared it with a complete search algorithm, which searches all nodes in the tree without pruning. In this study, we apply the proposed method specifically to the search for a Jamu formula for type II diabetes mellitus (T2DM). The results show that the bipartite graph search with the branch and bound algorithm is up to 40 times faster than the complete search strategy when searching for a composition of plants. Binary branching is the best branching choice, and BrFS is the best search strategy in this research. In addition, the proposed method can suggest compositions of one to four plants for the T2DM Jamu formula. For a combination of four plants, we obtain Angelica sinensis, Citrus aurantium, Glycyrrhiza uralensis, and Mangifera indica. This approach is expected to be an alternative way to discover the Jamu formula more accurately.
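The idea of pruning a search over plant combinations on a plant–protein bipartite graph can be sketched as below. This is a hedged illustration, not the authors' algorithm: the objective (cover as many target proteins as possible with at most `k` plants), the coverage-based bound, the depth-first traversal order, and the toy graph with placeholder plant and protein names are all assumptions. Binary branching here means each plant is either included or excluded.

```python
from itertools import combinations


def best_formula(plant_targets, k):
    """Depth-first branch and bound with binary (include/exclude) branching."""
    plants = sorted(plant_targets)
    best = {"cover": -1, "plants": ()}

    def search(i, chosen, covered):
        if len(covered) > best["cover"]:
            best["cover"], best["plants"] = len(covered), tuple(chosen)
        if i == len(plants) or len(chosen) == k:
            return
        # Optimistic bound: all proteins reachable from the remaining plants.
        reachable = set(covered)
        for p in plants[i:]:
            reachable |= plant_targets[p]
        if len(reachable) <= best["cover"]:
            return  # prune: this subtree cannot beat the incumbent
        search(i + 1, chosen + [plants[i]], covered | plant_targets[plants[i]])
        search(i + 1, chosen, covered)

    search(0, [], set())
    return best["plants"], best["cover"]


def complete_search(plant_targets, k):
    """Exhaustive baseline: evaluate every combination of up to k plants."""
    best_cover = 0
    for r in range(1, k + 1):
        for combo in combinations(sorted(plant_targets), r):
            cover = len(set().union(*(plant_targets[p] for p in combo)))
            best_cover = max(best_cover, cover)
    return best_cover


# Toy bipartite graph with placeholder names (not real plant-protein data).
graph = {
    "plant_A": {"P1", "P2"},
    "plant_B": {"P2", "P3", "P4"},
    "plant_C": {"P4", "P5"},
    "plant_D": {"P1", "P5", "P6"},
}
formula, covered_count = best_formula(graph, k=2)
```

On a graph this small the pruning saves little; the speed-up reported in the abstract concerns much larger search spaces.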
Affiliation(s)
- Wisnu Ananta Kusuma
  - Department of Computer Science, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia
  - Tropical Biopharmaca Research Center, IPB University, Bogor, Indonesia
  - *Correspondence: Wisnu Ananta Kusuma,
- Zulfahmi Ibnu Habibi
  - Department of Computer Science, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia
- Muhammad Fahmi Amir
  - Department of Computer Science, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia
- Aulia Fadli
  - Department of Computer Science, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia
- Husnul Khotimah
  - Department of Computer Science, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia
- Vektor Dewanto
  - Department of Computer Science, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia
- Rudi Heryanto
  - Tropical Biopharmaca Research Center, IPB University, Bogor, Indonesia
  - Department of Chemistry, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia
8
Machine Learning-Based Intelligent Scoring of College English Teaching in the Field of Natural Language Processing. Computational Intelligence and Neuroscience 2022; 2022:2754626. [PMID: 35965747 PMCID: PMC9371845 DOI: 10.1155/2022/2754626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Revised: 05/16/2022] [Accepted: 05/31/2022] [Indexed: 11/17/2022]
Abstract
Current education evaluation is limited not only to the mode of simplification, indexing, and datafication, but also in the scientific nature of college teaching evaluation. This work first conducts a theoretical analysis of natural language processing technology, analyzes the technologies related to intelligent scoring, designs a systematic process for intelligent scoring of college English teaching, and finally conducts theoretical research on the Naive Bayes algorithm in machine learning. In addition, the error of intelligent scoring of English teaching in colleges and universities and the accuracy of scoring and classification are analyzed. The results show that the error between manual scoring and machine scoring is about 2 points, and the minimum error of intelligent scoring in college English teaching under machine scoring can reach 0 points. There is a certain bias in manual scoring, and machine scoring can reduce this error. The Naive Bayes algorithm has the highest classification accuracy on the college intelligent scoring dataset, which is 76.43%. The weighted Naive Bayes algorithm has been improved in the classification accuracy of college English teaching intelligent scoring, with an average accuracy rate of 74.87%. To sum up, the weighted Naive Bayes algorithm has better performance in the classification accuracy of college English intelligent scoring. This work has a significant effect on scoring in the college intelligent teaching scoring system under natural language processing and on the classification of college teaching intelligent scoring under the Naive Bayes algorithm, which can improve the efficiency of college teaching scoring.
9
Qian W, Xiong C, Qian Y, Wang Y. Label enhancement-based feature selection via fuzzy neighborhood discrimination index. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022]
10
Drug-target interaction prediction via an ensemble of weighted nearest neighbors with interaction recovery. Appl Intell 2022. [DOI: 10.1007/s10489-021-02495-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/20/2022]
11
Screening of Potential Indonesia Herbal Compounds Based on Multi-Label Classification for 2019 Coronavirus Disease. Big Data and Cognitive Computing 2021. [DOI: 10.3390/bdcc5040075] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/15/2022]
Abstract
The coronavirus disease 2019 (COVID-19) pandemic spread rapidly and requires an acceleration of the drug discovery process. Drug repurposing can help accelerate drug discovery by identifying new uses for approved drugs, and it is considered an efficient and economical approach. Research in drug repurposing can be done by observing the interactions of drug compounds with proteins related to a disease, known as drug-target interactions (DTIs), and then predicting new DTIs. This study conducted multilabel DTI prediction using the stacked autoencoder-deep neural network (SAE-DNN) algorithm. Compound features were extracted using the PubChem fingerprint, Daylight fingerprint, MACCS fingerprint, and circular fingerprint. The results showed that the SAE-DNN model was able to predict DTIs in COVID-19 cases with good performance. The SAE-DNN model with the circular fingerprint dataset produced the best average metrics, with an accuracy of 0.831, recall of 0.918, precision of 0.888, and F-measure of 0.89. Herbal compound prediction using the SAE-DNN model with the circular, Daylight, and PubChem fingerprint datasets yielded 92, 65, and 79 herbal compounds contained in herbal plants in Indonesia, respectively.
12
Prediction of Drug-Target Interactions by Combining Dual-Tree Complex Wavelet Transform with Ensemble Learning Method. Molecules 2021; 26:5359. [PMID: 34500792 PMCID: PMC8433937 DOI: 10.3390/molecules26175359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Revised: 08/27/2021] [Accepted: 08/30/2021] [Indexed: 11/17/2022]
Abstract
Identification of drug–target interactions (DTIs) is vital for drug discovery. However, traditional biological approaches have some unavoidable shortcomings, such as being time-consuming and expensive. Therefore, there is an urgent need to develop novel and effective computational methods to predict DTIs in order to shorten the development cycles of new drugs. In this study, we present a novel computational approach to identify DTIs, which uses protein sequence information and the dual-tree complex wavelet transform (DTCWT). More specifically, a position-specific scoring matrix (PSSM) was computed from the target protein sequence to obtain its evolutionary information. Then, DTCWT was used to extract representative features from the PSSM, which were then combined with the drug fingerprint features to form the feature descriptors. Finally, these descriptors were fed to the Rotation Forest (RoF) model for classification. A 5-fold cross validation (CV) was adopted on four datasets (Enzyme, Ion Channel, GPCRs (G-protein-coupled receptors), and NRs (Nuclear Receptors)) to validate the proposed model; our method yielded high average accuracies of 89.21%, 85.49%, 81.02%, and 74.44%, respectively. To further verify the performance of our model, we compared the RoF classifier with two state-of-the-art algorithms: the support vector machine (SVM) and the k-nearest neighbor (KNN) classifier. We also compared it with some other published methods. Moreover, the prediction results for the independent dataset further indicated that our method is effective for predicting potential DTIs. Thus, we believe that our method is suitable for facilitating drug discovery and development.
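The 5-fold cross validation protocol mentioned in this abstract partitions the samples into five disjoint folds, each serving once as the test set. The sketch below shows only that generic partitioning step; the function name, the seed, and the sample count are hypothetical, and the abstract does not say whether the authors stratified the folds or how negative pairs were sampled.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # reproducible shuffle
    folds = [idx[i::k] for i in range(k)]     # k nearly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# Hypothetical dataset of 23 drug-target pairs.
splits = list(k_fold_indices(23, k=5))
```

A classifier would be trained on each `train` list and scored on the matching `test` list, and the five scores averaged to give the reported mean accuracy.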
13
An Ensemble Learning-Based Method for Inferring Drug-Target Interactions Combining Protein Sequences and Drug Fingerprints. BioMed Research International 2021; 2021:9933873. [PMID: 33987446 PMCID: PMC8093043 DOI: 10.1155/2021/9933873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2021] [Revised: 04/14/2021] [Accepted: 04/16/2021] [Indexed: 11/24/2022]
Abstract
Identifying drug-target interactions (DTIs) is central to related areas including drug discovery and drug repositioning. Although high-throughput biotechnologies have made tremendous progress, the indispensable clinical trials remain expensive, laborious, and intricate. Therefore, convenient and reliable computer-aided methods for inferring DTIs have become a focus of research. In this research, we propose a novel computational model integrating a pyramid histogram of oriented gradients (PHOG), Position-Specific Scoring Matrix (PSSM), and rotation forest (RF) classifier for identifying DTIs. Specifically, protein primary sequences are first converted into PSSMs to describe the potential biological evolution information. After that, PHOG is employed to mine highly representative features of the PSSM at multiple pyramid levels, and the complete descriptors of drug-target pairs are generated by combining the molecular substructure fingerprints and PHOG features. Finally, we feed the complete descriptors into the RF classifier for prediction. The 5-fold cross-validation (CV) experiments yield mean accuracies of 88.96%, 86.37%, 82.88%, and 76.92% on four gold standard data sets (enzyme, ion channel, G protein-coupled receptors (GPCRs), and nuclear receptor, respectively). Moreover, the paper also compares against the state-of-the-art light gradient boosting machine (LGBM) and support vector machine (SVM) to further verify the performance of the proposed model. The experimental outcomes substantiate that the established model is feasible and reliable for predicting DTIs. There is an excellent prospect that our model can serve as an efficient tool for predicting DTIs on a large scale.
14
Ma Y, Li Q, Hu N, Li L. SeBioGraph: Semi-supervised Deep Learning for the Graph via Sustainable Knowledge Transfer. Front Neurorobot 2021; 15:665055. [PMID: 33867966 PMCID: PMC8047129 DOI: 10.3389/fnbot.2021.665055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/07/2021] [Accepted: 03/09/2021] [Indexed: 11/17/2022]
Abstract
Semi-supervised deep learning for the biomedical graph and advanced manufacturing graph is rapidly becoming an important topic in both academia and industry. Many existing studies focus on semi-supervised link prediction and node classification, as well as the application of these methods in sustainable development and advanced manufacturing. To date, most manufacturing graph neural networks are mainly evaluated on social and information networks, where they improve the quality of network representation by integrating neighbor node descriptions. However, previous methods have not yet been comprehensively studied on biomedical networks. Traditional techniques fail to achieve satisfying results, especially when labeled nodes are few in number. In this paper, a new semi-supervised deep learning method for the biomedical graph via sustainable knowledge transfer, called SeBioGraph, is proposed. In SeBioGraph, both node embeddings and graph-specific prototype embeddings are used to characterize a transferable metric space. By incorporating prior knowledge learned from auxiliary graphs, SeBioGraph further improves performance on the target graph. Experimental results on the two-class node classification tasks and three-class link prediction tasks demonstrate that SeBioGraph achieves state-of-the-art results. Finally, the method is thoroughly evaluated.
Affiliation(s)
- Yugang Ma
  - School of Architecture and Urban Planning, Chongqing University, Chongqing, China
- Qing Li
  - School of Computer Science, Northwestern Polytechnical University, Shaanxi, China
- Nan Hu
  - School of Management Science and Real Estate, Chongqing University, Chongqing, China
- Lili Li
  - China Construction Science & Technology Group Co., Ltd. Shenzhen, China
  - College of Civil and Environmental Engineering, Harbin Institute of Technology, Harbin, China
15
Afanasyeva A, Nagao C, Mizuguchi K. Developing a Kinase-Specific Target Selection Method Using a Structure-Based Machine Learning Approach. Adv Appl Bioinform Chem 2020; 13:27-40. [PMID: 33293834 PMCID: PMC7719317 DOI: 10.2147/aabc.s278900] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/03/2020] [Accepted: 11/13/2020] [Indexed: 12/21/2022]
Abstract
Introduction: Despite recent advances in the drug discovery field, developing selective kinase inhibitors remains difficult for a number of reasons, one of which is the striking structural similarity of the ATP-binding pockets of kinases. Objective: To address this problem, we have designed a machine learning model that uses various structure-based and energy-based descriptors to better characterize protein–ligand interactions. Methods: In this work, we use a dataset of 104 human kinases with available PDB structures and experimental activity data against 1202 small-molecule compounds from the PubChem BioAssay dataset “Navigating the Kinome”. We propose structure-based interaction descriptors to build an activity-predicting machine learning model. Results and Discussion: We report a ligand-oriented computational method for accurate kinase target prioritization. Our method shows high accuracy compared to similar structure-based activity prediction methods and, more importantly, shows the same prediction accuracy when tested on a special set of structurally remote compounds, indicating that it is not biased by ligand structural similarity in the training data. We hope that our approach will be useful for the development of novel, highly selective kinase inhibitors.
Affiliation(s)
- Arina Afanasyeva
- Bioinformatics Project, National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
| | - Chioko Nagao
- Bioinformatics Project, National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan.,Institute for Protein Research, Osaka University, Osaka, Japan
| | - Kenji Mizuguchi
- Bioinformatics Project, National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan.,Institute for Protein Research, Osaka University, Osaka, Japan
| |
|
16
|
Agyemang B, Wu WP, Kpiebaareh MY, Lei Z, Nanor E, Chen L. Multi-view self-attention for interpretable drug–target interaction prediction. J Biomed Inform 2020; 110:103547. [DOI: 10.1016/j.jbi.2020.103547] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/24/2020] [Revised: 08/21/2020] [Accepted: 08/24/2020] [Indexed: 01/08/2023]
|
17
|
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Brief Bioinform 2020; 22:5910189. [PMID: 32964234 DOI: 10.1093/bib/bbaa205] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 05/23/2020] [Revised: 08/06/2020] [Accepted: 08/10/2020] [Indexed: 12/20/2022]
Abstract
Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. Machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning and thereby improve predictive performance. The key challenge faced by multi-label learning is the exponentially sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, in a method named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples than the data set that has been widely used by most previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Xiaoqi Shan
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Tianhang Chen
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Mingming Jiang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | | | - Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| |
|
18
|
Incorporating chemical sub-structures and protein evolutionary information for inferring drug-target interactions. Sci Rep 2020; 10:6641. [PMID: 32313024 PMCID: PMC7171114 DOI: 10.1038/s41598-020-62891-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/14/2020] [Accepted: 03/12/2020] [Indexed: 01/29/2023]
Abstract
Accumulating evidence has shown that drug-target interactions (DTIs) play a crucial role in the process of genomic drug discovery. Although biological experimental technology has made great progress, the identification of DTIs is still very time-consuming and expensive. Hence there is an urgent need for in silico models that supplement biological experiments by predicting potential DTIs. In this work, a new model is designed to predict DTIs by incorporating chemical sub-structures and protein evolutionary information. Specifically, we first use the Position-Specific Scoring Matrix (PSSM) to convert the protein sequence into a numerical descriptor containing biological evolutionary information, then use the Discrete Cosine Transform (DCT) algorithm to extract the hidden features and integrate them with the chemical sub-structure descriptor, and finally use a Rotation Forest (RF) classifier to predict whether there is an interaction between the drug and the target protein. In the 5-fold cross-validation (CV) experiment, the average accuracy of the proposed model on the benchmark datasets of Enzymes, Ion Channels, GPCRs and Nuclear Receptors reached 0.9140, 0.8919, 0.8724 and 0.8111, respectively. To fully evaluate the performance of the proposed model, we compare it with different feature extraction models, classifier models, and other state-of-the-art models. Furthermore, we also carried out case studies, in which 8 of the top 10 drug-target pairs with the highest prediction scores were confirmed by related databases. These results indicate that the proposed model performs well in predicting DTIs and can provide reliable candidates for biological experiments.
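The feature-construction step described in this abstract (PSSM, then DCT, then concatenation with a sub-structure descriptor) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the function names, the unnormalised DCT-II, and the choice to keep the first `keep_rows` rows of coefficients are hypothetical, and the Rotation Forest classifier is not shown.

```python
import math


def dct_1d(values):
    """Unnormalised DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(matrix):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in matrix]
    cols = [dct_1d(col) for col in zip(*rows)]  # one transformed column per entry
    return [list(r) for r in zip(*cols)]        # transpose back to rows x columns


def pssm_dct_features(pssm, keep_rows=4):
    """Fixed-length protein descriptor from an L x 20 PSSM.

    Keeps only the lowest-frequency coefficients along the sequence axis
    (assumed choice), so proteins of different lengths L yield vectors of
    the same size, provided L >= keep_rows.
    """
    coeffs = dct_2d(pssm)
    return [c for row in coeffs[:keep_rows] for c in row]


def pair_features(pssm, substructure_bits, keep_rows=4):
    """Concatenate the protein descriptor with a binary sub-structure fingerprint."""
    return pssm_dct_features(pssm, keep_rows) + [float(b) for b in substructure_bits]
```

With the default `keep_rows=4`, every protein maps to 80 values regardless of sequence length, which is what allows a single classifier to be trained on drug-target pairs of varying protein size.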
|
19
|
Pliakos K, Vens C. Drug-target interaction prediction with tree-ensemble learning and output space reconstruction. BMC Bioinformatics 2020; 21:49. [PMID: 32033537 PMCID: PMC7006075 DOI: 10.1186/s12859-020-3379-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 06/21/2019] [Accepted: 01/21/2020] [Indexed: 12/21/2022]
Abstract
Background: Computational prediction of drug-target interactions (DTI) is vital for drug discovery. The experimental identification of interactions between drugs and target proteins is very onerous. Modern technologies have mitigated the problem, facilitating the development of new drugs. However, drug development remains extremely expensive and time-consuming. Therefore, in silico DTI predictions based on machine learning can alleviate the burdensome task of drug development. Many machine learning approaches have been proposed over the years for DTI prediction. Nevertheless, prediction accuracy and efficiency are persistent problems that still need to be tackled. Here, we propose a new learning method which addresses DTI prediction as a multi-output prediction task by learning ensembles of multi-output bi-clustering trees (eBICT) on reconstructed networks. In our setting, the nodes of a DTI network (drugs and proteins) are represented by features (background information). The interactions between the nodes of a DTI network are modeled as an interaction matrix and compose the output space in our problem. The proposed approach integrates background information from both drug and target protein spaces into the same global network framework. Results: We performed an empirical evaluation, comparing the proposed approach to state-of-the-art DTI prediction methods, and demonstrated the effectiveness of the proposed approach in different prediction settings. For evaluation purposes, we used several benchmark datasets that represent drug-protein networks. We show that output space reconstruction can boost the predictive performance of tree-ensemble learning methods, yielding more accurate DTI predictions. Conclusions: We proposed a new DTI prediction method where bi-clustering trees are built on reconstructed networks. Building tree-ensemble learning models with output space reconstruction leads to superior prediction results, while preserving the advantages of tree-ensembles, such as scalability, interpretability and inductive setting.
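The multi-output framing in this abstract, where node features are the inputs and rows of the interaction matrix are the outputs, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the nearest-neighbour averaging is a deliberately simple stand-in for a multi-output learner, not the eBICT method or its output space reconstruction.

```python
def build_interaction_matrix(drugs, targets, known_pairs):
    """Binary drug x target matrix; entry (i, j) is 1 if drug i is known to hit target j."""
    d_index = {d: i for i, d in enumerate(drugs)}
    t_index = {t: j for j, t in enumerate(targets)}
    matrix = [[0] * len(targets) for _ in drugs]
    for drug, target in known_pairs:
        matrix[d_index[drug]][t_index[target]] = 1
    return matrix


def predict_new_drug(features, train_features, matrix, k=2):
    """Stand-in multi-output predictor (NOT eBICT).

    Scores every target at once for an unseen drug by averaging the
    interaction rows of the k training drugs with the closest features.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(
        range(len(train_features)),
        key=lambda i: sq_dist(features, train_features[i]),
    )[:k]
    n_targets = len(matrix[0])
    return [sum(matrix[i][j] for i in nearest) / len(nearest) for j in range(n_targets)]
```

Because the whole row is predicted in one call, an unseen drug receives a score for every target simultaneously, which corresponds to the inductive setting mentioned in the conclusions.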
Affiliation(s)
- Konstantinos Pliakos
- KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium. .,ITEC, imec research group at KU Leuven, Kortrijk, Belgium.
| | - Celine Vens
- KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium.,ITEC, imec research group at KU Leuven, Kortrijk, Belgium
| |
|