201
Ding Y, Tang J, Guo F. The Computational Models of Drug-target Interaction Prediction. Protein Pept Lett 2020; 27:348-358. [PMID: 30968771 DOI: 10.2174/0929866526666190410124110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/04/2019] [Revised: 02/22/2019] [Accepted: 04/02/2019] [Indexed: 12/19/2022]
Abstract
The identification of drug-target interactions (DTIs) is an important step in drug discovery and medical research. However, traditional experimental methods for identifying DTIs remain time-consuming, extremely expensive, and challenging. Over the past ten years, various computational methods have been developed to identify potential DTIs. This paper summarizes these identification methods and introduces several state-of-the-art computational approaches, covering network-based and machine learning-based methods. In particular, the machine learning-based methods, which include supervised and semi-supervised models, differ essentially in how they treat negative samples. Although these computational models have achieved significant improvements in identifying DTIs, network-based and machine learning-based methods each have their own disadvantages. The methods are evaluated on four benchmark data sets using the Area Under the Precision-Recall curve (AUPR).
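As a rough illustration of the AUPR metric mentioned in this abstract, the sketch below computes the step-wise area under the precision-recall curve (often called average precision) for a ranked list of drug-target pairs. The function name `average_precision` and the toy labels and scores are assumptions made for this example; the reviewed methods may use a different estimator, such as trapezoidal interpolation.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 1 for a known interaction, 0 otherwise (hypothetical toy data).
    scores: predicted interaction scores, higher means more confident.
    """
    # Rank pairs from the highest to the lowest predicted score.
    # sorted() is stable, so tied scores keep their input order.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at the rank where recall increases.
            precision_sum += hits / rank
    return precision_sum / total_positives


if __name__ == "__main__":
    toy_labels = [1, 0, 1, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.1]
    print(average_precision(toy_labels, toy_scores))
```

For the toy input, the two positives are found at ranks 1 and 3, so the value is (1/1 + 2/3) / 2, about 0.833. AUPR is commonly preferred over AUC for DTI data because known interactions are far rarer than unlabeled pairs.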
Affiliation(s)
- Yijie Ding
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jijun Tang
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, United States
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Fei Guo
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China

202
Gao D, Chen Q, Zeng Y, Jiang M, Zhang Y. Applications of Machine Learning in Drug Target Discovery. Curr Drug Metab 2020; 21:790-803. [PMID: 32723266 DOI: 10.2174/1567201817999200728142023] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/19/2020] [Revised: 03/12/2020] [Accepted: 05/13/2020] [Indexed: 12/15/2022]
Abstract
Drug target discovery is a critical step in drug development. It is the basis of modern drug development because it determines the target molecules related to specific diseases in advance. Predicting drug targets by computational methods saves a great deal of financial and material resources compared to in vitro experiments. Therefore, several computational methods for drug target discovery have been designed. Recently, machine learning (ML) methods in biomedicine have developed rapidly. In this paper, we present an overview of drug target discovery methods based on machine learning. Considering that some machine learning methods integrate network analysis to predict drug targets, network-based methods are also introduced in this article. Finally, the challenges and future outlook of drug target discovery are discussed.
Affiliation(s)
- Dongrui Gao
  - School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Qingyuan Chen
  - School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Yuanqi Zeng
  - School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Meng Jiang
  - School of Mechanical Automotive Engineering, Nanyang Institute of Technology, Nanyang 473000, China
- Yongqing Zhang
  - School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China

203
Wu Z, Lawrence PJ, Ma A, Zhu J, Xu D, Ma Q. Single-Cell Techniques and Deep Learning in Predicting Drug Response. Trends Pharmacol Sci 2020; 41:1050-1065. [PMID: 33153777 PMCID: PMC7669610 DOI: 10.1016/j.tips.2020.10.004] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 08/05/2020] [Revised: 10/04/2020] [Accepted: 10/09/2020] [Indexed: 12/19/2022]
Abstract
Rapidly developing single-cell sequencing analyses produce more comprehensive profiles of the genomic, transcriptomic, and epigenomic heterogeneity of tumor subpopulations than do traditional bulk sequencing analyses. Moreover, single-cell techniques allow the response of a tumor to drug exposure to be more thoroughly investigated. Deep learning (DL) models have successfully extracted features from complex bulk sequence data to predict drug responses. We review recent innovations in single-cell technologies and DL-based approaches related to drug sensitivity predictions. We believe that, by using insights from bulk sequence data, deep transfer learning (DTL) can facilitate the use of single-cell data for training superior DL-based drug prediction models.
Affiliation(s)
- Zhenyu Wu
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- Patrick J Lawrence
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- Anjun Ma
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- Jian Zhu
  - Department of Pathology, The Ohio State University, Columbus, OH 43210, USA
- Dong Xu
  - Department of Electrical Engineering and Computer Science, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Qin Ma
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA

204
Huang L, Luo H, Li S, Wu FX, Wang J. Drug-drug similarity measure and its applications. Brief Bioinform 2020; 22:5956929. [PMID: 33152756 DOI: 10.1093/bib/bbaa265] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/01/2020] [Revised: 09/13/2020] [Accepted: 09/14/2020] [Indexed: 02/01/2023] Open
Abstract
Drug similarities play an important role in modern biology and medicine, as they help scientists gain deep insights into drugs' therapeutic mechanisms and conduct wet-lab experiments that may significantly improve the efficiency of drug research and development. A number of drug-related databases have now been constructed, and many methods built on them compute similarities between drugs in order to study associations between drugs, human diseases, proteins (drug targets) and more. In this review, we first briefly introduce the publicly available drug-related databases. Second, we summarize similarity calculation methods in detail, organized by drug features, interaction relationships and multimodal data. Then, we discuss the applications of drug similarities in various biological and medical areas. Finally, we evaluate drug similarity calculation methods with common evaluation metrics to illustrate the important roles of drug similarity measures in different applications.
Affiliation(s)
- Lan Huang
  - Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
- Huimin Luo
  - School of Computer and Information Engineering at Henan University, Kaifeng, China
- Suning Li
  - Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Fang-Xiang Wu
  - College of Engineering and Department of Computer Sciences, University of Saskatchewan, Saskatoon, Canada
- Jianxin Wang
  - Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China

205
Li P, Li Y, Hsieh CY, Zhang S, Liu X, Liu H, Song S, Yao X. TrimNet: learning molecular representation from triplet messages for biomedicine. Brief Bioinform 2020; 22:5955940. [PMID: 33147620 DOI: 10.1093/bib/bbaa266] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 08/04/2020] [Revised: 09/11/2020] [Accepted: 09/14/2020] [Indexed: 12/15/2022] Open
Abstract
MOTIVATION: Computational methods accelerate drug discovery and play an important role in biomedicine, for example in molecular property prediction and compound-protein interaction (CPI) identification. A key challenge is to learn useful molecular representations. In the early years, molecular properties were mainly calculated by quantum mechanics or predicted by traditional machine learning methods, which required expert knowledge and was often labor-intensive. Nowadays, graph neural networks have received significant attention because of their powerful ability to learn representations from graph data. Nevertheless, current graph-based methods have some limitations that need to be addressed, such as large parameter counts and insufficient extraction of bond information. RESULTS: In this study, we proposed a graph-based approach, named triplet message networks (TrimNet), that employs a novel triplet message mechanism to learn molecular representations efficiently. We show that TrimNet can accurately complete multiple molecular representation learning tasks, including quantum property, bioactivity, physiology and CPI prediction, with a significant reduction in parameters. In the experiments, TrimNet outperforms the previous state-of-the-art method by a significant margin on various datasets. Besides having few parameters and high prediction accuracy, TrimNet can focus on the atoms essential to the target properties, providing a clear interpretation of the prediction tasks. These advantages have established TrimNet as a powerful and useful computational tool for the challenging problem of molecular representation learning. AVAILABILITY: The quantum and drug datasets are available on the website of MoleculeNet: http://moleculenet.ai. The source code is available on GitHub: https://github.com/yvquanli/trimnet. CONTACT: xjyao@lzu.edu.cn, songsen@tsinghua.edu.cn.
Affiliation(s)
- Pengyong Li
  - Department of Biomedical Engineering at Tsinghua University
- Yuquan Li
  - College of Chemistry and Chemical Engineering at Lanzhou University

206
Cong H, Liu H, Chen Y, Cao Y. Self-evoluting framework of deep convolutional neural network for multilocus protein subcellular localization. Med Biol Eng Comput 2020; 58:3017-3038. [PMID: 33078303 DOI: 10.1007/s11517-020-02275-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/02/2019] [Accepted: 10/14/2020] [Indexed: 12/12/2022]
Abstract
In the present paper, a deep convolutional neural network (DCNN) is applied to multilocus protein subcellular localization, as it is more suitable for multi-class classification. There are two main problems with this application. First, appropriate features for the correlation between multiple sites are hard to find. Second, the classifier structure is difficult to determine, as it is greatly affected by the distribution of the classified data. To solve these problems, a self-evoluting framework using DCNNs for multilocus protein subcellular localization is proposed. It has three characteristics that previous algorithms lack. The first is that it combines the ant colony algorithm with the DCNN to form a self-evoluting algorithm for multilocus protein subcellular localization. The second is that it randomly groups subcellular sites using a limited random k-labelsets (RAkEL) multi-label classification method; it also solves complex problems with a divide-and-conquer approach and proposes a flexible expansion model. The third is that it applies random selection of feature extraction methods during localization, which avoids the defects of any individual feature extraction method. The algorithm is tested on the human database, and the overall accuracy is 67.17%, which is higher than that of the stacked autoencoder (SAE), support vector machine (SVM), random forest classifier (RF), or a single deep convolutional neural network.
Graphical abstract: The algorithm presented in this paper comprises four parts: protein sequence data preprocessing, integrated DCNN model construction, finding the optimal DCNN combination by ant colony optimization, and protein subcellular localization of sequences. The parts run sequentially, and the data obtained in each part is the basis for the next. In data preprocessing, the limited RAkEL multi-label classification method is used to randomly group subcellular sites. At the same time, feature fusion of protein sequences is carried out using multiple feature extraction methods. Each combination of features and site information corresponds to a DCNN model. In the ant colony optimization step, the main purpose is to find the best combination of DCNN models through the global optimization ability of the ant colony algorithm. The localization step then obtains the multilocus subcellular localization from the optimal model combination.
Affiliation(s)
- Hanhan Cong
  - School of Information Science and Engineering, Shandong Normal University, No. 88, Wenhua East Road, Jinan City, China
  - Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Shandong Normal University, Jinan, China
- Hong Liu
  - School of Information Science and Engineering, Shandong Normal University, No. 88, Wenhua East Road, Jinan City, China
  - Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Shandong Normal University, Jinan, China
- Yuehui Chen
  - School of Information Science and Engineering, University of Jinan, Jinan, China
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
- Yi Cao
  - School of Information Science and Engineering, University of Jinan, Jinan, China
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China

207
Zhiming C, Daming L, Lianbing D. Risk evaluation of urban rainwater system waterlogging based on neural network and dynamic hydraulic model. Journal of Intelligent & Fuzzy Systems 2020. [DOI: 10.3233/jifs-189045] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/30/2022]
Abstract
With the rapid development of urban construction and increasing urbanization, urban waterlogging is becoming an increasingly significant problem despite intensified construction of drainage systems. In this paper, the authors analyze the risk evaluation of urban rainwater system waterlogging based on a neural network and a dynamic hydraulic model. The article introduces the concept of risk into the study of urban waterlogging, uses computer simulation to model rainwater systems under different conditions, and conducts an urban waterlogging risk assessment. Because urban waterlogging is a fuzzy phenomenon affected by a variety of factors, it requires comprehensive evaluation, and the fuzzy comprehensive evaluation method is therefore well suited to the risk evaluation problem. To put drainage and waterlogging prevention planning on a more scientific footing, sponge cities should gradually establish rainwater impact assessment and waterlogging risk evaluation systems, comprehensively evaluate the current capacity of urban drainage and waterlogging prevention facilities and the associated waterlogging risks, draw a map of urban rainwater and waterlogging risks, and determine the risk level. At the same time, drainage and waterlogging prevention zones and risk management zones should be delineated to provide effective technical support for the formulation of drainage and storm waterlogging prevention plans and for emergency management.
Affiliation(s)
- Cai Zhiming
  - Institute of Data Science, City University of Macau, China
- Li Daming
  - Institute of Data Science, City University of Macau, China
  - The Post-Doctoral Research Center of Zhuhai Da Hengqin Science and Technology Development Co., Ltd, China
- Deng Lianbing
  - Zhuhai Da Hengqin Science and Technology Development Co., Ltd, Hengqin New Area, China

208
He S, Wen Y, Yang X, Liu Z, Song X, Huang X, Bo X. PIMD: An Integrative Approach for Drug Repositioning Using Multiple Characterization Fusion. Genomics Proteomics & Bioinformatics 2020; 18:565-581. [PMID: 33075523 PMCID: PMC8377380 DOI: 10.1016/j.gpb.2018.10.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/09/2018] [Revised: 09/21/2018] [Accepted: 10/10/2018] [Indexed: 11/28/2022]
Abstract
The accumulation of various types of drug informatics data and computational approaches for drug repositioning can accelerate pharmaceutical research and development. However, the integration of multi-dimensional drug data for precision repositioning remains a pressing challenge. Here, we propose a systematic framework named PIMD to predict drug therapeutic properties by integrating multi-dimensional data for drug repositioning. In PIMD, drug similarity networks (DSNs) based on chemical, pharmacological, and clinical data are fused into an integrated DSN (iDSN) composed of many clusters. Rather than simple fusion, PIMD offers a systematic way to annotate clusters. Unexpected drugs within clusters and drug pairs with a high iDSN similarity score are therefore identified to predict novel therapeutic uses. PIMD provides new insights into the universality, individuality, and complementarity of different drug properties by evaluating the contribution of each type of property data. To test the performance of PIMD, we use chemical, pharmacological, and clinical properties to generate an iDSN. Analyses of the contributions of each drug property indicate that this iDSN is driven by all data types and performs better than other DSNs. Within the top 20 recommended drug pairs, 7 drugs have been reported to be repurposed. The source code for PIMD is available at https://github.com/Sepstar/PIMD/.
Affiliation(s)
- Song He
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Yuqi Wen
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xiaoxi Yang
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Zhen Liu
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xinyu Song
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xin Huang
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xiaochen Bo
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, China

209
Yang S, Ye Q, Ding J, Yin, Lu A, Chen X, Hou T, Cao D. Current advances in ligand‐based target prediction. WIREs Computational Molecular Science 2020. [DOI: 10.1002/wcms.1504] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/25/2022]
Affiliation(s)
- Su‐Qing Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan, China
- Qing Ye
  - College of Pharmaceutical Sciences, Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Jun‐Jie Ding
  - Beijing Institute of Pharmaceutical Chemistry, Beijing, China
- Yin
  - Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Ai‐Ping Lu
  - Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China
- Xiang Chen
  - Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Ting‐Jun Hou
  - College of Pharmaceutical Sciences, Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Dong‐Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan, China
  - Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China

210
Hasan Mahmud SM, Chen W, Jahan H, Dai B, Din SU, Dzisoo AM. DeepACTION: A deep learning-based method for predicting novel drug-target interactions. Anal Biochem 2020; 610:113978. [PMID: 33035462 DOI: 10.1016/j.ab.2020.113978] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/04/2020] [Revised: 09/23/2020] [Accepted: 09/25/2020] [Indexed: 12/13/2022]
Abstract
Drug-target interactions (DTIs) play a key role in drug development and discovery processes. Wet-lab identification of DTIs is time-consuming, expensive, and tedious. Fortunately, computational approaches can identify new interactions (drug-target pairs) and accelerate the process of drug repurposing. However, a vast number of interactions remain undiscovered; therefore, we proposed a deep learning-based method (deepACTION) for predicting potential or unknown DTIs. Here, each drug chemical structure and protein sequence is transformed according to structural and sequence information using different descriptors to represent its features correctly. The prediction process poses some challenges, such as the high dimensionality and class imbalance of the data. To address these problems, we developed the MMIB technique to balance the majority and minority instances in the dataset and utilized a LASSO model to handle the high dimensionality of the data. In addition, we trained a convolutional neural network on the balanced and reduced features for accurate prediction of DTIs. In this study, the AUC is used as the primary evaluation metric for comparing the performance of the deepACTION model with that of existing methods in a 5-fold cross-validation test. Our experimental dataset was obtained from the DrugBank database, and the deepACTION model achieved an AUC of 0.9836 on this dataset. The experimental results indicate that the model can predict significant numbers of new DTIs and provide information to motivate scientists to develop drugs.
Affiliation(s)
- S M Hasan Mahmud
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Wenyu Chen
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Hosney Jahan
  - College of Computer Science, Sichuan University, Chengdu, 610065, China
- Bo Dai
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Salah Ud Din
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Anthony Mackitz Dzisoo
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 611731, China

211
Agyemang B, Wu WP, Kpiebaareh MY, Lei Z, Nanor E, Chen L. Multi-view self-attention for interpretable drug–target interaction prediction. J Biomed Inform 2020; 110:103547. [DOI: 10.1016/j.jbi.2020.103547] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/24/2020] [Revised: 08/21/2020] [Accepted: 08/24/2020] [Indexed: 01/08/2023]

212
A multimodal deep learning-based drug repurposing approach for treatment of COVID-19. Mol Divers 2020; 25:1717-1730. [PMID: 32997257 PMCID: PMC7525234 DOI: 10.1007/s11030-020-10144-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/14/2020] [Accepted: 09/12/2020] [Indexed: 12/12/2022]
Abstract
Recently, various computational methods have been proposed to find new therapeutic applications of the existing drugs. The Multimodal Restricted Boltzmann Machine approach (MM-RBM), which has the capability to connect the information about the multiple modalities, can be applied to the problem of drug repurposing. The present study utilized MM-RBM to combine two types of data, including the chemical structures data of small molecules and differentially expressed genes as well as small molecules perturbations. In the proposed method, two separate RBMs were applied to find out the features and the specific probability distribution of each datum (modality). Besides, RBM was used to integrate the discovered features, resulting in the identification of the probability distribution of the combined data. The results demonstrated the significance of the clusters acquired by our model. These clusters were used to discover the medicines which were remarkably similar to the proposed medications to treat COVID-19. Moreover, the chemical structures of some small molecules as well as dysregulated genes’ effect led us to suggest using these molecules to treat COVID-19. The results also showed that the proposed method might prove useful in detecting the highly promising remedies for COVID-19 with minimum side effects. All the source codes are accessible using https://github.com/LBBSoft/Multimodal-Drug-Repurposing.git
Electronic supplementary material The online version of this article (10.1007/s11030-020-10144-9) contains supplementary material, which is available to authorized users.
213
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Brief Bioinform 2020; 22:5910189. [PMID: 32964234 DOI: 10.1093/bib/bbaa205] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 05/23/2020] [Revised: 08/06/2020] [Accepted: 08/10/2020] [Indexed: 12/20/2022] Open
Abstract
Identifying drug-target interactions (DTIs) is an important step in drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. Machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning and thereby improve predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification for DTI prediction by introducing community detection methods, in an approach named DTI-MLCD. Moreover, we updated the gold standard data set, which has been widely used by most previously published DTI prediction methods since 2008, by adding 15,000 more positive DTI samples. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.
Affiliation(s)
- Yanyi Chu
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiaoqi Shan
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Tianhang Chen
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Mingming Jiang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yanjing Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Qiankun Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University

214
Low ZY, Farouk IA, Lal SK. Drug Repositioning: New Approaches and Future Prospects for Life-Debilitating Diseases and the COVID-19 Pandemic Outbreak. Viruses 2020; 12:E1058. [PMID: 32972027 PMCID: PMC7551028 DOI: 10.3390/v12091058] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 07/03/2020] [Revised: 08/02/2020] [Accepted: 08/21/2020] [Indexed: 02/06/2023] Open
Abstract
Traditionally, drug discovery utilises a de novo design approach, which is costly and requires many years of development before a drug reaches the market. Novel drug development does not always account for orphan diseases, which have low demand and hence low profit margins for drug developers. Recently, drug repositioning has gained recognition as an alternative approach that explores new uses for pre-existing commercially approved or rejected drugs to treat diseases other than the intended ones. Drug repositioning lowers overall development expenses and risk, as the efficacy and safety of the original drug have already been well assessed and approved by regulatory authorities. The greatest advantage of drug repositioning is that it breathes new life into the treatment of novel, rare, orphan, and resistant diseases, such as Cushing's syndrome and HIV infection, and of pandemic outbreaks such as COVID-19. Repositioning existing drugs such as Hydroxychloroquine, Remdesivir, Ivermectin and Baricitinib shows good potential for COVID-19 treatment. This can crucially aid in resolving outbreaks in urgent times of need. This review discusses past successes in drug repositioning, current technological advances in the field, drug repositioning for personalised medicine, and ongoing research on newly emerging drugs under consideration for COVID-19 treatment.
Collapse
Affiliation(s)
- Zheng Yao Low
- School of Science, Monash University, Bandar Sunway, Subang Jaya 47500, Selangor Darul Ehsan, Malaysia; (Z.Y.L.); (I.A.F.)
| | - Isra Ahmad Farouk
- School of Science, Monash University, Bandar Sunway, Subang Jaya 47500, Selangor Darul Ehsan, Malaysia; (Z.Y.L.); (I.A.F.)
| | - Sunil Kumar Lal
- School of Science, Monash University, Bandar Sunway, Subang Jaya 47500, Selangor Darul Ehsan, Malaysia; (Z.Y.L.); (I.A.F.)
- Tropical Medicine & Biology Platform, Monash University, Subang Jaya 47500, Selangor Darul Ehsan, Malaysia
| |
|
215
|
Peng J, Li J, Shang X. A learning-based method for drug-target interaction prediction based on feature representation learning and deep neural network. BMC Bioinformatics 2020; 21:394. [PMID: 32938374 PMCID: PMC7495825 DOI: 10.1186/s12859-020-03677-1] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Drug-target interaction prediction is of great significance for narrowing down the scope of candidate medications, and thus is a vital step in drug discovery. Because of the particularity of biochemical experiments, the development of new drugs is not only costly, but also time-consuming. Therefore, the computational prediction of drug-target interactions has become an essential part of the drug discovery process, aiming to greatly reduce the experimental cost and time. RESULTS We propose a learning-based method based on feature representation learning and a deep neural network, named DTI-CNN, to predict drug-target interactions. We first extract the relevant features of drugs and proteins from heterogeneous networks by using the Jaccard similarity coefficient and a random walk with restart model. Then, we adopt a denoising autoencoder model to reduce the dimension and identify the essential features. Third, based on the features obtained from the previous step, we construct a convolutional neural network model to predict the interaction between drugs and proteins. The evaluation results show that the average AUROC score and AUPR score of DTI-CNN were 0.9416 and 0.9499, a better performance than that of the other three existing state-of-the-art methods. CONCLUSIONS All the experimental results show that the performance of DTI-CNN is better than that of the three existing methods and that the proposed method is appropriately designed.
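As a rough illustration of the first feature-extraction step named in the abstract above, the sketch below computes a Jaccard similarity coefficient between two sets and runs a random walk with restart on a small graph. The function names, the restart probability of 0.5, and the toy adjacency matrix are assumptions made for illustration; the abstract does not state how DTI-CNN builds its networks or which parameters it uses.

```python
def jaccard(a, b):
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that jumps back to
    `seed` with probability `restart` at every step.

    adj is a square list-of-lists adjacency matrix (hypothetical toy input).
    """
    n = len(adj)
    # Column-normalise so each column is a transition distribution.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    start = [0.0] * n
    start[seed] = 1.0
    p = start[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * start[i] for i in range(n)]
        converged = sum(abs(x - y) for x, y in zip(nxt, p)) < tol
        p = nxt
        if converged:
            break
    return p


# Toy undirected graph with four nodes (illustrative only).
toy_adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
profile = random_walk_with_restart(toy_adj, seed=0)
```

The resulting probability vector can serve as a per-node feature profile; in a pipeline like the one described, such profiles would then be compressed and passed to a classifier.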
Affiliation(s)
- Jiajie Peng
- The School of Computer Science, Northwestern Polytechnical University, Xian, 710072, China.,The Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xian, 710072, China
| | - Jingyi Li
- The School of Computer Science, Northwestern Polytechnical University, Xian, 710072, China.,The Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xian, 710072, China
| | - Xuequn Shang
- The School of Computer Science, Northwestern Polytechnical University, Xian, 710072, China.,The Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xian, 710072, China.
| |
|
216
|
Ji BY, You ZH, Jiang HJ, Guo ZH, Zheng K. Prediction of drug-target interactions from multi-molecular network based on LINE network representation method. J Transl Med 2020; 18:347. [PMID: 32894154 PMCID: PMC7487884 DOI: 10.1186/s12967-020-02490-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/03/2020] [Accepted: 08/20/2020] [Indexed: 12/28/2022] Open
Abstract
Background The prediction of potential drug-target interactions (DTIs) not only provides a better comprehension of biological processes but also is critical for identifying new drugs. However, because traditional experiments are expensive and highly time-consuming, only a small fraction of the interactions between drugs and targets in the database have been verified experimentally. Therefore, it is meaningful and important to develop new computational methods with good performance for DTI prediction. At present, many existing computational methods utilize only a single type of interaction between drugs and proteins, without paying attention to associations with and influences from other types of molecules. Methods In this work, we developed a novel network embedding-based heterogeneous information integration model to predict potential drug-target interactions. Firstly, a heterogeneous multi-molecular information network is built by combining the known associations among protein, drug, lncRNA, disease, and miRNA. Secondly, the Large-scale Information Network Embedding (LINE) model is used to learn behavior information (associations with other nodes) of drugs and proteins in the network. Hence, the known drug-protein interaction pairs can be represented as a combination of their attribute information (e.g. protein sequence information and drug molecular fingerprints) and behavior information. Thirdly, the Random Forest classifier is used for training and prediction. Results Under five-fold cross validation, our method obtained 85.83% prediction accuracy with 80.47% sensitivity and an AUC of 92.33%. Moreover, in the case studies of three common drugs, 8 (Caffeine), 7 (Clozapine) and 6 (Pioglitazone) of the top 10 candidate targets were verified to be associated with the corresponding drugs.
Conclusions In short, these results indicate that our method can be a powerful tool for predicting potential drug-target interactions and finding unknown targets for certain drugs or unknown drugs for certain targets.
Affiliation(s)
- Bo-Ya Ji
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China.,University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China.,University of Chinese Academy of Sciences, Beijing, 100049, China.
| | - Han-Jing Jiang
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China.,University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhen-Hao Guo
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China.,University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Kai Zheng
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| |
|
217
|
Zhao L, Ciallella HL, Aleksunes LM, Zhu H. Advancing computer-aided drug discovery (CADD) by big data and data-driven machine learning modeling. Drug Discov Today 2020; 25:1624-1638. [PMID: 32663517 PMCID: PMC7572559 DOI: 10.1016/j.drudis.2020.07.005] [Citation(s) in RCA: 84] [Impact Index Per Article: 16.8] [Received: 02/19/2020] [Revised: 06/26/2020] [Accepted: 07/06/2020] [Indexed: 02/06/2023]
Abstract
Advancing a new drug to market requires substantial investments in time as well as financial resources. Crucial bioactivities for drug candidates, including their efficacy, pharmacokinetics (PK), and adverse effects, need to be investigated during drug development. With advancements in chemical synthesis and biological screening technologies over the past decade, a large amount of biological data points for millions of small molecules have been generated and are stored in various databases. These accumulated data, combined with new machine learning (ML) approaches, such as deep learning, have shown great potential to provide insights into relevant chemical structures to predict in vitro, in vivo, and clinical outcomes, thereby advancing drug discovery and development in the big data era.
Affiliation(s)
- Linlin Zhao
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
| | - Heather L Ciallella
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
| | - Lauren M Aleksunes
- Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ 08854, USA
| | - Hao Zhu
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA; Department of Chemistry, Rutgers University, Camden, NJ 08102, USA.
| |
|
218
|
Mervin LH, Afzal AM, Engkvist O, Bender A. Comparison of Scaling Methods to Obtain Calibrated Probabilities of Activity for Protein–Ligand Predictions. J Chem Inf Model 2020; 60:4546-4559. [DOI: 10.1021/acs.jcim.0c00476] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/16/2022]
Affiliation(s)
- Lewis H. Mervin
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Cambridge CB2 0AA, U.K
| | - Avid M. Afzal
- Data Sciences & Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge CB2 0AA, U.K
| | - Ola Engkvist
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Mölndal SE-431 83, Sweden
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge CB2 1TN, U.K
| |
|
219
|
Liang S, Yu H. Revealing new therapeutic opportunities through drug target prediction: a class imbalance-tolerant machine learning approach. Bioinformatics 2020; 36:4490-4497. [PMID: 32399556 PMCID: PMC7750999 DOI: 10.1093/bioinformatics/btaa495] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/10/2019] [Revised: 02/18/2020] [Accepted: 05/06/2020] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION In silico drug target prediction provides valuable information for drug repurposing, understanding of side effects as well as expansion of the druggable genome. In particular, discovery of actionable drug targets is critical to developing targeted therapies for diseases. RESULTS Here, we develop a robust method for drug target prediction by leveraging a class imbalance-tolerant machine learning framework with a novel training scheme. We incorporate novel features, including drug-gene phenotype similarity and gene expression profile similarity that capture information orthogonal to other features. We show that our classifier achieves robust performance and is able to predict gene targets for new drugs as well as drugs that potentially target unexplored genes. By providing newly predicted drug-target associations, we uncover novel opportunities of drug repurposing that may benefit cancer treatment through action on either known drug targets or currently undrugged genes. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
| |
|
220
|
Wang C, Wang W, Lu K, Zhang J, Chen P, Wang B. Predicting Drug-Target Interactions with Electrotopological State Fingerprints and Amphiphilic Pseudo Amino Acid Composition. Int J Mol Sci 2020; 21:ijms21165694. [PMID: 32784497 PMCID: PMC7570185 DOI: 10.3390/ijms21165694] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/18/2020] [Revised: 08/05/2020] [Accepted: 08/06/2020] [Indexed: 12/13/2022] Open
Abstract
The task of drug-target interaction (DTI) prediction plays an important role in drug development. Experimental methods for identifying DTIs are time-consuming, expensive and challenging. To address these problems, machine learning-based methods have been introduced, but they are limited by the need for effective feature extraction and negative sampling. In this work, features based on electrotopological state (E-state) fingerprints for drugs and amphiphilic pseudo amino acid composition (APAAC) for target proteins are tested. E-state fingerprints are extracted based on both molecular electronic and topological features with the same metric. APAAC is an extension of amino acid composition (AAC), which is calculated based on hydrophilic and hydrophobic characters to construct sequence order information. Using the combination of these feature pairs, the prediction model is established with support vector machines. In order to enhance the effectiveness of the features, a distance-based negative sampling is proposed to obtain reliable negative samples. The area under the Receiver Operating Characteristic curve (AUC) is above 98.5% for all three datasets in this work. The comparison with state-of-the-art methods demonstrates the effectiveness and efficiency of the proposed method, which will be helpful for further drug development.
Affiliation(s)
- Cheng Wang
- Department of Computer Science & Technology, Tongji University, Shanghai 201804, China
| | - Wenyan Wang
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan 243002, China; (W.W.); (K.L.)
- Key Laboratory of Power Electronics and Motion Control Anhui Education Department, Ma’anshan 243032, China
| | - Kun Lu
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan 243002, China; (W.W.); (K.L.)
| | - Jun Zhang
- Institutes of Physical Science and Information Technology & School of Internet, Anhui University, Hefei 230601, China
| | - Peng Chen
- Institutes of Physical Science and Information Technology & School of Internet, Anhui University, Hefei 230601, China
- Correspondence: (P.C.); (B.W.)
| | - Bing Wang
- Department of Computer Science & Technology, Tongji University, Shanghai 201804, China
- School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan 243002, China; (W.W.); (K.L.)
- Key Laboratory of Power Electronics and Motion Control Anhui Education Department, Ma’anshan 243032, China
- Correspondence: (P.C.); (B.W.)
| |
|
221
|
Abstract
The current global COVID-19 pandemic, caused by the SARS-CoV-2 virus, has already inflicted insurmountable damage on both human lives and the global economy. There is an immediate need to identify effective drugs to contain the disastrous virus outbreak. Global efforts are already underway on a war footing to identify the best drug combination to address the disease. In this review, an attempt has been made to understand the SARS-CoV-2 life cycle, and based on this information potential druggable targets against SARS-CoV-2 are summarized. Also, the strategies for ongoing and future drug discovery against the SARS-CoV-2 virus are outlined. Given the urgency to find a definitive cure, ongoing drug repurposing efforts being carried out by various organizations are also described. The unprecedented crisis requires extraordinary efforts from the scientific community to effectively address the issue and prevent further loss of human lives and health.
Affiliation(s)
- Ambrish Saxena
- Indian Institute of Technology Tirupati, Tirupati, India
| |
|
222
|
Wang W, Lv H, Zhao Y. Predicting DNA binding protein-drug interactions based on network similarity. BMC Bioinformatics 2020; 21:322. [PMID: 32689927 PMCID: PMC7372772 DOI: 10.1186/s12859-020-03664-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/25/2019] [Accepted: 07/15/2020] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND The study of DNA binding protein (DBP)-drug interactions can open a breakthrough for the treatment of genetic diseases and cancers. Currently, network-based methods are widely used for protein-drug interaction prediction, and many hidden relationships can be found through network analysis. We proposed a DCA (drug-cluster association) model for predicting DBP-drug interactions. The clusters group drug-binding site trimers by the similarity of their physicochemical properties. First, DBP-drug binding sites are extracted from the scPDB database. Second, each binding site is represented as trimers obtained by sliding a window along the binding site. Third, the trimers are clustered based on their physicochemical properties. Fourth, we build the network by generating the interaction matrix that represents the DCA network. Fifth, three link prediction methods are evaluated on the network. Finally, the common neighbor (CN) method is selected to predict drug-cluster associations in the DBP-drug network model. RESULT This network shows that drugs tend to bind to positively charged sites and the binding process is more likely to occur inside the DBPs. The results of the link prediction indicate that the CN method has better prediction performance than the PA and JA methods. The DBP-drug network prediction model is generated by using the CN method, which predicted drug-trimer interactions and DBP-drug interactions more accurately. For example, we found that Erythromycin (ERY) can establish an interaction relationship with the HTH-type transcriptional repressor, which agrees well with in silico DBP-drug prediction. CONCLUSION Drug and protein binding is a local event. The drug-DBP binding site represents this local binding event, which helps to understand the mechanism of DBP-drug interactions.
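The three link prediction scores compared in the abstract above can be sketched on a generic undirected graph as follows. Reading "PA" as preferential attachment and "JA" as the Jaccard index is an assumption, as are the function names and the toy edge list; the abstract does not say how the authors adapt these scores to their drug-cluster network.

```python
def build_adjacency(edges):
    """Map each node to the set of its neighbours in an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """CN score: number of neighbours shared by u and v."""
    return len(adj[u] & adj[v])


def preferential_attachment(adj, u, v):
    """PA score (assumed meaning of 'PA'): product of the two degrees."""
    return len(adj[u]) * len(adj[v])


def jaccard_index(adj, u, v):
    """JA score (assumed meaning of 'JA'): shared neighbours over all neighbours."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


# Toy graph (illustrative only): a-b, a-c, b-d, c-d, d-e
toy_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
toy_adj = build_adjacency(toy_edges)
cn_score = common_neighbors(toy_adj, "a", "d")  # a and d share neighbours b and c
```

A higher score for an unconnected pair is read as a higher likelihood that a link between the two nodes exists but has not yet been observed.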
Affiliation(s)
- Wei Wang
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China.,Big Data Engineering Laboratory for Teaching Resources & assessment of Education Quality, Henan Province, Xinxiang, China.
| | - Hehe Lv
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China
| | - Yuan Zhao
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, 453007, China
| |
|
223
|
Jeon M, Park D, Lee J, Jeon H, Ko M, Kim S, Choi Y, Tan AC, Kang J. ReSimNet: drug response similarity prediction using Siamese neural networks. Bioinformatics 2020; 35:5249-5256. [PMID: 31116384 DOI: 10.1093/bioinformatics/btz411] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 07/10/2018] [Revised: 04/02/2019] [Accepted: 05/16/2019] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Traditional drug discovery approaches identify a target for a disease and find a compound that binds to the target. In this approach, structures of compounds are considered the most important features because it is assumed that similar structures will bind to the same target. Therefore, structural analogs of the drugs that bind to the target are selected as drug candidates. However, compounds that are not structural analogs may still achieve the desired response. A new drug discovery method based on drug response, which can complement the structure-based methods, is needed. RESULTS We implemented a Siamese neural network called ReSimNet that takes as input two chemical compounds and predicts the CMap score of the two compounds, which we use to measure the transcriptional response similarity of the two compounds. ReSimNet learns the embedding vector of a chemical compound in a transcriptional response space. ReSimNet is trained to minimize the difference between the cosine similarity of the embedding vectors of the two compounds and the CMap score of the two compounds. ReSimNet can find pairs of compounds that are similar in response even though they may have dissimilar structures. In our quantitative evaluation, ReSimNet outperformed the baseline machine learning models. The ReSimNet ensemble model achieves a Pearson correlation of 0.518 and a precision@1% of 0.989. In addition, in the qualitative analysis, we tested ReSimNet on the ZINC15 database and showed that ReSimNet successfully identifies chemical compounds that are relevant to a prototype drug whose mechanism of action is known. AVAILABILITY AND IMPLEMENTATION The source code and the pre-trained weights of ReSimNet are available at https://github.com/dmis-lab/ReSimNet. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
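The training objective described in the abstract above, matching the cosine similarity of two compound embeddings to their CMap score, can be sketched for a single pair as follows. The squared-error form of the loss, the function names, and the toy vectors are assumptions for illustration; the abstract only says that the difference is minimized, and the actual implementation is in the repository linked above.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_loss(emb_a, emb_b, target_score):
    """Squared difference between embedding similarity and the target score.

    The squared-error choice is an assumption; the embeddings would come from
    the shared (Siamese) encoder applied to each compound.
    """
    return (cosine_similarity(emb_a, emb_b) - target_score) ** 2


# Hypothetical 2-d embeddings: orthogonal vectors have similarity 0,
# so a target score of 0.5 gives a loss of 0.25.
example_loss = pair_loss([1.0, 0.0], [0.0, 1.0], 0.5)
```

Because both compounds pass through the same encoder, lowering this loss over many pairs pushes compounds with similar transcriptional responses toward similar embedding directions, regardless of structural similarity.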
Affiliation(s)
- Minji Jeon
- Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea
| | - Donghyeon Park
- Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea
| | - Jinhyuk Lee
- Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea
| | - Hwisang Jeon
- Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul 02841, South Korea
| | - Miyoung Ko
- Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea
| | - Sunkyu Kim
- Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea
| | - Yonghwa Choi
- Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea
| | - Aik-Choon Tan
- Division of Medical Oncology, Department of Medicine, Translational Bioinformatics and Cancer Systems Biology Laboratory, University of Colorado Anschutz Medical Campus, Aurora, CO 12801, USA
| | - Jaewoo Kang
- Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea.,Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul 02841, South Korea
| |
|
224
|
Eslami Manoochehri H, Nourani M. Drug-target interaction prediction using semi-bipartite graph model and deep learning. BMC Bioinformatics 2020; 21:248. [PMID: 32631230 PMCID: PMC7336396 DOI: 10.1186/s12859-020-3518-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Identifying drug-target interaction is a key element in drug discovery. In silico prediction of drug-target interaction can speed up the process of identifying unknown interactions between drugs and target proteins. In recent studies, handcrafted features, similarity metrics and machine learning methods have been proposed for predicting drug-target interactions. However, these methods cannot fully learn the underlying relations between drugs and targets. In this paper, we propose a new framework for drug-target interaction prediction that learns latent features from the drug-target interaction network. RESULTS We present a framework to utilize the network topology and identify interacting and non-interacting drug-target pairs. We model the problem as a semi-bipartite graph in which we are able to use drug-drug and protein-protein similarity in a drug-protein network. We then use a graph labeling method for vertex ordering in our graph embedding process. Finally, we employ a deep neural network to learn the complex pattern of interacting pairs from embedded graphs. We show our approach is able to learn sophisticated drug-target topological features and outperforms other state-of-the-art approaches. CONCLUSIONS The proposed learning model on the semi-bipartite graph model can integrate drug-drug and protein-protein similarities, which are semantically different from drug-protein information in a drug-target interaction network. We show our model can determine interaction likelihood for each drug-target pair and outperform other heuristics.
Affiliation(s)
- Hafez Eslami Manoochehri
- Department of Electrical and Computer Engineering, The University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX, 75080, USA
| | - Mehrdad Nourani
- Department of Electrical and Computer Engineering, The University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX, 75080, USA.
| |
|
225
|
Liu Z, Chen Q, Lan W, Liang J, Chen YPP, Chen B. A Survey of Network Embedding for Drug Analysis and Prediction. Curr Protein Pept Sci 2020; 22:CPPS-EPUB-107859. [PMID: 32614745 DOI: 10.2174/1389203721666200702145701] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/01/2020] [Revised: 04/05/2020] [Accepted: 05/21/2020] [Indexed: 11/22/2022]
Abstract
Traditional network-based computational methods have shown good results in drug analysis and prediction. However, these methods are time consuming and lack universality, and it is difficult to exploit the auxiliary information of nodes and edges. Network embedding provides a promising way for alleviating the above problems by transforming network into a low-dimensional space while preserving network structure and auxiliary information. This thus facilitates the application of machine learning algorithms for subsequent processing. Network embedding has been introduced into drug analysis and prediction in the last few years, and has shown superior performance over traditional methods. However, there is no systematic review of this issue. This article offers a comprehensive survey of the primary network embedding methods and their applications in drug analysis and prediction. The network embedding technologies applied in homogeneous network and heterogeneous network are investigated and compared, including matrix decomposition, random walk, and deep learning. Especially, the Graph neural network (GNN) methods in deep learning are highlighted. Further, the applications of network embedding in drug similarity estimation, drug-target interaction prediction, adverse drug reactions prediction, protein function and therapeutic peptides prediction are discussed. Several future potential research directions are also discussed.
Affiliation(s)
- Zhixian Liu
- School of Medical, Guangxi University, Nanning, China
| | - Qingfeng Chen
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
| | - Wei Lan
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
| | - Jiahai Liang
- School of Electronics and Information Engineering, Beibu Gulf University, Qinzhou, China
| | - Yi-Ping Phoebe Chen
- Department of Computer Science and Information Technology, La Trobe University, Melbourne, Australia
| | - Baoshan Chen
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangxi University, Nanning, China
| |
|
226
|
Gong J, Chen Y, Pu F, Sun P, He F, Zhang L, Li Y, Ma Z, Wang H. Understanding Membrane Protein Drug Targets in Computational Perspective. Curr Drug Targets 2020; 20:551-564. [PMID: 30516106 DOI: 10.2174/1389450120666181204164721] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 07/31/2018] [Revised: 09/03/2018] [Accepted: 09/04/2018] [Indexed: 01/16/2023]
Abstract
Membrane proteins play crucial physiological roles in vivo and are the major category of drug targets for pharmaceuticals. Research on membrane proteins is a significant part of drug discovery. The biological process is a cycled network, and the membrane protein is a vital hub in the network since most drugs achieve their therapeutic effect by interacting with membrane proteins. In this review, typical membrane protein targets are described, including GPCRs, transporters and ion channels. We also summarize network servers and databases that provide drug and drug-target information and related data. Furthermore, we chiefly introduce the development and practice of modern medicines, in particular a series of state-of-the-art computational models for the prediction of drug-target interaction, covering network-based and machine-learning-based approaches, as well as current achievements. Finally, we discuss the prospective orientation of drug repurposing and drug discovery and propose some improved frameworks for bioactivity data, new or improved prediction approaches, and alternative approaches to understanding drug bioactivity and the associated biological processes.
Affiliation(s)
- Jianting Gong
- School of Information Science and Technology, Northeast Normal University, Changchun, China.,Institution of Computational Biology, Northeast Normal University, Changchun, China
| | - Yongbing Chen
- School of Information Science and Technology, Northeast Normal University, Changchun, China.,Institution of Computational Biology, Northeast Normal University, Changchun, China
| | - Feng Pu
- School of Information Science and Technology, Northeast Normal University, Changchun, China.,Institution of Computational Biology, Northeast Normal University, Changchun, China
| | - Pingping Sun
- School of Information Science and Technology, Northeast Normal University, Changchun, China.,Institution of Computational Biology, Northeast Normal University, Changchun, China
| | - Fei He
- School of Information Science and Technology, Northeast Normal University, Changchun, China.,Institution of Computational Biology, Northeast Normal University, Changchun, China
| | - Li Zhang
- School of Computer Science and Engineering, Changchun University of Technology, Changchun, China
| | - Yanwen Li
- School of Information Science and Technology, Northeast Normal University, Changchun, China.,Institution of Computational Biology, Northeast Normal University, Changchun, China
| | - Zhiqiang Ma
- School of Information Science and Technology, Northeast Normal University, Changchun, China.,Institution of Computational Biology, Northeast Normal University, Changchun, China
| | - Han Wang
- School of Information Science and Technology, Northeast Normal University, Changchun, China.,Institution of Computational Biology, Northeast Normal University, Changchun, China
| |
|
227
|
Jiang M, Li Z, Zhang S, Wang S, Wang X, Yuan Q, Wei Z. Drug-target affinity prediction using graph neural network and contact maps. RSC Adv 2020; 10:20701-20712. [PMID: 35517730 PMCID: PMC9054320 DOI: 10.1039/d0ra02297g] [Citation(s) in RCA: 144] [Impact Index Per Article: 28.8] [Received: 03/11/2020] [Accepted: 05/07/2020] [Indexed: 02/01/2023] Open
Abstract
Computer-aided drug design uses high-performance computers to simulate the tasks in drug design, which is a promising research area. Drug-target affinity (DTA) prediction is the most important step of computer-aided drug design, which could speed up drug development and reduce resource consumption. With the development of deep learning, the introduction of deep learning to DTA prediction and improving the accuracy have become a focus of research. In this paper, utilizing the structural information of molecules and proteins, two graphs of drug molecules and proteins are built up respectively. Graph neural networks are introduced to obtain their representations, and a method called DGraphDTA is proposed for DTA prediction. Specifically, the protein graph is constructed based on the contact map output from the prediction method, which could predict the structural characteristics of the protein according to its sequence. It can be seen from the test of various metrics on benchmark datasets that the method proposed in this paper has strong robustness and generalizability.
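The step of turning a predicted contact map into a protein graph, as described in the abstract above, can be sketched as thresholding a residue-by-residue contact probability matrix. The 0.5 threshold, the function name, and the toy matrix are assumptions for illustration; the abstract does not specify how DGraphDTA converts contact predictions into edges.

```python
def contact_map_to_edges(contact_probs, threshold=0.5):
    """Return undirected edges (i, j), i < j, for residue pairs whose
    predicted contact probability is at least `threshold`.

    contact_probs is a square, symmetric list-of-lists matrix
    (hypothetical input; threshold value is an assumption).
    """
    n = len(contact_probs)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if contact_probs[i][j] >= threshold:
                edges.append((i, j))
    return edges


# Toy contact probabilities for a three-residue sequence (illustrative only).
toy_probs = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.6],
    [0.1, 0.6, 1.0],
]
toy_edges = contact_map_to_edges(toy_probs)  # → [(0, 1), (1, 2)]
```

Each residue becomes a node and each retained pair becomes an edge, giving a graph that a graph neural network can consume alongside the molecular graph of the drug.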
Affiliation(s)
- Mingjian Jiang
- Department of Computer Science and Technology, Ocean University of China, China
| | - Zhen Li
- Department of Computer Science and Technology, Ocean University of China, China
| | - Shugang Zhang
- Department of Computer Science and Technology, Ocean University of China, China
| | - Shuang Wang
- Department of Computer Science and Technology, Ocean University of China, China
| | - Xiaofeng Wang
- Department of Computer Science and Technology, Ocean University of China, China
| | - Qing Yuan
- Department of Computer Science and Technology, Ocean University of China, China
| | - Zhiqiang Wei
- Department of Computer Science and Technology, Ocean University of China, China
| |
|
228
|
Russell LE, Schwarz UI. Variant discovery using next-generation sequencing and its future role in pharmacogenetics. Pharmacogenomics 2020; 21:471-486. [DOI: 10.2217/pgs-2019-0190] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open
Abstract
Next-generation sequencing (NGS) has enabled the discovery of a multitude of novel and mostly rare variants in pharmacogenes that may alter a patient’s therapeutic response to drugs. In addition to single nucleotide variants, structural variation affecting the number of copies of whole genes or parts of genes can be detected. While current guidelines concerning clinical implementation mostly act upon well-documented, common single nucleotide variants to guide dosing or drug selection, in silico and large-scale functional assessment of rare variant effects on protein function are at the forefront of pharmacogenetic research to facilitate their clinical integration. Here, we discuss the role of NGS in variant discovery, paving the way for more comprehensive genotype-guided pharmacotherapy that can translate to improved clinical care.
Affiliation(s)
- Laura E Russell
- Department of Physiology & Pharmacology, Western University, Medical Sciences Building, London, ON, N6A 5C1, Canada
| | - Ute I Schwarz
- Department of Physiology & Pharmacology, Western University, Medical Sciences Building, London, ON, N6A 5C1, Canada
- Division of Clinical Pharmacology, Department of Medicine, Western University, London Health Sciences Centre – University Hospital, 339 Windermere Road, London, ON, N6A 5A5, Canada
| |
|
229
|
Liu H, Zhang W, Song Y, Deng L, Zhou S. HNet-DNN: Inferring New Drug-Disease Associations with Deep Neural Network Based on Heterogeneous Network Features. J Chem Inf Model 2020; 60:2367-2376. [PMID: 32118415 DOI: 10.1021/acs.jcim.9b01008] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
Drug research and development is a time-consuming and high-cost task, pressing an urgent demand to identify novel indications of approved drugs, referred to as drug repositioning, which provides an economical and efficient way for drug discovery. With increasing volumes of large-scale chemical, genomic, and pharmacological data sets generated by the high-throughput technique, it is crucial to develop systematic and rational computational approaches to identify new indications of approved drugs. In this paper, we introduce HNet-DNN, which utilizes a deep neural network (DNN), to predict new drug-disease associations based on the features extracted from the drug-disease heterogeneous network. Instead of the straightforward concatenation of chemical and phenotypic features as the input of DNN, we used these raw features of drugs and diseases to construct a drug-drug similarity network and a disease-disease similarity network, and then built a drug-disease heterogeneous network by integrating known drug-disease associations. Subsequently, we extracted topological features for drug-disease associations from the heterogeneous network and used them to train a DNN model. Our intensive performance evaluations demonstrated that HNet-DNN effectively exploits the features of the heterogeneous network to boost the predictive performance of drug-disease associations. Compared with a couple of typical classifiers and competitive approaches, our method not only achieved state-of-the-art performance but also effectively alleviated the overfitting problem. Moreover, we ran HNet-DNN to predict new drug-disease associations and carried out case studies to verify the effectiveness of our method.
Affiliation(s)
- Hui Liu
- Aliyun School of Big Data, Changzhou University, 213164 Changzhou, China
| | - Wenhao Zhang
- Aliyun School of Big Data, Changzhou University, 213164 Changzhou, China
| | - Yinglong Song
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, 200433 Shanghai, China
| | - Lei Deng
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Shuigeng Zhou
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, 200433 Shanghai, China
| |
|
230
|
Kaushik AC, Mehmood A, Dai X, Wei DQ. A comparative chemogenic analysis for predicting Drug-Target Pair via Machine Learning Approaches. Sci Rep 2020; 10:6870. [PMID: 32322011 PMCID: PMC7176722 DOI: 10.1038/s41598-020-63842-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2020] [Accepted: 04/04/2020] [Indexed: 12/26/2022] Open
Abstract
Computational prediction of DTIs has become an indispensable part of the drug discovery process. It narrows the search space for interactions by proposing candidate interactions for validation in wet-lab experiments, which are known to be expensive and time-consuming. Chemogenomics is an emerging research area focused on the systematic examination of the biological impact of a broad series of low-molecular-weight ligands on a broad array of macromolecular targets. Additionally, the complexity of the algorithms is increasing over time, which may soon bring big data technologies such as Spark into this field. In the presented work, we intend to offer a comprehensive overview and realistic evaluation of computational Drug Target Interaction prediction approaches, to serve as a guide and reference for researchers working in a similar direction. Specifically, we first explain the data used in computational Drug Target Interaction prediction. We then categorize and explain the best and most recent techniques for the prediction of DTIs. Then, a realistic assessment is carried out to show the prediction performance of several representative approaches in various situations. Finally, we highlight possible opportunities for further improvement of Drug Target Interaction prediction performance, along with related research objectives.
Affiliation(s)
- Aman Chandra Kaushik
- Wuxi School of Medicine, Jiangnan University, Wuxi, China.
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China.
| | - Aamir Mehmood
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Xiaofeng Dai
- Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China.
| |
|
231
|
Rayhan F, Ahmed S, Mousavian Z, Farid DM, Shatabda S. FRnet-DTI: Deep convolutional neural network for drug-target interaction prediction. Heliyon 2020; 6:e03444. [PMID: 32154410 PMCID: PMC7052404 DOI: 10.1016/j.heliyon.2020.e03444] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2018] [Revised: 06/16/2019] [Accepted: 02/14/2020] [Indexed: 01/09/2023] Open
Abstract
The task of drug-target interaction prediction holds significant importance in pharmacology and therapeutic drug design. In this paper, we present FRnet-DTI, an auto-encoder based feature manipulation and a convolutional neural network based classifier for drug target interaction prediction. Two convolutional neural networks are proposed: FRnet-Encode and FRnet-Predict. Here, one model is used for feature manipulation and the other one for classification. Using the first method, FRnet-Encode, we generate 4096 features for each of the instances in each of the datasets and use the second method, FRnet-Predict, to identify interaction probability employing those features. We have tested our method on four gold standard datasets extensively used by other researchers. Experimental results show that our method significantly improves over the state-of-the-art method on three out of four drug-target interaction gold standard datasets on both the area under the Receiver Operating Characteristic curve (auROC) and the area under the Precision Recall curve (auPR) metrics. We also introduce twenty new potential drug-target pairs for interaction based on high prediction scores. The source codes and implementation details of our methods are available from https://github.com/farshidrayhanuiu/FRnet-DTI/ and also readily available to use as a web application from http://farshidrayhan.pythonanywhere.com/FRnet-DTI/.
Affiliation(s)
- Farshid Rayhan
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
| | - Sajid Ahmed
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
| | - Zaynab Mousavian
- School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran
| | - Dewan Md Farid
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
| | - Swakkhar Shatabda
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
| |
|
232
|
Zheng S, Li Y, Chen S, Xu J, Yang Y. Predicting drug–protein interaction using quasi-visual question answering system. Nat Mach Intell 2020. [DOI: 10.1038/s42256-020-0152-y] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]
|
233
|
Ellingson SR, Davis B, Allen J. Machine learning and ligand binding predictions: A review of data, methods, and obstacles. Biochim Biophys Acta Gen Subj 2020; 1864:129545. [PMID: 32057823 DOI: 10.1016/j.bbagen.2020.129545] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2019] [Revised: 12/21/2019] [Accepted: 01/30/2020] [Indexed: 10/25/2022]
Abstract
Computational prediction of ligand binding is a difficult problem, with more accurate methods being extremely computationally expensive. The use of machine learning for drug binding predictions could possibly leverage the use of biomedical big data in exchange for time-intensive simulations. This paper reviews current trends in the use of machine learning for drug binding predictions, data sources to develop machine learning algorithms, and potential problems that may lead to overfitting and ungeneralizable models. A few popular datasets that can be used to develop virtual high-throughput screening models are characterized using spatial statistics to quantify potential biases. We can see from evaluating some common benchmarks that good performance correlates with models with high predicted bias scores, and that models with low bias scores do not have much predictive power. A better understanding of the limits of available data sources and how to fix them will lead to more generalizable models that will lead to novel drug discovery.
Affiliation(s)
- Sally R Ellingson
- College of Medicine, Division of Biomedical Informatics, University of Kentucky, Lexington, KY, United States of America; Markey Cancer Center, Lexington, KY, United States of America.
| | - Brian Davis
- Markey Cancer Center, Lexington, KY, United States of America
| | - Jonathan Allen
- Lawrence Livermore National Laboratory, Livermore, CA, United States of America
| |
|
234
|
Luo H, Li M, Yang M, Wu FX, Li Y, Wang J. Biomedical data and computational models for drug repositioning: a comprehensive review. Brief Bioinform 2020; 22:1604-1619. [PMID: 32043521 DOI: 10.1093/bib/bbz176] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2019] [Revised: 12/07/2019] [Accepted: 12/26/2019] [Indexed: 12/16/2022] Open
Abstract
Drug repositioning can drastically decrease the cost and duration taken by traditional drug research and development while avoiding the occurrence of unforeseen adverse events. With the rapid advancement of high-throughput technologies and the explosion of various biological data and medical data, computational drug repositioning methods have been appealing and powerful techniques to systematically identify potential drug-target interactions and drug-disease interactions. In this review, we first summarize the available biomedical data and public databases related to drugs, diseases and targets. Then, we discuss existing drug repositioning approaches and group them based on their underlying computational models consisting of classical machine learning, network propagation, matrix factorization and completion, and deep learning based models. We also comprehensively analyze common standard data sets and evaluation metrics used in drug repositioning, and give a brief comparison of various prediction methods on the gold standard data sets. Finally, we conclude our review with a brief discussion on challenges in computational drug repositioning, which includes the problem of reducing the noise and incompleteness of biomedical data, the ensemble of various computation drug repositioning methods, the importance of designing reliable negative samples selection methods, new techniques dealing with the data sparseness problem, the construction of large-scale and comprehensive benchmark data sets and the analysis and explanation of the underlying mechanisms of predicted interactions.
Affiliation(s)
- Huimin Luo
- School of Computer Science and Engineering at Central South University
| | - Min Li
- School of Computer Science and Engineering at Central South University
| | - Mengyun Yang
- School of Computer Science and Engineering at Central South University
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Science at University of Saskatchewan, Saskatoon, Canada
| | - Yaohang Li
- Department of Computer Science at Old Dominion University, Norfolk, USA
| | - Jianxin Wang
- School of Computer Science and Engineering at Central South University
| |
|
235
|
Pliakos K, Vens C. Drug-target interaction prediction with tree-ensemble learning and output space reconstruction. BMC Bioinformatics 2020; 21:49. [PMID: 32033537 PMCID: PMC7006075 DOI: 10.1186/s12859-020-3379-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2019] [Accepted: 01/21/2020] [Indexed: 12/21/2022] Open
Abstract
Background: Computational prediction of drug-target interactions (DTI) is vital for drug discovery. The experimental identification of interactions between drugs and target proteins is very onerous. Modern technologies have mitigated the problem, leveraging the development of new drugs. However, drug development remains extremely expensive and time consuming. Therefore, in silico DTI predictions based on machine learning can alleviate the burdensome task of drug development. Many machine learning approaches have been proposed over the years for DTI prediction. Nevertheless, prediction accuracy and efficiency are persisting problems that still need to be tackled. Here, we propose a new learning method which addresses DTI prediction as a multi-output prediction task by learning ensembles of multi-output bi-clustering trees (eBICT) on reconstructed networks. In our setting, the nodes of a DTI network (drugs and proteins) are represented by features (background information). The interactions between the nodes of a DTI network are modeled as an interaction matrix and compose the output space in our problem. The proposed approach integrates background information from both drug and target protein spaces into the same global network framework. Results: We performed an empirical evaluation, comparing the proposed approach to state of the art DTI prediction methods and demonstrated the effectiveness of the proposed approach in different prediction settings. For evaluation purposes, we used several benchmark datasets that represent drug-protein networks. We show that output space reconstruction can boost the predictive performance of tree-ensemble learning methods, yielding more accurate DTI predictions. Conclusions: We proposed a new DTI prediction method where bi-clustering trees are built on reconstructed networks. Building tree-ensemble learning models with output space reconstruction leads to superior prediction results, while preserving the advantages of tree-ensembles, such as scalability, interpretability and inductive setting.
Affiliation(s)
- Konstantinos Pliakos
- KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium; ITEC, imec research group at KU Leuven, Kortrijk, Belgium
| | - Celine Vens
- KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium; ITEC, imec research group at KU Leuven, Kortrijk, Belgium
| |
|
236
|
Wan F, Zhu Y, Hu H, Dai A, Cai X, Chen L, Gong H, Xia T, Yang D, Wang MW, Zeng J. DeepCPI: A Deep Learning-based Framework for Large-scale in silico Drug Screening. Genomics Proteomics & Bioinformatics 2020; 17:478-495. [PMID: 32035227 PMCID: PMC7056933 DOI: 10.1016/j.gpb.2019.04.003] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/19/2019] [Accepted: 04/29/2019] [Indexed: 12/13/2022]
Abstract
Accurate identification of compound–protein interactions (CPIs) in silico may deepen our understanding of the underlying mechanisms of drug action and thus remarkably facilitate drug discovery and development. Conventional similarity- or docking-based computational methods for predicting CPIs rarely exploit latent features from currently available large-scale unlabeled compound and protein data and often limit their usage to relatively small-scale datasets. In the present study, we propose DeepCPI, a novel general and scalable computational framework that combines effective feature embedding (a technique of representation learning) with powerful deep learning methods to accurately predict CPIs at a large scale. DeepCPI automatically learns the implicit yet expressive low-dimensional features of compounds and proteins from a massive amount of unlabeled data. Evaluations of the measured CPIs in large-scale databases, such as ChEMBL and BindingDB, as well as of the known drug–target interactions from DrugBank, demonstrated the superior predictive performance of DeepCPI. Furthermore, several interactions among small-molecule compounds and three G protein-coupled receptor targets (glucagon-like peptide-1 receptor, glucagon receptor, and vasoactive intestinal peptide receptor) predicted using DeepCPI were experimentally validated. The present study suggests that DeepCPI is a useful and powerful tool for drug discovery and repositioning. The source code of DeepCPI can be downloaded from https://github.com/FangpingWan/DeepCPI.
Affiliation(s)
- Fangping Wan
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
| | - Yue Zhu
- The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Hailin Hu
- School of Medicine, Tsinghua University, Beijing 100084, China
| | - Antao Dai
- The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Xiaoqing Cai
- The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Ligong Chen
- School of Pharmaceutical Sciences, Tsinghua University, Beijing 100084, China
| | - Haipeng Gong
- School of Life Science, Tsinghua University, Beijing 100084, China
| | - Tian Xia
- Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Dehua Yang
- The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China.
| | - Ming-Wei Wang
- The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China; Shanghai Medical College, Fudan University, Shanghai 200032, China.
| | - Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China; MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China.
| |
|
237
|
Abstract
Background: Identifying Drug-Target Interactions (DTIs) is a major challenge for current drug discovery and drug repositioning. Compared to traditional experimental approaches, in silico methods are fast and inexpensive. With the increase in open-access experimental data, numerous computational methods have been applied to predict DTIs. Methods: In this study, we propose an end-to-end learning model of Factorization Machine and Deep Neural Network (FM-DNN), which emphasizes both low-order (first or second order) and high-order (higher than second order) feature interactions without any feature engineering other than raw features. This approach combines the power of FM and DNN learning for feature learning in a new neural network architecture. Results: The experimental DTI basic features include drug characteristics (609) and target characteristics (1819), plus drug ID and target ID, for a total of 2430. We compare 8 models, such as SVM, GBDT and WIDE-DEEP; the FM-DNN model obtains the best results, with an AUC of 0.8866 and an AUPR of 0.8281. Conclusion: Feature engineering requires expert knowledge, and it is often difficult and time-consuming to achieve good results. FM-DNN can automatically learn a low-order expression through FM and a high-order expression through DNN. The FM-DNN model has outstanding advantages over other commonly used models.
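As an illustration of the low-order (first and second order) feature interactions that the abstract attributes to the FM component, the following is a minimal sketch of the standard factorization machine score. It is not the authors' implementation: the abstract gives no architecture details, and the names `fm_score`, `fm_score_naive`, `w0`, `w` and `V` are hypothetical.

```python
def fm_score(x, w0, w, V):
    """Standard factorization-machine score (illustrative, not FM-DNN itself).

    x  : list of n feature values
    w0 : global bias
    w  : list of n first-order weights
    V  : n latent vectors of length k, one per feature
    """
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    # Second-order term, computed in O(n*k) with the identity
    #   sum_{i<j} <v_i, v_j> x_i x_j
    #     = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_score_naive(x, w0, w, V):
    """Same score via the explicit double loop over feature pairs."""
    n = len(x)
    total = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


# Toy input: three features, latent dimension two (made-up numbers).
x = [1.0, 0.0, 2.0]
w = [0.1, 0.2, 0.3]
V = [[1.0, 2.0], [0.5, 0.5], [3.0, 4.0]]
print(fm_score(x, 0.5, w, V), fm_score_naive(x, 0.5, w, V))
```

Both functions compute the same quantity; the first uses the usual linear-time rewriting of the pairwise sum, which matters when the input has thousands of sparse features, as with the 2430 raw features mentioned in the abstract.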
Affiliation(s)
- Jihong Wang
- School of Data and Computer Science, Sun Yat-Sen University, No.132 Waihuan East Road, 510000 Guangzhou, China
| | - Hao Wang
- School of Data and Computer Science, Sun Yat-Sen University, No.132 Waihuan East Road, 510000 Guangzhou, China
| | - Xiaodan Wang
- School of Pharmaceutical Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, No. 9- 13 Wuguishan Avenue of Life Street, 528458, Zhongshan, China
| | - Huiyou Chang
- School of Data and Computer Science, Sun Yat-Sen University, No.132 Waihuan East Road, 510000 Guangzhou, China
| |
|
238
|
Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2020; 22:247-269. [PMID: 31950972 PMCID: PMC7820849 DOI: 10.1093/bib/bbz157] [Citation(s) in RCA: 185] [Impact Index Per Article: 37.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 12/12/2022] Open
Abstract
The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. There is a need to develop novel and efficient prediction approaches in order to avoid determining drug–target interactions (DTIs) solely through costly, laborious and not-always-deterministic experiments. These approaches should be capable of identifying the potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction followed by a comprehensive catalog consisting of machine learning methods and databases, which have been proposed and utilized to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in prediction of DTI using machine learning approaches are highlighted and we conclude by shedding some light on important future research directions.
Affiliation(s)
- Maryam Bagherian
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Elyas Sabeti
- Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Kai Wang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Maureen A Sartor
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
| | | | - Kayvan Najarian
- Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, MI, 48109, USA
| |
|
239
|
Hu S, Zhang C, Chen P, Gu P, Zhang J, Wang B. Predicting drug-target interactions from drug structure and protein sequence using novel convolutional neural networks. BMC Bioinformatics 2019; 20:689. [PMID: 31874614 PMCID: PMC6929541 DOI: 10.1186/s12859-019-3263-x] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Background: Accurate identification of potential interactions between drugs and protein targets is a critical step to accelerate drug discovery. Although many related experimental studies have been carried out in the past decades, detecting drug-target interactions (DTIs) remains extremely resource-intensive and time-consuming. Therefore, many computational approaches have been developed for predicting drug-target associations on a large scale. Results: In this paper, we proposed a deep learning-based method to predict DTIs using only the information of drug structures and protein sequences. The final results showed that our method can achieve good performance with accuracies of up to 92.0%, 90.0%, 92.0% and 90.7% for the target families of enzymes, ion channels, GPCRs and nuclear receptors of our created dataset, respectively. Another dataset derived from DrugBank was used to further assess the generalization of the model, which yielded an accuracy of 0.9015 and an AUC value of 0.9557. Conclusion: Our model shows improved performance in comparison with other state-of-the-art computational methods on the common benchmark datasets. Experimental results demonstrated that our model successfully extracted more nuanced yet useful features, and therefore can be used as a practical tool to discover new drugs. Availability: http://deeplearner.ahu.edu.cn/web/CnnDTI.htm.
Affiliation(s)
- ShanShan Hu
- School of Computer Science and Technology, Anhui University, Jiulong Road, Hefei, 230601, China
| | - Chenglin Zhang
- Institutes of Physical Science and Information Technology, Anhui University, Jiulong Road, Hefei, 230601, China
| | - Peng Chen
- School of Computer Science and Technology, Anhui University, Jiulong Road, Hefei, 230601, China; Institutes of Physical Science and Information Technology, Anhui University, Jiulong Road, Hefei, 230601, China; Cadre's Ward (South District), The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230001, China.
| | - Pengying Gu
- Cadre's Ward (South District), The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230001, China.
| | - Jun Zhang
- School of Electrical and Information Engineering, Anhui University, Hefei, 230601, China
| | - Bing Wang
- School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan, 243032, China
| |
|
240
|
Chu Y, Kaushik AC, Wang X, Wang W, Zhang Y, Shan X, Salahub DR, Xiong Y, Wei DQ. DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform 2019; 22:451-462. [PMID: 31885041 DOI: 10.1093/bib/bbz152] [Citation(s) in RCA: 101] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2019] [Revised: 11/01/2019] [Accepted: 11/04/2019] [Indexed: 12/18/2022] Open
Abstract
Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, the performance of current DTI prediction approaches suffers from low precision and a high false-positive rate. In this study, we aim to develop a novel DTI prediction method for improving the prediction performance based on a cascade deep forest (CDF) model, named DTI-CDF, with multiple similarity-based features between drugs and similarity-based features between target proteins extracted from the heterogeneous graph, which contains known DTIs. In the experiments, we built five replicates of 10-fold cross-validation under three different experimental settings of data sets, namely, the corresponding DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) are missing from the training sets but present in the test sets. The experimental results demonstrate that our proposed approach DTI-CDF achieves a significantly higher performance than that of the traditional ensemble learning-based methods such as random forest and XGBoost, deep neural network, and the state-of-the-art methods such as DDR. Furthermore, there are 1352 newly predicted DTIs which are confirmed by the KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.
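The three evaluation settings named in the abstract (SD, ST, SP) differ only in what is withheld from training: whole drugs, whole targets, or individual drug-target pairs. The sketch below illustrates that difference with a single hold-out split. It is an assumption-laden illustration, not the authors' protocol (they report five replicates of 10-fold cross-validation), and the function name `split_pairs` and its parameters are hypothetical.

```python
import random


def split_pairs(n_drugs, n_targets, setting, test_frac=0.1, seed=0):
    """Split all (drug, target) index pairs into train and test lists.

    setting "SP": random pairs are held out.
    setting "SD": every pair of some held-out drugs is held out.
    setting "ST": every pair of some held-out targets is held out.
    """
    rng = random.Random(seed)
    pairs = [(d, t) for d in range(n_drugs) for t in range(n_targets)]
    if setting == "SP":
        rng.shuffle(pairs)
        cut = int(len(pairs) * test_frac)
        return pairs[cut:], pairs[:cut]
    if setting in ("SD", "ST"):
        axis = 0 if setting == "SD" else 1          # 0 = drug, 1 = target
        size = n_drugs if setting == "SD" else n_targets
        ids = list(range(size))
        rng.shuffle(ids)
        held = set(ids[:max(1, int(size * test_frac))])
        train = [p for p in pairs if p[axis] not in held]
        test = [p for p in pairs if p[axis] in held]
        return train, test
    raise ValueError("setting must be 'SP', 'SD' or 'ST'")


train, test = split_pairs(10, 8, "SD")
# Under SD, no drug in the test set ever appears in training.
print({d for d, _ in train} & {d for d, _ in test})
```

Under SD and ST the model must generalize to entities it has never seen, which is typically harder than SP, where both the drug and the target of a test pair may appear in other training pairs.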
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiangeng Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Wei Wang
- Mathematical Sciences, Shanghai Jiao Tong University
- Yufang Zhang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
|
241
|
Wang R, Li S, Cheng L, Wong MH, Leung KS. Predicting associations among drugs, targets and diseases by tensor decomposition for drug repositioning. BMC Bioinformatics 2019; 20:628. [PMID: 31839008 PMCID: PMC6912989 DOI: 10.1186/s12859-019-3283-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Development of new drugs is a time-consuming and costly process, and the cost has continued to rise in recent years. However, the number of drugs approved by the FDA every year per dollar spent on development is declining. Drug repositioning, which aims to find new uses for existing drugs, attracts the attention of pharmaceutical researchers due to its high efficiency. A variety of computational methods for drug repositioning have been proposed based on machine learning approaches, network-based approaches, matrix decomposition approaches, etc. RESULTS: We propose a novel computational method for drug repositioning. We construct and decompose three-dimensional tensors, which consist of the associations among drugs, targets and diseases, to derive latent factors reflecting the functional patterns of the three kinds of entities. The proposed method outperforms several baseline methods in recovering missing associations. Most of the top predictions are validated by literature search and computational docking. Latent factors are used to cluster the drugs, targets and diseases into functional groups. Topological Data Analysis (TDA) is applied to investigate the properties of the clusters. We find that the latent factors are able to capture the functional patterns and underlying molecular mechanisms of drugs, targets and diseases. In addition, we focus on repurposing drugs for cancer and discover not only new therapeutic uses but also adverse effects of the drugs. In the in-depth study of associations among the clusters of drugs, targets and cancer subtypes, we find that strong associations exist between particular clusters. CONCLUSIONS: The proposed method is able to recover missing associations, discover new predictions and uncover functional clusters of drugs, targets and diseases. The clustering of drugs, targets and diseases, as well as the associations among the clusters, provides a new guiding framework for drug repositioning.
Affiliation(s)
- Ran Wang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Shuai Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Lixin Cheng
- Department of Critical Care Medicine, Shenzhen People’s Hospital, The Second Clinical Medicine College of Ji’nan University, Shenzhen, China
- Man Hon Wong
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Kwong Sak Leung
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
|
242
|
Zhang W, Lin W, Zhang D, Wang S, Shi J, Niu Y. Recent Advances in the Machine Learning-Based Drug-Target Interaction Prediction. Curr Drug Metab 2019; 20:194-202. [PMID: 30129407 DOI: 10.2174/1389200219666180821094047] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 10/16/2017] [Revised: 01/18/2018] [Accepted: 03/19/2018] [Indexed: 12/28/2022]
Abstract
BACKGROUND: The identification of drug-target interactions is a crucial issue in drug discovery. In recent years, researchers have devoted great effort to drug-target interaction prediction and have developed databases, software and computational methods. RESULTS: In this paper, we review the recent advances in machine learning-based drug-target interaction prediction. First, we briefly introduce the datasets and data, and summarize features for drugs and targets which can be extracted from different data. Since drug-drug similarity and target-target similarity are important for many machine learning prediction models, we introduce how to calculate similarities based on data or features. Different machine learning-based drug-target interaction prediction methods can be proposed by using different features or information. Thus, we summarize, analyze and compare different machine learning-based prediction methods. CONCLUSION: This study provides a guide to the development of computational methods for drug-target interaction prediction.
Affiliation(s)
- Wen Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, China
- Weiran Lin
- School of Computer Science, Wuhan University, Wuhan 430072, China
- Ding Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, China
- Siman Wang
- School of Computer Science, Wuhan University, Wuhan 430072, China
- Jingwen Shi
- School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
- Yanqing Niu
- School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan 430074, China
|
243
|
Chaoming L. Prediction and analysis of sphere motion trajectory based on deep learning algorithm optimization. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2019. [DOI: 10.3233/jifs-179209] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/16/2022]
Affiliation(s)
- Liang Chaoming
- Guangzhou Institute of Physical Education, Guangzhou, China
|
244
|
Pichler M, Boreux V, Klein A, Schleuning M, Hartig F. Machine learning algorithms to infer trait‐matching and predict species interactions in ecological networks. Methods Ecol Evol 2019. [DOI: 10.1111/2041-210x.13329] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Indexed: 12/11/2022]
Affiliation(s)
- Virginie Boreux
- Nature Conservation and Landscape Ecology, University of Freiburg, Freiburg, Germany
- Matthias Schleuning
- Senckenberg Biodiversity and Climate Research Centre (SBiK‐F), Frankfurt (Main), Germany
- Florian Hartig
- Theoretical Ecology, University of Regensburg, Regensburg, Germany
|
245
|
Mahmud SMH, Chen W, Meng H, Jahan H, Liu Y, Hasan SMM. Prediction of drug-target interaction based on protein features using undersampling and feature selection techniques with boosting. Anal Biochem 2019; 589:113507. [PMID: 31734254 DOI: 10.1016/j.ab.2019.113507] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 08/21/2019] [Revised: 11/05/2019] [Accepted: 11/08/2019] [Indexed: 12/29/2022]
Abstract
Accurate identification of drug-target interaction (DTI) is a crucial and challenging task in the drug discovery process, with enormous benefit to patients and pharmaceutical companies. Traditional wet-lab experiments for DTI identification are expensive, time-consuming, and labor-intensive. Therefore, many computational techniques have been established for this purpose, although a huge number of interactions remain undiscovered. Here, we present pdti-EssB, a new computational model for identification of DTI using protein sequence and drug molecular structure. More specifically, each drug molecule is transformed into a molecular substructure fingerprint. For a protein sequence, different descriptors are utilized to represent its evolutionary, sequence, and structural information. In addition, our proposed method uses data balancing techniques to handle the imbalance problem and applies a novel feature eliminator to extract the optimal features for accurate prediction. In this paper, four classes of DTI benchmark datasets are used to construct a predictive model with XGBoost. Here, the auROC is utilized as an evaluation metric to compare the performance of the pdti-EssB method with recent methods, applying five-fold cross-validation. Finally, the experimental results indicate that our proposed method is able to outperform other approaches in predicting DTI, and introduces new drug-target interaction samples based on prediction probability scores. The pdti-EssB webserver is available online at http://pdtiessb-uestc.com/.
Affiliation(s)
- S M Hasan Mahmud
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- Wenyu Chen
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- Han Meng
- School of Political Science and Public Administration, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- Hosney Jahan
- College of Computer Science, Sichuan University, Chengdu, 610065, China.
- Yongsheng Liu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- S M Mamun Hasan
- Department of Internal Medicine, Rangpur Medical College, Rangpur, 5400, Bangladesh.
|
246
|
Lipinski CF, Maltarollo VG, Oliveira PR, da Silva ABF, Honorio KM. Advances and Perspectives in Applying Deep Learning for Drug Design and Discovery. Front Robot AI 2019; 6:108. [PMID: 33501123 PMCID: PMC7805776 DOI: 10.3389/frobt.2019.00108] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 05/06/2019] [Accepted: 10/11/2019] [Indexed: 01/10/2023] Open
Abstract
Discovering (or planning) a new drug candidate involves many parameters, which makes the process slow, costly, and, in some cases, prone to failure at the end. In the last decades, we have witnessed a revolution in the computational area (hardware, software, large-scale computing, etc.), as well as an explosion in data generation (big data), which raises the need for more sophisticated algorithms to analyze this myriad of data. In this scenario, we can highlight the potential of artificial intelligence (AI) or computational intelligence (CI) as a powerful tool to analyze medicinal chemistry data. According to IEEE, computational intelligence involves the theory, the design, the application, and the development of biologically and linguistically motivated computational paradigms. In addition, CI encompasses three main methodologies: neural networks (NN), fuzzy systems, and evolutionary computation. In particular, artificial neural networks have been successfully applied in medicinal chemistry studies. A branch of the NN area that has attracted a lot of attention is deep learning (DL), due to its generalization power and ability to extract features from data. Therefore, in this mini-review we will briefly outline the present scope, advances, and challenges related to the use of DL in drug design and discovery, describing successful studies involving quantitative structure-activity relationships (QSAR) and virtual screening (VS) of databases containing thousands of compounds.
Affiliation(s)
- Celio F Lipinski
- Departamento de Química e Física Molecular, Instituto de Química de São Carlos, Universidade de São Paulo, São Carlos, Brazil
- Patricia R Oliveira
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, São Paulo, Brazil
- Alberico B F da Silva
- Departamento de Química e Física Molecular, Instituto de Química de São Carlos, Universidade de São Paulo, São Carlos, Brazil
- Kathia Maria Honorio
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, São Paulo, Brazil; Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, Santo André, Brazil
|
247
|
Zheng L, Fan J, Mu Y. OnionNet: a Multiple-Layer Intermolecular-Contact-Based Convolutional Neural Network for Protein-Ligand Binding Affinity Prediction. ACS OMEGA 2019; 4:15956-15965. [PMID: 31592466 PMCID: PMC6776976 DOI: 10.1021/acsomega.9b01997] [Citation(s) in RCA: 171] [Impact Index Per Article: 28.5] [Received: 07/01/2019] [Accepted: 09/06/2019] [Indexed: 05/12/2023]
Abstract
Computational drug discovery provides an efficient tool for helping large-scale lead molecule screening. One of the major tasks of lead discovery is identifying molecules with promising binding affinities toward a target, a protein in general. The accuracies of current scoring functions that are used to predict the binding affinity are not satisfactory enough. Thus, machine learning or deep learning based methods have been developed recently to improve the scoring functions. In this study, a deep convolutional neural network model (called OnionNet) is introduced; its features are based on rotation-free element-pair-specific contacts between ligands and protein atoms, and the contacts are further grouped into different distance ranges to cover both the local and nonlocal interaction information between the ligand and the protein. The prediction power of the model is evaluated and compared with other scoring functions using the comparative assessment of scoring functions (CASF-2013) benchmark and the v2016 core set of the PDBbind database. The robustness of the model is further explored by predicting the binding affinities of the complexes generated from docking simulations instead of experimentally determined PDB structures.
Affiliation(s)
- Liangzhen Zheng
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
- Jingrong Fan
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
- Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
|
248
|
Rifaioglu AS, Atas H, Martin MJ, Cetin-Atalay R, Atalay V, Doğan T. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief Bioinform 2019; 20:1878-1912. [PMID: 30084866 PMCID: PMC6917215 DOI: 10.1093/bib/bby061] [Citation(s) in RCA: 249] [Impact Index Per Article: 41.5] [Received: 01/25/2018] [Revised: 05/25/2018] [Indexed: 01/16/2023] Open
Abstract
The identification of interactions between drugs/compounds and their targets is crucial for the development of new drugs. In vitro screening experiments (i.e. bioassays) are frequently used for this purpose; however, experimental approaches are insufficient to explore novel drug-target interactions, mainly because of feasibility problems, as they are labour intensive, costly and time consuming. A computational field known as 'virtual screening' (VS) has emerged in the past decades to aid experimental drug discovery studies by statistically estimating unknown bio-interactions between compounds and biological targets. These methods use the physico-chemical and structural properties of compounds and/or target proteins along with the experimentally verified bio-interaction information to generate predictive models. Lately, sophisticated machine learning techniques are applied in VS to elevate the predictive performance. The objective of this study is to examine and discuss the recent applications of machine learning techniques in VS, including deep learning, which became highly popular after giving rise to epochal developments in the fields of computer vision and natural language processing. The past 3 years have witnessed an unprecedented amount of research studies considering the application of deep learning in biomedicine, including computational drug discovery. In this review, we first describe the main instruments of VS methods, including compound and protein features (i.e. representations and descriptors), frequently used libraries and toolkits for VS, bioactivity databases and gold-standard data sets for system training and benchmarking. We subsequently review recent VS studies with a strong emphasis on deep learning applications. Finally, we discuss the present state of the field, including the current challenges and suggest future directions. 
We believe that this survey will provide insight to the researchers working in the field of computational drug discovery in terms of comprehending and developing novel bio-prediction methods.
Affiliation(s)
- Ahmet Sureyya Rifaioglu
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Department of Computer Engineering, İskenderun Technical University, Hatay, Turkey
- Heval Atas
- Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Maria Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
- Rengul Cetin-Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Volkan Atalay
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Tunca Doğan
- Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey and European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK
|
249
|
Shi C, Chen J, Kang X, Zhao G, Lao X, Zheng H. Deep Learning in the Study of Protein-Related Interactions. Protein Pept Lett 2019; 27:359-369. [PMID: 31538879 DOI: 10.2174/0929866526666190723114142] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/05/2019] [Revised: 03/13/2019] [Accepted: 04/05/2019] [Indexed: 11/22/2022]
Abstract
Protein-related interaction prediction is critical to understanding life processes, biological functions, and mechanisms of drug action. Experimental methods used to determine protein-related interactions have always been costly and inefficient. In recent years, advances in biological and medical technology have provided us with an explosion of biological and physiological data, and deep learning-based algorithms have shown great promise in extracting features and learning patterns from complex data. Deep learning is now emerging in protein research. In this review, we provide an introductory overview of deep neural network theory and its unique properties. We focus mainly on the application of this technology to protein-related interaction prediction over the past five years, including protein-protein, protein-RNA/DNA, and protein-drug interaction prediction, among others. Finally, we discuss some of the challenges that deep learning currently faces.
Affiliation(s)
- Cheng Shi
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
- Jiaxing Chen
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
- Xinyue Kang
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
- Guiling Zhao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
- Xingzhen Lao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
- Heng Zheng
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
|
250
|
Abstract
Due to the massive data sets available for drug candidates, modern drug discovery has advanced to the big data era. Central to this shift is the development of artificial intelligence approaches to implementing innovative modeling based on the dynamic, heterogeneous, and large nature of drug data sets. As a result, recently developed artificial intelligence approaches such as deep learning and relevant modeling studies provide new solutions to efficacy and safety evaluations of drug candidates based on big data modeling and analysis. The resulting models provided deep insights into the continuum from chemical structure to in vitro, in vivo, and clinical outcomes. The relevant novel data mining, curation, and management techniques provided critical support to recent modeling studies. In summary, the new advancement of artificial intelligence in the big data era has paved the road to future rational drug development and optimization, which will have a significant impact on drug discovery procedures and, eventually, public health.
Affiliation(s)
- Hao Zhu
- Department of Chemistry and Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA
|