101
Zhou M, Zheng C, Xu R. Combining phenome-driven drug-target interaction prediction with patients' electronic health records-based clinical corroboration toward drug discovery. Bioinformatics 2021; 36:i436-i444. [PMID: 32657406 PMCID: PMC7355254 DOI: 10.1093/bioinformatics/btaa451] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/20/2022] Open
Abstract
Motivation: Predicting drug–target interactions (DTIs) using human phenotypic data has the potential to eliminate the translational gap between animal experiments and clinical outcomes in humans. One challenge in human phenome-driven DTI prediction is integrating and modeling diverse drug and disease phenotypic relationships. Leveraging large amounts of clinically observed phenotypes of drugs and diseases and electronic health records (EHRs) of 72 million patients, we developed a novel integrated computational drug discovery approach that combines DTI prediction with clinical corroboration. Results: We developed a network-based DTI prediction system (TargetPredict) by modeling 855 904 phenotypic and genetic relationships among 1430 drugs, 4251 side effects, 1059 diseases and 17 860 genes. We systematically evaluated TargetPredict in de novo cross-validation and compared it to a state-of-the-art phenome-driven DTI prediction approach. We applied TargetPredict to identify novel repositioned candidate drugs for Alzheimer’s disease (AD), a disease affecting over 5.8 million people in the United States. We evaluated the clinical efficiency of top repositioned drug candidates using EHRs of over 72 million patients. The area under the receiver operating characteristic (ROC) curve was 0.97 in the de novo cross-validation when evaluated using 910 drugs. TargetPredict outperformed a state-of-the-art phenome-driven DTI prediction system as measured by precision–recall curves [mean average precision (MAP): 0.28 versus 0.23, P-value < 0.0001]. The EHR-based case–control studies found that prescriptions of top-ranked repositioned drugs are significantly associated with lower odds of AD diagnosis. For example, we showed that the prescription of liraglutide, a type 2 diabetes drug, is significantly associated with decreased risk of AD diagnosis [adjusted odds ratio (AOR): 0.76; 95% confidence interval (CI) (0.70, 0.82), P-value < 0.0001].
In summary, our integrated approach, which combines computational DTI prediction with large-scale EHR-based clinical corroboration, has high potential for rapidly identifying novel drug targets and drug candidates for complex diseases. Availability and implementation: nlp.case.edu/public/data/TargetPredict.
Affiliation(s)
- Mengshi Zhou
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA; Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Chunlei Zheng
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Rong Xu
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
102
Wen Y, Song X, Yan B, Yang X, Wu L, Leng D, He S, Bo X. Multi-dimensional data integration algorithm based on random walk with restart. BMC Bioinformatics 2021; 22:97. [PMID: 33639858 PMCID: PMC7912853 DOI: 10.1186/s12859-021-04029-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 12/14/2020] [Accepted: 02/15/2021] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND: The accumulation of various multi-omics data and computational approaches for data integration can accelerate the development of precision medicine. However, developing algorithms for multi-omics data integration remains a pressing challenge. RESULTS: Here, we propose a multi-omics data integration algorithm based on random walk with restart (RWR) on a multiplex network. We call the resulting methodology Random Walk with Restart for multi-dimensional data Fusion (RWRF). RWRF uses sample similarity networks as the basis for integration. It constructs a similarity network for each data type and then connects the corresponding samples across the similarity networks to create a multiplex sample network. By applying RWR on the multiplex network, RWRF uses the stationary probability distribution to fuse the similarity networks. We applied RWRF to The Cancer Genome Atlas (TCGA) data to identify subtypes in different cancer data sets. Three types of data (mRNA expression, DNA methylation, and microRNA expression data) are integrated and network clustering is conducted. Experimental results show that RWRF performs better than single data type analysis and previous integrative methods. CONCLUSIONS: RWRF provides powerful support for deciphering cancer molecular subtypes and thus may benefit precision treatment of specific patients in clinical practice.
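The stationary distribution that RWR converges to can be illustrated on a single small network. The following is a minimal sketch only: it runs the standard RWR iteration p ← (1 − r)·W·p + r·p0 on one toy graph, not on the multiplex sample network that RWRF builds, and the function names, the restart probability of 0.5 and the toy adjacency matrix are assumptions made for illustration.

```python
def column_normalize(adj):
    """Scale each column of a square adjacency matrix so it sums to 1."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W * p + r * p0 until the L1 change drops below tol."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]  # walker starts at the seed
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3 (hypothetical data, not from the paper).
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
scores = random_walk_with_restart(adj, seed=0)
```

The resulting scores decrease with distance from the seed node, which is the proximity signal that RWR-based methods use as a similarity measure.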
Affiliation(s)
- Yuqi Wen
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, People's Republic of China
- Xinyu Song
  - Department of Biomedical Engineering, Chinese PLA General Hospital, Beijing, 100853, People's Republic of China
- Bowei Yan
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, People's Republic of China
- Xiaoxi Yang
  - Experimental Center, Beijing Friendship Hospital, Capital Medical University, Beijing, 100069, People's Republic of China
- Lianlian Wu
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, People's Republic of China; Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, People's Republic of China
- Dongjin Leng
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, People's Republic of China
- Song He
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, People's Republic of China
- Xiaochen Bo
  - Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, 100850, People's Republic of China
103
Drug-Target Interaction Prediction Based on Adversarial Bayesian Personalized Ranking. BioMed Research International 2021; 2021:6690154. [PMID: 33628808 PMCID: PMC7889346 DOI: 10.1155/2021/6690154] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/23/2020] [Revised: 01/17/2021] [Accepted: 01/23/2021] [Indexed: 12/13/2022]
Abstract
The prediction of drug-target interactions (DTIs) is a key step in drug repositioning. In recent years, many studies have tried to use matrix factorization to predict DTIs, but they use only known DTIs and ignore the features of drug and target expression profiles, resulting in limited prediction performance. In this study, we propose a new DTI prediction model named AdvB-DTI. Within this model, the features of drug and target expression profiles are associated with Adversarial Bayesian Personalized Ranking through matrix factorization. First, according to the known drug-target relationships, a set of ternary partial order relationships is generated. Next, these partial order relationships are used to train the latent factor matrices of drugs and targets using the Adversarial Bayesian Personalized Ranking method, and the matrix factorization is improved by the features of drug and target expression profiles. Finally, the scores of drug-target pairs are obtained as the inner product of latent factors, and the DTI prediction is performed based on the score ranking. The proposed model takes advantage of the idea of learning to rank to overcome the problem of data sparsity, and perturbation factors are introduced to make the model more robust. Experimental results show that our model could achieve a better DTI prediction performance.
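The ranking idea described in this abstract can be sketched with plain Bayesian Personalized Ranking (BPR) on a toy interaction set. This is an illustrative sketch under stated assumptions, not the AdvB-DTI model: it omits the adversarial perturbation and the expression-profile features, and the function names, hyperparameters and toy data are hypothetical. Each training triple (drug, known target, unknown target) encodes the partial order "the known target should score higher", and scores are inner products of latent factors.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_bpr(known, n_drugs, n_targets, dim=8, lr=0.1, reg=0.01, epochs=300, seed=0):
    """Fit drug and target latent factors by SGD on the BPR pairwise objective."""
    rng = random.Random(seed)
    D = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_drugs)]
    T = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_targets)]
    # Ternary partial orders: drug d prefers known target i over unknown target j.
    triples = [(d, i, j) for d, i in sorted(known)
               for j in range(n_targets) if (d, j) not in known]
    for _ in range(epochs):
        rng.shuffle(triples)
        for d, i, j in triples:
            x = dot(D[d], T[i]) - dot(D[d], T[j])
            g = 1.0 - sigmoid(x)  # derivative of ln(sigmoid(x)) with respect to x
            for k in range(dim):
                dk, ik, jk = D[d][k], T[i][k], T[j][k]
                D[d][k] += lr * (g * (ik - jk) - reg * dk)
                T[i][k] += lr * (g * dk - reg * ik)
                T[j][k] += lr * (-g * dk - reg * jk)
    return D, T


# Toy data (hypothetical): 3 drugs, 4 targets, 5 known interactions.
known = {(0, 0), (0, 1), (1, 1), (1, 2), (2, 3)}
D, T = train_bpr(known, n_drugs=3, n_targets=4)
# Rank all targets for drug 0 by the inner-product score.
ranking = sorted(range(4), key=lambda t: -dot(D[0], T[t]))
```

Because the objective only compares pairs of targets for the same drug, the unknown pairs are treated as "less preferred" rather than as confirmed negatives, which is how learning to rank sidesteps the sparsity of known DTIs.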
104
Peng X, Chen L, Zhou JP. Identification of Carcinogenic Chemicals with Network Embedding and Deep Learning Methods. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200414084317] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/21/2023]
Abstract
Background:
Cancer is the second leading cause of human death in the world. To date, many factors have been confirmed as causes of cancer. Among them, carcinogenic chemicals are widely accepted as important ones. Traditional methods for detecting carcinogenic chemicals are inefficient and costly.
Objective:
The aim of this study was to design an efficient computational method for the identification of carcinogenic chemicals.
Methods:
A new computational model was proposed for detecting carcinogenic chemicals. As the model is data-driven, carcinogenic and non-carcinogenic chemicals were obtained from the Carcinogenic Potency Database (CPDB). These chemicals were represented by features extracted from five chemical networks, representing five types of chemical associations, via a network embedding method, Mashup. The obtained features were fed into a deep learning method, a recurrent neural network, to build the model.
Results:
The jackknife test on this model yielded an F-measure of 0.971 and an AUROC of 0.971.
Conclusion:
The proposed model was quite effective and was superior to models based on traditional machine learning algorithms, classic chemical encoding schemes or direct usage of chemical associations.
Affiliation(s)
- Xuefei Peng
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Jian-Peng Zhou
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
105
Peng J, Wang Y, Guan J, Li J, Han R, Hao J, Wei Z, Shang X. An end-to-end heterogeneous graph representation learning-based framework for drug-target interaction prediction. Brief Bioinform 2021; 22:6124914. [PMID: 33517357 DOI: 10.1093/bib/bbaa430] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Received: 10/21/2020] [Revised: 12/01/2020] [Accepted: 12/23/2020] [Indexed: 12/28/2022] Open
Abstract
Accurately identifying potential drug-target interactions (DTIs) is a key step in drug discovery. Although many experimental studies have been carried out to identify DTIs in the past few decades, experiment-based DTI identification is still time-consuming and expensive. Therefore, it is of great significance to develop effective computational methods for identifying DTIs. In this paper, we develop a novel end-to-end learning framework based on heterogeneous graph convolutional networks for DTI prediction, called end-to-end graph (EEG)-DTI. Given a heterogeneous network containing multiple types of biological entities (i.e. drug, protein, disease, side-effect), EEG-DTI learns low-dimensional feature representations of drugs and targets using a graph convolutional network-based model and predicts DTIs based on the learned features. During the training process, EEG-DTI learns the feature representation of nodes in an end-to-end mode. The evaluation shows that EEG-DTI performs better than existing state-of-the-art methods. The data and source code are available at: https://github.com/MedicineBiology-AI/EEG-DTI.
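For readers unfamiliar with graph convolution, one propagation step can be sketched as follows. This is a generic, homogeneous-graph layer in the widely used form H' = ReLU(Â·H·W), with Â the symmetrically normalized adjacency matrix with self-loops; it is an assumption-laden illustration rather than the heterogeneous architecture of EEG-DTI, and the function names, toy graph, features and weights are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph-convolution step: aggregate neighbor features, project, apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Toy 3-node path graph with 2-dimensional node features (hypothetical values).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
weights = [[1.0, -1.0],
           [0.5, 0.5]]
hidden = gcn_layer(adj, features, weights)
```

Stacking such layers lets each node's representation absorb information from progressively larger neighborhoods; in an end-to-end setting the weight matrices would be learned jointly with the link-prediction objective instead of being fixed as here.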
Affiliation(s)
- Jiajie Peng
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
- Yuxian Wang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710072, China
- Jiaojiao Guan
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710072, China
- Jingyi Li
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710072, China
- Ruijiang Han
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
- Jianye Hao
  - College of Intelligence and Computing, Tianjin University, Tianjin 300072, China
- Zhongyu Wei
  - School of Data Science, Fudan University, Shanghai 200433, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an 710072, China
106
Wang C, Kurgan L. Survey of Similarity-Based Prediction of Drug-Protein Interactions. Curr Med Chem 2021; 27:5856-5886. [PMID: 31393241 DOI: 10.2174/0929867326666190808154841] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 11/07/2017] [Revised: 04/16/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022]
Abstract
The therapeutic activity of a significant majority of drugs is determined by their interactions with proteins. Databases of drug-protein interactions (DPIs) primarily focus on the therapeutic protein targets, while the knowledge of the off-targets is fragmented and partial. One way to bridge this knowledge gap is to employ computational methods to predict protein targets for a given drug molecule, or interacting drugs for given protein targets. We survey a comprehensive set of 35 methods that were published in high-impact venues and that predict DPIs based on similarity between drugs and similarity between protein targets. We analyze the internal databases of known DPIs that these methods utilize to compute similarities, and investigate how they are linked to the 12 publicly available source databases. We discuss the contents, impact and relationships between these internal and source databases, as well as the timeline of their releases and publications. The 35 predictors exploit and often combine three types of similarities that consider drug structures, drug profiles, and target sequences. We review the predictive architectures of these methods and their impact, and we explain how their internal DPI databases are linked to the source databases. We also include a detailed timeline of the development of these predictors and discuss the underlying limitations of the current resources and predictive tools. Finally, we provide several recommendations concerning the future development of the related databases and methods.
Affiliation(s)
- Chen Wang
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
107
Wei T, Fa B, Luo C, Johnston L, Zhang Y, Yu Z. An Efficient and Easy-to-Use Network-Based Integrative Method of Multi-Omics Data for Cancer Genes Discovery. Front Genet 2021; 11:613033. [PMID: 33488678 PMCID: PMC7820902 DOI: 10.3389/fgene.2020.613033] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/01/2020] [Accepted: 11/25/2020] [Indexed: 12/25/2022] Open
Abstract
Identifying personalized driver genes is essential for discovering critical biomarkers and developing effective personalized therapies for cancers. However, few methods consider weights for different types of mutations and efficiently distinguish driver genes from the much larger number of passenger genes. We propose MinNetRank (Minimum used for Network-based Ranking), a new method for prioritizing cancer genes that sets weights for different types of mutations, considers the incoming and outgoing degrees of the interaction network simultaneously, and uses a minimum strategy to integrate multi-omics data. MinNetRank prioritizes cancer genes among multi-omics data for each sample. The sample-specific rankings of genes are then integrated into a population-level ranking. When evaluating the accuracy and robustness of prioritizing driver genes, our method almost always significantly outperforms other methods in terms of precision, F1 score, and partial area under the curve (AUC) on six cancer datasets. Importantly, MinNetRank is efficient in discovering novel driver genes. SP1 is selected as a candidate driver gene only by our method (ranked in the top three), and SP1 RNA and protein differential expression between tumor and normal samples is statistically significant in liver hepatocellular carcinoma. The top seven genes stratify patients into two subtypes exhibiting statistically significant survival differences in five cancer types. These top seven genes are associated with overall survival, as illustrated by previous researchers. MinNetRank can be very useful for identifying cancer driver genes, and these biologically relevant marker genes are associated with clinical outcome. The R package of MinNetRank is available at https://github.com/weitinging/MinNetRank.
Affiliation(s)
- Ting Wei
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Botao Fa
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Chengwen Luo
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Luke Johnston
  - SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Yue Zhang
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Zhangsheng Yu
  - Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
108
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. Biotechnol Bioproc E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations is provided, together with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Eunyoung Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Ingoo Lee
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Bongsung Bae
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Minsu Park
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Hojung Nam
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
109
Peng J, Lu G, Shang X. A Survey of Network Representation Learning Methods for Link Prediction in Biological Network. Curr Pharm Des 2021; 26:3076-3084. [PMID: 31951161 DOI: 10.2174/1381612826666200116145057] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/12/2019] [Accepted: 01/09/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Networks are powerful resources for describing complex systems. Link prediction is an important issue in network analysis and has important practical application value. Network representation learning has proven to be useful for network analysis, especially for link prediction tasks. OBJECTIVE: To review the application of network representation learning to link prediction in biological networks, we summarize recent methods for link prediction in biological networks and discuss the application and significance of network representation learning in link prediction tasks. METHOD & RESULTS: We first introduce the widely used link prediction algorithms, then briefly introduce the development of network representation learning methods, focusing on a few widely used methods and their application in biological network link prediction. Existing studies demonstrate that using network representation learning to predict links in biological networks can achieve better performance. Finally, some possible future directions are discussed.
Affiliation(s)
- Jiajie Peng
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Guilin Lu
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, China
110
Lakizadeh A, Hassan Mir-Ashrafi SM. Drug repurposing improvement using a novel data integration framework based on the drug side effect. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2021.100523] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/24/2022] Open
111
Ding Y, Tang J, Guo F. The Computational Models of Drug-target Interaction Prediction. Protein Pept Lett 2020; 27:348-358. [PMID: 30968771 DOI: 10.2174/0929866526666190410124110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/04/2019] [Revised: 02/22/2019] [Accepted: 04/02/2019] [Indexed: 12/19/2022]
Abstract
The identification of Drug-Target Interactions (DTIs) is an important process in drug discovery and medical research. However, the traditional experimental methods for DTI identification are still time-consuming, extremely expensive and challenging. In the past ten years, various computational methods have been developed to identify potential DTIs. In this paper, the identification methods of DTIs are summarized. Moreover, several state-of-the-art computational methods are introduced, covering network-based and machine learning-based methods. In particular, machine learning-based methods, including supervised and semi-supervised models, differ essentially in how they handle negative samples. Although these computational models have achieved significant improvements in the identification of DTIs, network-based and machine learning-based methods each have their own disadvantages. These computational methods are evaluated on four benchmark data sets via values of the Area Under the Precision-Recall curve (AUPR).
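As an illustration of the evaluation metric named above, the sketch below computes average precision, a common step-wise estimate of the area under the precision-recall curve. The abstract does not state how the surveyed methods computed AUPR, so this particular estimator, the function name and the toy labels and scores are assumptions.

```python
def average_precision(labels, scores):
    """Mean of precision@k taken at the rank k of every true interaction."""
    # Rank candidate pairs from the highest to the lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / hits if hits else 0.0


# Toy example (hypothetical): five drug-target pairs, two of them true interactions.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
ap = average_precision(labels, scores)  # (1/1 + 2/3) / 2
```

Because true interactions are rare among all candidate drug-target pairs, precision-recall based metrics are often more informative than ROC-based ones on such imbalanced benchmarks.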
Affiliation(s)
- Yijie Ding
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jijun Tang
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, United States; School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Fei Guo
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
112
Yu D, Liu G, Zhao N, Liu X, Guo M. FPSC-DTI: drug-target interaction prediction based on feature projection fuzzy classification and super cluster fusion. Mol Omics 2020; 16:583-591. [PMID: 33084702 DOI: 10.1039/d0mo00062k] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
Abstract
Identifying drug-target interactions (DTIs) is an important part of drug discovery and development. However, identifying DTIs is a complex process that is time-consuming, costly, and often inefficient, with a low success rate, especially with wet-lab experimental methods. Computational methods based on drug repositioning and network pharmacology can effectively overcome these defects. In this paper, we develop a new fusion method, called FPSC-DTI, that fuses feature projection fuzzy classification (FP) and super cluster classification (SC) to predict DTIs. In the experiments, FPSC-DTI achieved a mean percentile ranking (MPR) of 0.043, 0.084, 0.072, and 0.146 on the enzyme, ion channel (IC), G-protein-coupled receptor (GPCR), and nuclear receptor (NR) datasets, respectively, and the AUC values exceeded 0.969 on all four datasets. Compared with other methods, FPSC-DTI obtained better predictive performance and was more robust.
Affiliation(s)
- Donghua Yu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
113
Wang M, Zhu P. MRWMDA: A novel framework to infer miRNA-disease associations. Biosystems 2020; 199:104292. [PMID: 33221377 DOI: 10.1016/j.biosystems.2020.104292] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/22/2020] [Revised: 10/31/2020] [Accepted: 11/15/2020] [Indexed: 01/03/2023]
Abstract
MicroRNAs (miRNAs) are widely involved in a series of significant biological processes, which have been revealed and verified by accumulating experimental studies. The computational inference of the correlation between miRNAs and diseases is essential to facilitate the detection of disease biomarkers for disease diagnosis, prevention, treatment and prognosis. In this paper, a model with Multiple use of Random Walk with restart algorithm was introduced for the prediction of the MiRNA-Disease Association (MRWMDA). Based on diverse similarity measures, the model first implemented the random walk with restart (RWR) algorithm on the integrated similarity network to construct the topological similarity of miRNAs and diseases, which took full advantage of the network topology information. Then, the RWR algorithm was applied in the miRNA topological similarity network, and a steady probability of each miRNA-disease pair was obtained to prioritize miRNA candidates. In particular, the initial probability of the RWR algorithm was determined by utilizing the combination of the recommendation algorithm and the maximum similarity method. The proposed model achieved significant improvement in prediction compared with previous models, with an AUC of 0.9353 and an AUPR of 0.4809. In addition, case studies of breast neoplasms and lung neoplasms representing different disease types further demonstrated the excellent ability of MRWMDA in detecting potential disease-associated miRNAs. These performance analyses indicated that MRWMDA could be an effective and powerful biological computational tool in relevant biomedical studies.
Affiliation(s)
- Meixi Wang
  - School of Science, Jiangnan University, Wuxi 214122, China
- Ping Zhu
  - School of Science, Jiangnan University, Wuxi 214122, China
114
Li Y, Liu X, You Z, Li L, Guo J, Wang Z. A computational approach for predicting drug–target interactions from protein sequence and drug substructure fingerprint information. Int J Intell Syst 2020. [DOI: 10.1002/int.22332] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/15/2022]
Affiliation(s)
- Yang Li
  - School of Computer Science & Cyberspace Security, Hainan University, Haikou, China
- Xiao‐zhang Liu
  - School of Computer Science & Cyberspace Security, Hainan University, Haikou, China
- Zhu‐Hong You
  - School of Information Engineering, Xijing University, Xi'an, China
- Li‐Ping Li
  - School of Information Engineering, Xijing University, Xi'an, China
- Jian‐Xin Guo
  - School of Information Engineering, Xijing University, Xi'an, China
- Zheng Wang
  - School of Information Engineering, Xijing University, Xi'an, China
115
Peng L, Shen L, Liao L, Liu G, Zhou L. RNMFMDA: A Microbe-Disease Association Identification Method Based on Reliable Negative Sample Selection and Logistic Matrix Factorization With Neighborhood Regularization. Front Microbiol 2020; 11:592430. [PMID: 33193260 PMCID: PMC7652725 DOI: 10.3389/fmicb.2020.592430] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 08/07/2020] [Accepted: 09/17/2020] [Indexed: 12/22/2022] Open
Abstract
Microbes with abnormal levels have important impacts on the formation and development of various complex diseases. Identifying possible Microbe-Disease Associations (MDAs) helps to understand the mechanisms of complex diseases. However, experimental methods for MDA identification are costly and time-consuming. In this study, a new computational model, RNMFMDA, was developed to find possible MDAs. RNMFMDA contains two main processes. First, Reliable Negative MDA samples were selected based on Positive-Unlabeled (PU) learning and random walk with restart on the heterogeneous microbe-disease network. Second, Logistic Matrix Factorization with Neighborhood Regularization (LMFNR) was developed to compute the association probabilities for all microbe-disease pairs. To evaluate the performance of the proposed RNMFMDA method, we compared RNMFMDA with five state-of-the-art MDA prediction methods based on five-fold cross-validations on microbes, diseases, and MDAs. RNMFMDA obtained the best AUCs of 0.6332, 0.8669, and 0.9081, respectively, for the three five-fold cross-validations, significantly outperforming other models. The promising prediction performance may be attributed to the following three features: high-quality negative MDA sample selection, the LMFNR-based MDA prediction model, and the integration of various biological information. In addition, a few predicted microbe-disease pairs with high association scores are worthy of further experimental validation.
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Ling Shen
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Longjie Liao
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Guangyi Liu
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
|
116
|
Hasan Mahmud SM, Chen W, Jahan H, Dai B, Din SU, Dzisoo AM. DeepACTION: A deep learning-based method for predicting novel drug-target interactions. Anal Biochem 2020; 610:113978. [PMID: 33035462 DOI: 10.1016/j.ab.2020.113978] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/04/2020] [Revised: 09/23/2020] [Accepted: 09/25/2020] [Indexed: 12/13/2022]
Abstract
Drug-target interactions (DTIs) play a key role in drug development and discovery processes. Wet-lab prediction of DTIs is time-consuming, expensive, and tedious. Fortunately, computational approaches can identify new interactions (drug-target pairs) and accelerate the process of drug repurposing. However, a vast number of interactions remain undiscovered; therefore, we proposed a deep learning-based method (deepACTION) for predicting potential or unknown DTIs. Here, each drug chemical structure and protein sequence is transformed according to structural and sequence information using different descriptors to represent their features correctly. The prediction process poses some challenges, such as the high dimensionality and class imbalance of the data. To address these problems, we developed the MMIB technique to balance the majority and minority instances in the dataset and utilized a LASSO model to handle the high dimensionality of the data. In addition, we trained the convolutional neural network algorithm with balanced and reduced features for accurate prediction of DTIs. In this study, the AUC is considered the primary evaluation metric for comparing the performance of the deepACTION model with that of existing methods in a 5-fold cross-validation test. Our experimental dataset was obtained from the DrugBank database, and our deepACTION model achieved an AUC of 0.9836 on this dataset. The experimental results suggest that the model can predict significant numbers of new DTIs and provide information to motivate scientists to develop drugs.
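Since AUC is the primary metric here and in several neighbouring entries, a small sketch of how it can be computed may help. The function below uses the rank-statistic definition (the probability that a randomly chosen positive pair scores above a randomly chosen negative pair, counting ties as one half). The name `roc_auc` and the toy data are assumptions for illustration, not part of deepACTION.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 ground-truth interactions.
    scores: iterable of predicted scores, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ordered correctly.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))  # prints 0.75
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations; a sort-based computation would be preferable for large datasets.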
Affiliation(s)
- S M Hasan Mahmud
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Wenyu Chen
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Hosney Jahan
- College of Computer Science, Sichuan University, Chengdu, 610065, China
- Bo Dai
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Salah Ud Din
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Anthony Mackitz Dzisoo
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 611731, China
|
117
|
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Brief Bioinform 2020; 22:5910189. [PMID: 32964234 DOI: 10.1093/bib/bbaa205] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 05/23/2020] [Revised: 08/06/2020] [Accepted: 08/10/2020] [Indexed: 12/20/2022] Open
Abstract
Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. The machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning in order to improve the predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples in comparison to the data set, which has widely been used by most of previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiaoqi Shan
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Tianhang Chen
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Mingming Jiang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
|
118
|
Peng L, Tian X, Shen L, Kuang M, Li T, Tian G, Yang J, Zhou L. Identifying Effective Antiviral Drugs Against SARS-CoV-2 by Drug Repositioning Through Virus-Drug Association Prediction. Front Genet 2020; 11:577387. [PMID: 33193695 PMCID: PMC7525008 DOI: 10.3389/fgene.2020.577387] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/29/2020] [Accepted: 08/18/2020] [Indexed: 12/12/2022] Open
Abstract
A new coronavirus called SARS-CoV-2 is rapidly spreading around the world. Over 16,558,289 infected cases with 656,093 deaths had been reported by July 29th, 2020, and it is urgent to identify effective antiviral treatment. In this study, potential antiviral drugs against SARS-CoV-2 were identified by drug repositioning through Virus-Drug Association (VDA) prediction. 96 VDAs between 11 types of viruses similar to SARS-CoV-2 and 78 small molecular drugs were extracted, and a novel VDA identification model (VDA-RLSBN) was developed to find potential VDAs related to SARS-CoV-2. The model integrated the complete genome sequences of the viruses, the chemical structures of drugs, a regularized least squares (RLS) classifier, a bipartite local model, and the neighbor association information. Compared with five state-of-the-art association prediction methods, VDA-RLSBN obtained the best AUC of 0.9085 and AUPR of 0.6630. Ribavirin was predicted to be the best small molecular drug, with a higher molecular binding energy of -6.39 kcal/mol with human angiotensin-converting enzyme 2 (ACE2), followed by remdesivir (-7.4 kcal/mol), mycophenolic acid (-5.35 kcal/mol), and chloroquine (-6.29 kcal/mol). Ribavirin, remdesivir, and chloroquine have been under clinical trials or supported by recent works. In addition, for the first time, our results suggested that several antiviral drugs, such as FK506, with molecular binding energies of -11.06 and -10.1 kcal/mol with ACE2 and the spike protein, respectively, could potentially be used to prevent SARS-CoV-2, although this remains to be validated further. Drug repositioning through virus-drug association prediction can effectively find potential antiviral drugs against SARS-CoV-2.
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Xiongfei Tian
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Ling Shen
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Ming Kuang
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Tianbao Li
- Geneis (Beijing) Co., Ltd., Beijing, China
- Geng Tian
- Geneis (Beijing) Co., Ltd., Beijing, China
- Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
|
119
|
Identification of Drug–Target Interactions via Dual Laplacian Regularized Least Squares with Multiple Kernel Fusion. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106254] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Indexed: 12/29/2022]
|
120
|
Zhang X, Chen L. Prediction of membrane protein types by fusing protein-protein interaction and protein sequence information. Biochim Biophys Acta Proteins Proteom 2020; 1868:140524. [PMID: 32858174 DOI: 10.1016/j.bbapap.2020.140524] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/26/2020] [Revised: 07/17/2020] [Accepted: 07/30/2020] [Indexed: 11/30/2022]
Abstract
Membrane proteins are gatekeepers to the cell and essential for determination of the function of cells. Identification of the types of membrane proteins is an essential problem in cell biology. It is time-consuming and expensive to identify the type of membrane proteins with traditional experimental methods. The alternative way is to design effective computational methods, which can provide quick and reliable predictions. To date, several computational methods have been proposed in this regard. Several of them used the features extracted from the sequence information of individual proteins. Recently, networks are more and more popular to tackle different protein-related problems, which can organize proteins in a system level and give an overview of all proteins. However, such form weakens the essential properties of proteins, such as their sequence information. In this study, a novel feature fusion scheme was proposed, which integrated the information of protein sequences and protein-protein interaction network. The fused features of a protein were defined as the linear combination of sequence features of all proteins in the network, where the combination coefficients were the probabilities yielded by the random walk with restart algorithm with the protein as the seed node. Several models with such fused features and different classification algorithms were built and evaluated. Their performance for predicting the type of membrane proteins was improved compared with the models only with the sequence features or network information.
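The fusion rule described above can be sketched in a few lines: run a random walk with restart (RWR) seeded at one protein, then take the probability-weighted sum of all proteins' sequence feature vectors. This is a minimal illustration under assumptions not stated in the abstract: a dense, unweighted adjacency matrix, a column-normalised transition matrix, a restart probability of 0.3, and hypothetical function names.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-12, max_iter=10000):
    """Steady-state visiting probabilities of an RWR started at `seed`.

    adj: square adjacency matrix (list of lists) of the interaction network.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    start = [1.0 if i == seed else 0.0 for i in range(n)]
    probs = start[:]
    for _ in range(max_iter):
        updated = []
        for i in range(n):
            # One walking step on the column-normalised adjacency matrix.
            walk = sum(adj[i][j] / col_sums[j] * probs[j]
                       for j in range(n) if col_sums[j] > 0)
            updated.append((1.0 - restart) * walk + restart * start[i])
        change = sum(abs(a - b) for a, b in zip(updated, probs))
        probs = updated
        if change < tol:
            break
    return probs


def fuse_features(adj, seq_features, seed, restart=0.3):
    """Fused feature of `seed`: RWR-weighted sum of all sequence features."""
    probs = random_walk_with_restart(adj, seed, restart)
    dim = len(seq_features[0])
    return [sum(probs[k] * seq_features[k][d] for k in range(len(probs)))
            for d in range(dim)]
```

On a three-protein path network, for example, `fuse_features(adj, features, seed=0)` weights protein 0's own features most heavily and its neighbours' features progressively less, so sequence information is kept while network context is blended in.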
Affiliation(s)
- Xiaolin Zhang
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People's Republic of China
|
121
|
Abstract
Network theory provides one of the most potent analysis tools for the study of complex systems. In this paper, we illustrate the network-based perspective in drug research and how it is coherent with the new paradigm of drug discovery. We first present data sources from which networks are built, then show some examples of how the networks can be used to investigate drug-related systems. A section is devoted to network-based inference applications, i.e., prediction methods based on interactomes, that can be used to identify putative drug-target interactions without resorting to 3D modeling. Finally, we present some aspects of Boolean networks dynamics, anticipating that it might become a very potent modeling framework to develop in silico screening protocols able to simulate phenotypic screening experiments. We conclude that network applications integrated with machine learning and 3D modeling methods will become an indispensable tool for computational drug discovery in the next years.
Affiliation(s)
- Maurizio Recanatini
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, Via Belmeloro 6, I-40126 Bologna, Italy
- Chiara Cabrelle
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum - University of Bologna, Via Belmeloro 6, I-40126 Bologna, Italy
|
122
|
Dai W, Li L, Guo D. Integrating bioassay data for improved prediction of drug-target interaction. Biophys Chem 2020; 266:106455. [PMID: 32835911 DOI: 10.1016/j.bpc.2020.106455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2020] [Revised: 08/06/2020] [Accepted: 08/06/2020] [Indexed: 11/26/2022]
Abstract
Identifying drug targets is one of the major tasks in drug discovery. As experimental identification of targets is rather challenging, the development of computational methods is necessary for efficient identification of drug-target interactions. Traditional computational methods, such as docking, are based solely on the chemical structure, which is not available for most of the targets. On the other hand, bioassay data might contain information helpful for the prediction of drug-target interactions. In this study, a feature enrichment method integrating bioassay and chemical structure data was developed to predict drug-target interactions. Using a large-scale benchmark on the datasets, we demonstrated that the model adopting the integrated fingerprint outperformed the one using the chemical fingerprint. The influence of false positive hits in bioassays and of algorithm-related factors on the model performance was also investigated. The results suggested that prediction using the integrated fingerprint was robust to false positive hits, the choice of classifiers, and different random splits of the datasets.
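The abstract does not say how the integrated fingerprint is built. One simple reading, assumed here purely for illustration, is to append a compound's binary bioassay outcomes to its chemical-structure bits and feed the combined vector to a classifier or similarity measure. The function names and the Tanimoto comparison are hypothetical and are not taken from the paper.

```python
def integrate_fingerprints(chem_bits, assay_bits):
    """Concatenate structure bits and bioassay-outcome bits (assumed scheme)."""
    return list(chem_bits) + list(assay_bits)


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length bit vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


# Toy compounds: four structure bits plus two bioassay outcomes each.
drug_a = integrate_fingerprints([1, 0, 1, 0], [1, 1])
drug_b = integrate_fingerprints([1, 0, 0, 0], [1, 0])
similarity = tanimoto(drug_a, drug_b)
```

Under this scheme, two compounds with dissimilar structures can still look alike if they are active in the same assays, which is one plausible way bioassay data could enrich the features.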
Affiliation(s)
- Weixing Dai
- School of Life Science and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong
- Li Li
- Department of Pharmacy, The Eighth Affiliated Hospital, Sun Yat-sen University, Shennan Road 3025, Shenzhen 518000, China
- Dianjing Guo
- School of Life Science and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong
|
123
|
Zhao Y, Wang CC, Chen X. Microbes and complex diseases: from experimental results to computational models. Brief Bioinform 2020; 22:5882184. [PMID: 32766753 DOI: 10.1093/bib/bbaa158] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 04/09/2020] [Revised: 06/19/2020] [Accepted: 06/22/2020] [Indexed: 12/13/2022] Open
Abstract
Studies have shown that the number of microbes in humans is almost 10 times that of cells. These microbes have been proven to play an important role in a variety of physiological processes, such as enhancing immunity, improving the digestion of gastrointestinal tract and strengthening metabolic function. In addition, in recent years, more and more research results have indicated that there are close relationships between the emergence of the human noncommunicable diseases and microbes, which provides a novel insight for us to further understand the pathogenesis of the diseases. An in-depth study about the relationships between diseases and microbes will not only contribute to exploring new strategies for the diagnosis and treatment of diseases but also significantly heighten the efficiency of new drugs development. However, applying the methods of biological experimentation to reveal the microbe-disease associations is costly and inefficient. In recent years, more and more researchers have constructed multiple computational models to predict microbes that are potentially associated with diseases. Here, we start with a brief introduction of microbes and databases as well as web servers related to them. Then, we mainly introduce four kinds of computational models, including score function-based models, network algorithm-based models, machine learning-based models and experimental analysis-based models. Finally, we summarize the advantages as well as disadvantages of them and set the direction for the future work of revealing microbe-disease associations based on computational models. We firmly believe that computational models are expected to be important tools in large-scale predictions of disease-related microbes.
Affiliation(s)
- Yan Zhao
- School of Information and Control Engineering, China University of Mining
- Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining
- Xing Chen
- School of Information and Control Engineering, China University of Mining
|
124
|
Drug-target interactions prediction using marginalized denoising model on heterogeneous networks. BMC Bioinformatics 2020; 21:330. [PMID: 32703151 PMCID: PMC7653902 DOI: 10.1186/s12859-020-03662-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/18/2020] [Accepted: 07/14/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Drugs achieve pharmacological functions by acting on target proteins. Identifying interactions between drugs and target proteins is an essential task in old drug repositioning and new drug discovery. To recommend new drug candidates and reposition existing drugs, computational approaches are commonly adopted. Compared with wet-lab experiments, computational approaches have a lower cost for drug discovery and provide effective guidance for subsequent experimental verification. How to integrate different types of biological data and handle the sparsity of drug-target interaction data are still great challenges. RESULTS In this paper, we propose a novel drug-target interactions (DTIs) prediction method incorporating a marginalized denoising model on heterogeneous networks with an association index kernel matrix and latent global association. The experimental results on benchmark datasets and newly compiled datasets indicate that, compared to other existing methods, our method achieves higher AUC (area under the receiver operating characteristic curve) and AUPR (area under the precision-recall curve) scores. CONCLUSIONS The performance improvement in our method depends on the association index kernel matrix and the latent global association. The association index kernel matrix calculates the sharing relationship between drugs and targets. The latent global associations address the false positive issue caused by network link sparsity. Our method can provide a useful approach to recommend new drug candidates and reposition existing drugs.
|
125
|
Chen H, Cheng F, Li J. iDrug: Integration of drug repositioning and drug-target prediction via cross-network embedding. PLoS Comput Biol 2020; 16:e1008040. [PMID: 32667925 PMCID: PMC7384678 DOI: 10.1371/journal.pcbi.1008040] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 11/22/2019] [Revised: 07/27/2020] [Accepted: 06/10/2020] [Indexed: 12/14/2022] Open
Abstract
Computational drug repositioning and drug-target prediction have become essential tasks in the early stage of drug discovery. In previous studies, these two tasks have often been considered separately. However, the entities studied in these two tasks (i.e., drugs, targets, and diseases) are inherently related. On one hand, drugs interact with targets in cells to modulate target activities, which in turn alter biological pathways to promote healthy functions and to treat diseases. On the other hand, both drug repositioning and drug-target prediction involve the same drug feature space, which naturally connects these two problems and the two domains (diseases and targets). By using the wisdom of the crowds, it is possible to transfer knowledge from one of the domains to the other. The existence of relationships among drug-target-disease motivates us to jointly consider drug repositioning and drug-target prediction in drug discovery. In this paper, we present a novel approach called iDrug, which seamlessly integrates drug repositioning and drug-target prediction into one coherent model via cross-network embedding. In particular, we provide a principled way to transfer knowledge from these two domains and to enhance prediction performance for both tasks. Using real-world datasets, we demonstrate that iDrug achieves superior performance on both learning tasks compared to several state-of-the-art approaches. Our code and datasets are available at: https://github.com/Case-esaC/iDrug.
Affiliation(s)
- Huiyuan Chen
- Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, Ohio, United States of America
- Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- Jing Li
- Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, Ohio, United States of America
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
|
126
|
Eslami Manoochehri H, Nourani M. Drug-target interaction prediction using semi-bipartite graph model and deep learning. BMC Bioinformatics 2020; 21:248. [PMID: 32631230 PMCID: PMC7336396 DOI: 10.1186/s12859-020-3518-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Identifying drug-target interactions is a key element in drug discovery. In silico prediction of drug-target interactions can speed up the process of identifying unknown interactions between drugs and target proteins. In recent studies, handcrafted features, similarity metrics and machine learning methods have been proposed for predicting drug-target interactions. However, these methods cannot fully learn the underlying relations between drugs and targets. In this paper, we propose a new framework for drug-target interaction prediction that learns latent features from the drug-target interaction network. RESULTS We present a framework to utilize the network topology and identify interacting and non-interacting drug-target pairs. We model the problem as a semi-bipartite graph in which we are able to use drug-drug and protein-protein similarity in a drug-protein network. We then use a graph labeling method for vertex ordering in our graph embedding process. Finally, we employ a deep neural network to learn the complex pattern of interacting pairs from embedded graphs. We show that our approach is able to learn sophisticated drug-target topological features and outperforms other state-of-the-art approaches. CONCLUSIONS The proposed learning model on the semi-bipartite graph can integrate drug-drug and protein-protein similarities, which are semantically different from drug-protein information in a drug-target interaction network. We show that our model can determine the interaction likelihood for each drug-target pair and outperform other heuristics.
Affiliation(s)
- Hafez Eslami Manoochehri
- Department of Electrical and Computer Engineering, The University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX, 75080, USA
- Mehrdad Nourani
- Department of Electrical and Computer Engineering, The University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX, 75080, USA
|
127
|
Luo J, Long Y. NTSHMDA: Prediction of Human Microbe-Disease Association Based on Random Walk by Integrating Network Topological Similarity. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:1341-1351. [PMID: 30489271 DOI: 10.1109/tcbb.2018.2883041] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Indexed: 05/23/2023]
Abstract
Accumulating clinical evidence has demonstrated that the microbes residing in human bodies play a significantly important role in the formation, development, and progression of various complex human diseases. Identifying latent disease-related microbes could provide insight into human disease mechanisms and promote disease prevention, diagnosis, and treatment. In this paper, we first construct a heterogeneous network by connecting the disease similarity network and the microbe similarity network through the known microbe-disease association network, and then develop a novel computational model to predict human microbe-disease associations based on random walk by integrating network topological similarity (NTSHMDA). Specifically, each microbe-disease association pair is regarded as a distinct relationship level and is thus assigned a different weight based on network topological similarity. The experimental results show that NTSHMDA outperforms some state-of-the-art methods, with average AUCs of 0.9070 and 0.8896 ± 0.0038 in the frameworks of leave-one-out cross-validation and 5-fold cross-validation, respectively. In case studies, 9, 18, 38 and 9, 18, 45 out of the top-10, 20, 50 candidate microbes are verified by recently published literature for asthma and inflammatory bowel disease, respectively. In conclusion, NTSHMDA has the potential to identify novel disease-microbe associations and can also provide valuable information for drug discovery and biological research.
|
128
|
Li D, Lv B, Wang D, Xu D, Qin S, Zhang Y, Chen J, Zhang W, Zhang Z, Xu F. Network Pharmacology and Bioactive Equivalence Assessment Integrated Strategy Driven Q-markers Discovery for Da-Cheng-Qi Decoction to Attenuate Intestinal Obstruction. Phytomedicine 2020; 72:153236. [PMID: 32464544 DOI: 10.1016/j.phymed.2020.153236] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/16/2020] [Revised: 04/14/2020] [Accepted: 04/24/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND Intestinal obstruction (IO) is a kind of acute abdomen with high morbidity and mortality. Patients suffer from poor quality of life and tremendous financial pressure. Da-Cheng-Qi decoction (DCQD), a classical purgation prescription, has clinically been proven to be an effective treatment for IO. PURPOSE Network pharmacology integrated with bioactive equivalence assessment was used to discover the quality markers (Q-markers) of DCQD against IO. METHODS As hardly any IO targets are recorded in databases, IO targets were collected by searching for those of alternative diseases with pathological symptoms similar to IO. To improve the reliability of the obtained targets, IO metabolomics data were introduced. An active compounds combination (ACC) was identified as potential Q-markers via component-target network analysis and function query from the identified components corresponding to the common targets. Bioequivalence between ACC and DCQD was assessed from the aspects of intestinal motility (somatostatin secretion), inflammation (IL-6 secretion) and injury (wound healing assay) in vitro and was further validated in an ileus rat model. PPI network analysis of core targets, followed by gene pedigree classification and experimental validation, confirmed the potential intervention pathway. RESULTS A combination of 11 ingredients, including emodin, physcion, aloe-emodin, rhein, chrysophanol, gallic acid, magnolol, honokiol, naringenin, tangeretin, and nobiletin, was finally confirmed to be bioequivalent to DCQD to some extent and could serve as Q-markers for DCQD to attenuate IO. PI3K/AKT was verified as a possible pathway through which DCQD exerts its effect against IO. CONCLUSION For diseases with few recorded targets, searching for those of alternative diseases with similar pathological symptoms could be a feasible and effective approach. The proposed paradigm integrating network pharmacology with bioactive equivalence evaluation is efficient for discovering Q-markers of herbal formulae.
Affiliation(s)
- Danting Li
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
- Bo Lv
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
- Di Wang
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
- Doudou Xu
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
- Siyuan Qin
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
- Ying Zhang
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
- Jie Chen
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
- Wei Zhang
- State Key Laboratory for Quality Research in Chinese Medicines, Macau University of Science and Technology, Taipa, Macau, China
- Zunjian Zhang
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
- Fengguo Xu
- Key Laboratory of Drug Quality Control and Pharmacovigilance (Ministry of Education), State Key Laboratory of Natural Medicine, China Pharmaceutical University, Nanjing 210009, P. R. China
|
129
|
Wang L, You ZH, Li LP, Yan X, Zhang W, Song KJ, Song CD. Identification of potential drug-targets by combining evolutionary information extracted from frequency profiles and molecular topological structures. Chem Biol Drug Des 2020; 96:758-767. [PMID: 31393672 DOI: 10.1111/cbdd.13599] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/14/2019] [Revised: 07/29/2019] [Accepted: 08/03/2019] [Indexed: 01/09/2023]
Abstract
Identifying interactions among drug compounds and target proteins is the basis of drug research and plays a crucial role in drug discovery. However, determining drug-target interactions (DTIs) and potential protein-compound interactions by biological experiment-based method alone is a very complicated, expensive, and time-consuming process. Hence, there is an intense motivation to design in silico prediction methods to overcome these obstacles. In this work, we designed a novel in silico strategy to predict proteome-scale DTIs based on the assumption that DTI pairs can be expressed through the evolutionary information derived from frequency profiles and drugs' structural properties. To achieve this, drug molecules are encoded into the substructure fingerprints to represent certain fragments; target proteins are first converted into position-specific scoring matrix (PSSM) and then encoded as 2-dimensional principal component analysis (2DPCA) descriptors. In the prediction phase, the feature weighted rotation forest (RF) classifier is used to estimate whether drug and target interact with each other on four benchmark datasets, including Enzymes, Ion Channels, GPCRs, and Nuclear Receptors. The prediction accuracy of cross-validation on the four datasets is 95.40%, 88.82%, 85.67%, and 82.22%, respectively. In order to have a clearer assessment of the proposed approach, we compared it with the discrete cosine transform (DCT) descriptor model, support vector machine (SVM) classifier model, and existing excellent approaches, including DBSI, NetCBP, KBMF2K, SIMCOMP, and RFDT. The excellent results of the experiment indicated that the proposed approach can effectively improve the DTI prediction accuracy and can be used as a practical tool for the research and design of new drugs.
Affiliation(s)
- Lei Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, China; Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Science, Urumqi, China
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Science, Urumqi, China
| | - Li-Ping Li
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Science, Urumqi, China
| | - Xin Yan
- School of Foreign Languages, Zaozhuang University, Zaozhuang, China
| | - Wei Zhang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, China
| | - Ke-Jian Song
- School of information engineering, JiangXi University of Science and Technology, Ganzhou, China
| | - Chuan-Dong Song
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang, China
| |
|
130
|
Cui W, Aouidate A, Wang S, Yu Q, Li Y, Yuan S. Discovering Anti-Cancer Drugs via Computational Methods. Front Pharmacol 2020; 11:733. [PMID: 32508653 PMCID: PMC7251168 DOI: 10.3389/fphar.2020.00733] [Citation(s) in RCA: 107] [Impact Index Per Article: 26.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2020] [Accepted: 05/01/2020] [Indexed: 12/24/2022] Open
Abstract
New drug discovery has been acknowledged as a complicated, expensive, time-consuming, and challenging project. It has been estimated that around 12 years and 2.7 billion USD, on average, are required to discover a new drug via the traditional drug development pipeline. How to reduce the research cost and speed up the development process of new drug discovery has become a challenging, urgent question for the pharmaceutical industry. Computer-aided drug discovery (CADD) has emerged as a powerful and promising technology for faster, cheaper, and more effective drug design. Recently, the rapid growth of computational tools for drug discovery, including anticancer therapies, has had a significant impact on anticancer drug design, and has also provided fruitful insights into the area of cancer therapy. In this work, we discussed the different subareas of the computer-aided drug discovery process with a focus on anticancer drugs.
Affiliation(s)
- Wenqiang Cui
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- College of Veterinary Medicine, Northeast Agricultural University, Harbin, China
| | - Adnane Aouidate
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Shouguo Wang
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Qiuliyang Yu
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Yanhua Li
- College of Veterinary Medicine, Northeast Agricultural University, Harbin, China
| | - Shuguang Yuan
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| |
|
131
|
Che J, Chen L, Guo ZH, Wang S, Aorigele. Drug Target Group Prediction with Multiple Drug Networks. Comb Chem High Throughput Screen 2020; 23:274-284. [DOI: 10.2174/1386207322666190702103927] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2018] [Revised: 03/11/2019] [Accepted: 04/15/2019] [Indexed: 02/07/2023]
Abstract
Background:
Identification of drug-target interaction is essential in drug discovery. It is beneficial to predict unexpected therapeutic or adverse side effects of drugs. To date, several computational methods have been proposed to predict drug-target interactions because they are prompt and low-cost compared with traditional wet experiments.
Methods:
In this study, we investigated this problem in a different way. According to KEGG, drugs were classified into several groups based on their target proteins. A multi-label classification model was presented to assign drugs into correct target groups. To make full use of the known drug properties, five networks were constructed, each of which represented drug associations in one property. A powerful network embedding method, Mashup, was adopted to extract drug features from the above-mentioned networks, based on which several machine learning algorithms, including the RAndom k-labELsets (RAKEL) algorithm, the Label Powerset (LP) algorithm and Support Vector Machine (SVM), were used to build the classification model.
Results and Conclusion:
Tenfold cross-validation yielded an accuracy of 0.839, an exact match of 0.816 and a hamming loss of 0.037, indicating good performance of the model. The contribution of each network was also analyzed. Furthermore, the model with multiple networks was found to be superior to the one with a single network and to the classic model, indicating the superiority of the proposed model.
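As an illustration of two of the multi-label metrics reported above, the following minimal sketch computes exact match and hamming loss on toy data. The function names and the toy label matrices are assumptions made for this example and are not taken from the paper; the abstract does not define its multi-label "accuracy", so that metric is not reproduced here.

```python
from typing import Sequence


def exact_match(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of samples whose whole predicted label vector equals the true one."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return hits / len(y_true)


def hamming_loss(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of individual (sample, label) entries that are predicted wrongly."""
    wrong = sum(a != b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    total = sum(len(t) for t in y_true)
    return wrong / total


# Toy data (hypothetical): 4 drugs, 3 target groups, 1 = drug belongs to the group.
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 1, 1]]

print(exact_match(y_true, y_pred))   # 2 of 4 rows match completely
print(hamming_loss(y_true, y_pred))  # 2 of 12 entries are wrong
```

Exact match is the stricter of the two: a drug counts as correct only if every one of its group labels is right, whereas hamming loss penalises each wrong label separately.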
Affiliation(s)
- Jingang Che
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Zi-Han Guo
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Shuaiqun Wang
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Aorigele
- Faculty of Engineering, University of Toyama, Toyama, Japan
| |
|
132
|
Wang CC, Zhao Y, Chen X. Drug-pathway association prediction: from experimental results to computational models. Brief Bioinform 2020; 22:5835554. [PMID: 32393976 DOI: 10.1093/bib/bbaa061] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2020] [Revised: 03/16/2020] [Accepted: 03/26/2020] [Indexed: 12/14/2022] Open
Abstract
Effective drugs are urgently needed to overcome complex human diseases. However, the research and development of a novel drug takes a long time and costs a great deal of money. Traditional drug discovery follows the rule of one drug-one target, while some studies have demonstrated that drugs generally act by affecting a related pathway rather than a single target. Thus, a new strategy of drug discovery, namely pathway-based drug discovery, has been proposed. Identifying associations between drugs and pathways plays a key role in the development of pathway-based drug discovery. Revealing drug-pathway associations by experimental methods is time-consuming and costly. Therefore, some computational models were established to predict potential drug-pathway associations. In this review, we first introduced the background of drugs and the concept of drug-pathway associations. Then, some publicly accessible databases and web servers about drug-pathway associations were listed. Next, we summarized some state-of-the-art computational methods from recent years for inferring drug-pathway associations and divided these methods into three classes, namely Bayesian sparse factor-based, matrix decomposition-based and other machine learning methods. In addition, we introduced several evaluation strategies to estimate the predictive performance of various computational models. In the end, we discussed the advantages and limitations of existing computational methods and provided some suggestions about future directions for data collection and the development of computational models.
|
133
|
Jin S, Zeng X, Xia F, Huang W, Liu X. Application of deep learning methods in biological networks. Brief Bioinform 2020; 22:1902-1917. [PMID: 32363401 DOI: 10.1093/bib/bbaa043] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/25/2019] [Revised: 02/19/2020] [Accepted: 03/05/2020] [Indexed: 01/07/2023] Open
Abstract
The increase in biological data and the formation of various biomolecule interaction databases enable us to obtain diverse biological networks. These biological networks provide a wealth of raw materials for further understanding of biological systems, the discovery of complex diseases and the search for therapeutic drugs. However, the increase in data also increases the difficulty of biological network analysis. Therefore, algorithms that can handle large, heterogeneous and complex data are needed to better analyze the data of these network structures and mine their useful information. Deep learning is a branch of machine learning that extracts more abstract features from a larger set of training data. Through the establishment of an artificial neural network with a hierarchical structure, deep learning can extract and screen the input information layer by layer and has representation learning ability. Improved deep learning algorithms can be used to process complex and heterogeneous graph data structures and are increasingly being applied to the mining of network data. In this paper, we first introduce the deep learning models used for network data. Afterwards, we summarize the application of deep learning to biological networks. Finally, we discuss the future development prospects of this field.
|
134
|
Kaushik AC, Mehmood A, Dai X, Wei DQ. A comparative chemogenic analysis for predicting Drug-Target Pair via Machine Learning Approaches. Sci Rep 2020; 10:6870. [PMID: 32322011 PMCID: PMC7176722 DOI: 10.1038/s41598-020-63842-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2020] [Accepted: 04/04/2020] [Indexed: 12/26/2022] Open
Abstract
Computational prediction of DTIs has become an indispensable step in the drug discovery process. It narrows the search space for interactions by proposing candidate interactions for validation through wet-lab experiments, which are known to be expensive and time-consuming. Chemogenomics is an emerging research area focused on the systematic examination of the biological impact of a broad series of low-molecular-weight ligands on a broad array of macromolecular targets. Additionally, the complexity of the algorithms is increasing over time, which may soon result in the entry of big data technologies like Spark into this field. In the presented work, we intend to offer a comprehensive overview and realistic evaluation of computational Drug Target Interaction prediction approaches, to serve as a guide and reference for researchers working in a similar direction. Specifically, we first explain the data utilized in computational Drug Target Interaction prediction. We then categorize and explain the best and most modern techniques for the prediction of DTIs. Then, a realistic assessment is carried out to show the prediction performance of several representative approaches in various situations. Finally, we underline possible opportunities for further improvement of Drug Target Interaction prediction performance, as well as related research objectives.
Affiliation(s)
- Aman Chandra Kaushik
- Wuxi School of Medicine, Jiangnan University, Wuxi, China.
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China.
| | - Aamir Mehmood
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
| | - Xiaofeng Dai
- Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China.
| |
|
135
|
Hao M, Bryant SH, Wang Y. Open-source chemogenomic data-driven algorithms for predicting drug-target interactions. Brief Bioinform 2020; 20:1465-1474. [PMID: 29420684 DOI: 10.1093/bib/bby010] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2017] [Revised: 01/18/2018] [Indexed: 12/25/2022] Open
Abstract
While novel technologies such as high-throughput screening have advanced together with significant investment by pharmaceutical companies during the past decades, the success rate for drug development has not yet improved, prompting researchers to look for new strategies of drug discovery. Drug repositioning is a potential approach to solve this dilemma. However, experimental identification and validation of potential drug targets encoded by the human genome is both costly and time-consuming. Therefore, effective computational approaches have been proposed to facilitate drug repositioning, and these have proved to be successful in drug discovery. The availability of openly accessible data from basic chemical biology research and the success of human genome sequencing are crucial to developing effective in silico drug repositioning methods that allow the identification of potential targets for existing drugs. In this work, we review several chemogenomic data-driven computational algorithms with publicly accessible source code for predicting drug-target interactions (DTIs). We organize these algorithms by model properties and model evolutionary relationships. We re-implemented five representative algorithms in the R programming language, and compared these algorithms by means of mean percentile ranking, a new recall-based evaluation metric in the DTI prediction research field. We anticipate that this review will be objective and helpful to researchers who would like to further improve existing algorithms or need to choose appropriate algorithms to infer potential DTIs in their projects. The source codes for DTI predictions are available at: https://github.com/minghao2016/chemogenomicAlg4DTIpred.
|
136
|
Buza K, Peška L, Koller J. Modified linear regression predicts drug-target interactions accurately. PLoS One 2020; 15:e0230726. [PMID: 32251481 PMCID: PMC7135267 DOI: 10.1371/journal.pone.0230726] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2019] [Accepted: 03/06/2020] [Indexed: 12/31/2022] Open
Abstract
State-of-the-art approaches for the prediction of drug-target interactions (DTI) are based on various techniques, such as matrix factorisation, restricted Boltzmann machines, network-based inference and bipartite local models (BLM). In this paper, we propose the framework of Asymmetric Loss Models (ALM) which is more consistent with the underlying chemical reality compared with conventional regression techniques. Furthermore, we propose to use an asymmetric loss model with BLM to predict drug-target interactions accurately. We evaluate our approach on publicly available real-world drug-target interaction datasets. The results show that our approach outperforms state-of-the-art DTI techniques, including recent versions of BLM.
Affiliation(s)
- Krisztian Buza
- Faculty of Informatics, ELTE – Eötvös Loránd University, Budapest, Hungary
- Center for the Study of Complexity, Babes-Bolyai University, Cluj Napoca, Romania
| | - Ladislav Peška
- Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
| | - Júlia Koller
- Institute of Genomic Medicine and Rare Disorders, Semmelweis University, Budapest, Hungary
| |
|
137
|
Wang YB, You ZH, Yang S, Yi HC, Chen ZH, Zheng K. A deep learning-based method for drug-target interaction prediction based on long short-term memory neural network. BMC Med Inform Decis Mak 2020; 20:49. [PMID: 32183788 PMCID: PMC7079345 DOI: 10.1186/s12911-020-1052-0] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
Abstract
Background The key to modern drug discovery is to find, identify and prepare drug molecular targets. However, owing to limitations in throughput, precision and cost, traditional experimental methods are difficult to apply widely to infer potential Drug-Target Interactions (DTIs). Therefore, there is an urgent need to develop effective computational methods to validate the interactions between drugs and targets. Methods We developed a deep learning-based model for DTI prediction. The evolutionary features of proteins are extracted via Position Specific Scoring Matrix (PSSM) and Legendre Moment (LM) and combined with drug molecular substructure fingerprints to form feature vectors of drug-target pairs. Then we utilized Sparse Principal Component Analysis (SPCA) to compress the features of drugs and proteins into a uniform vector space. Lastly, a deep long short-term memory (DeepLSTM) network was constructed to carry out prediction. Results The experimental results show a significant improvement in DTI prediction performance, with AUCs of 0.9951, 0.9705, 0.9951 and 0.9206, respectively, on four important classes of drug-target datasets. Further experiments preliminarily show that the proposed characterization scheme has a great advantage in feature expression and recognition. We have also shown that the proposed method can work well with small datasets. Conclusion The results demonstrate that the proposed approach has a great advantage over state-of-the-art drug-target predictors. To the best of our knowledge, this study is the first to test the potential of a deep learning method with memory and Turing completeness in DTI prediction.
Affiliation(s)
- Yan-Bin Wang
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China; Department of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China.
| | - Shan Yang
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
| | - Hai-Cheng Yi
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China; Department of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhan-Heng Chen
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China; Department of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Kai Zheng
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
| |
|
138
|
Bagherian M, Kim RB, Jiang C, Sartor MA, Derksen H, Najarian K. Coupled matrix-matrix and coupled tensor-matrix completion methods for predicting drug-target interactions. Brief Bioinform 2020; 22:2161-2171. [PMID: 32186716 PMCID: PMC7986629 DOI: 10.1093/bib/bbaa025] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2019] [Revised: 01/27/2020] [Accepted: 02/17/2020] [Indexed: 12/20/2022] Open
Abstract
Predicting the interactions between drugs and targets plays an important role in the process of new drug discovery and drug repurposing (also known as drug repositioning). There is a need to develop novel and efficient prediction approaches in order to avoid the costly and laborious process of determining drug–target interactions (DTIs) based on experiments alone. These computational prediction approaches should be capable of identifying the potential DTIs in a timely manner. Matrix factorization methods have been proven to be the most reliable group of methods. Here, we first propose a matrix factorization-based method termed ‘Coupled Matrix–Matrix Completion’ (CMMC). Next, in order to utilize more comprehensive information provided in different databases and incorporate multiple types of scores for drug–drug similarities and target–target relationship, we then extend CMMC to ‘Coupled Tensor–Matrix Completion’ (CTMC) by considering drug–drug and target–target similarity/interaction tensors. Results: Evaluation on two benchmark datasets, DrugBank and TTD, shows that CTMC outperforms the matrix-factorization-based methods: GRMF, $L_{2,1}$-GRMF, NRLMF and NRLMF$\beta$. Based on the evaluation, CMMC and CTMC outperform the above three methods in terms of area under the curve, F1 score, sensitivity and specificity in a considerably shorter run time.
Affiliation(s)
- Maryam Bagherian
- Corresponding author: Maryam Bagherian, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48109, USA.
| |
|
139
|
Rayhan F, Ahmed S, Mousavian Z, Farid DM, Shatabda S. FRnet-DTI: Deep convolutional neural network for drug-target interaction prediction. Heliyon 2020; 6:e03444. [PMID: 32154410 PMCID: PMC7052404 DOI: 10.1016/j.heliyon.2020.e03444] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2018] [Revised: 06/16/2019] [Accepted: 02/14/2020] [Indexed: 01/09/2023] Open
Abstract
The task of drug-target interaction prediction holds significant importance in pharmacology and therapeutic drug design. In this paper, we present FRnet-DTI, an auto-encoder based feature manipulation and a convolutional neural network based classifier for drug target interaction prediction. Two convolutional neural networks are proposed: FRnet-Encode and FRnet-Predict. Here, one model is used for feature manipulation and the other one for classification. Using the first method, FRnet-Encode, we generate 4096 features for each of the instances in each of the datasets and use the second method, FRnet-Predict, to identify interaction probability employing those features. We have tested our method on four gold standard datasets extensively used by other researchers. Experimental results show that our method significantly improves over the state-of-the-art method on three out of four drug-target interaction gold standard datasets on both the area under the Receiver Operating Characteristic curve (auROC) and the area under the Precision Recall curve (auPR) metrics. We also introduce twenty new potential drug-target pairs for interaction based on high prediction scores. The source codes and implementation details of our methods are available from https://github.com/farshidrayhanuiu/FRnet-DTI/ and also readily available to use as a web application from http://farshidrayhan.pythonanywhere.com/FRnet-DTI/.
Affiliation(s)
- Farshid Rayhan
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
| | - Sajid Ahmed
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
| | - Zaynab Mousavian
- School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran
| | - Dewan Md Farid
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
| | - Swakkhar Shatabda
- Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka-1212, Bangladesh
| |
|
140
|
Luo H, Li M, Yang M, Wu FX, Li Y, Wang J. Biomedical data and computational models for drug repositioning: a comprehensive review. Brief Bioinform 2020; 22:1604-1619. [PMID: 32043521 DOI: 10.1093/bib/bbz176] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2019] [Revised: 12/07/2019] [Accepted: 12/26/2019] [Indexed: 12/16/2022] Open
Abstract
Drug repositioning can drastically decrease the cost and duration taken by traditional drug research and development while avoiding the occurrence of unforeseen adverse events. With the rapid advancement of high-throughput technologies and the explosion of various biological data and medical data, computational drug repositioning methods have become appealing and powerful techniques to systematically identify potential drug-target interactions and drug-disease interactions. In this review, we first summarize the available biomedical data and public databases related to drugs, diseases and targets. Then, we discuss existing drug repositioning approaches and group them based on their underlying computational models consisting of classical machine learning, network propagation, matrix factorization and completion, and deep learning based models. We also comprehensively analyze common standard data sets and evaluation metrics used in drug repositioning, and give a brief comparison of various prediction methods on the gold standard data sets. Finally, we conclude our review with a brief discussion on challenges in computational drug repositioning, which includes the problem of reducing the noise and incompleteness of biomedical data, the ensemble of various computational drug repositioning methods, the importance of designing reliable negative sample selection methods, new techniques dealing with the data sparseness problem, the construction of large-scale and comprehensive benchmark data sets and the analysis and explanation of the underlying mechanisms of predicted interactions.
Affiliation(s)
- Huimin Luo
- School of Computer Science and Engineering at Central South University
| | - Min Li
- School of Computer Science and Engineering at Central South University
| | - Mengyun Yang
- School of Computer Science and Engineering at Central South University
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Science at University of Saskatchewan, Saskatoon, Canada
| | - Yaohang Li
- Department of Computer Science at Old Dominion University, Norfolk, USA
| | - Jianxin Wang
- School of Computer Science and Engineering at Central South University
| |
|
141
|
Abstract
Background:
Identifying Drug-Target Interactions (DTIs) is a major challenge for current drug discovery and drug repositioning. Compared to traditional experimental approaches, in silico methods are fast and inexpensive. With the increase in open-access experimental data, numerous computational methods have been applied to predict DTIs.
Methods:
In this study, we propose an end-to-end learning model of Factorization Machine and Deep Neural Network (FM-DNN), which emphasizes both low-order (first or second order) and high-order (higher than second order) feature interactions without any feature engineering other than raw features. This approach combines the power of FM and DNN learning for feature learning in a new neural network architecture.
Results:
The experimental DTI basic features include drug characteristics (609) and target characteristics (1819), plus drug ID and target ID, 2430 in total. We compare 8 models, such as SVM, GBDT and WIDE-DEEP; the FM-DNN model obtains the best results, with an AUC of 0.8866 and an AUPR of 0.8281.
Conclusion:
Feature engineering is a job that requires expert knowledge, and it is often difficult and time-consuming to achieve good results. FM-DNN can automatically learn a lower-order expression by FM and a high-order expression by DNN. The FM-DNN model has outstanding advantages over other commonly used models.
Affiliation(s)
- Jihong Wang
- School of Data and Computer Science, Sun Yat-Sen University, No.132 Waihuan East Road, 510000 Guangzhou, China
| | - Hao Wang
- School of Data and Computer Science, Sun Yat-Sen University, No.132 Waihuan East Road, 510000 Guangzhou, China
| | - Xiaodan Wang
- School of Pharmaceutical Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, No. 9- 13 Wuguishan Avenue of Life Street, 528458, Zhongshan, China
| | - Huiyou Chang
- School of Data and Computer Science, Sun Yat-Sen University, No.132 Waihuan East Road, 510000 Guangzhou, China
| |
|
142
|
Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2020; 22:247-269. [PMID: 31950972 PMCID: PMC7820849 DOI: 10.1093/bib/bbz157] [Citation(s) in RCA: 161] [Impact Index Per Article: 40.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 12/12/2022] Open
Abstract
The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. There is a need to develop novel and efficient prediction approaches in order to avoid determining drug–target interactions (DTIs) through costly, laborious and not-always-deterministic experiments alone. These approaches should be capable of identifying the potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction followed by a comprehensive catalog consisting of machine learning methods and databases, which have been proposed and utilized to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in prediction of DTI using machine learning approaches are highlighted and we conclude by shedding some light on important future research directions.
Affiliation(s)
- Maryam Bagherian
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Elyas Sabeti
- Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Kai Wang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Maureen A Sartor
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
| | | | - Kayvan Najarian
- Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, MI, 48109, USA
| |
|
143
|
Mongia A, Majumdar A. Drug-target interaction prediction using Multi Graph Regularized Nuclear Norm Minimization. PLoS One 2020; 15:e0226484. [PMID: 31945078 PMCID: PMC6964976 DOI: 10.1371/journal.pone.0226484] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/24/2019] [Accepted: 11/27/2019] [Indexed: 01/09/2023]
Abstract
The identification of potential interactions between drugs and target proteins is crucial in pharmaceutical sciences. The experimental validation of interactions in genomic drug discovery is laborious and expensive; hence, there is a need for efficient and accurate in-silico techniques which can predict potential drug-target interactions to narrow down the search space for experimental verification. In this work, we propose a new framework, namely, Multi-Graph Regularized Nuclear Norm Minimization, which predicts the interactions between drugs and target proteins from three inputs: known drug-target interaction network, similarities over drugs and those over targets. The proposed method focuses on finding a low-rank interaction matrix that is structured by the proximities of drugs and targets encoded by graphs. Previous works on Drug Target Interaction (DTI) prediction have shown that incorporating drug and target similarities helps in learning the data manifold better by preserving the local geometries of the original data. But, there is no clear consensus on which kind and what combination of similarities would best assist the prediction task. Hence, we propose to use various multiple drug-drug similarities and target-target similarities as multiple graph Laplacian (over drugs/targets) regularization terms to capture the proximities exhaustively. Extensive cross-validation experiments on four benchmark datasets using standard evaluation metrics (AUPR and AUC) show that the proposed algorithm improves the predictive performance and outperforms recent state-of-the-art computational methods by a large margin. Software is publicly available at https://github.com/aanchalMongia/MGRNNMforDTI.
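The graph Laplacian regularization term mentioned in this abstract can be illustrated with a small sketch. This is not the authors' solver (their code is at the URL above), and it leaves out the nuclear-norm (low-rank) part entirely. It only shows, under assumed toy data, how a similarity graph penalizes a score matrix whose rows differ between similar drugs. The function names and the example matrices are hypothetical.

```python
def graph_laplacian(sim):
    """Return L = D - S for a symmetric similarity matrix S (self-similarity ignored)."""
    n = len(sim)
    lap = [[-sim[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        lap[i][i] = sum(sim[i][j] for j in range(n) if j != i)
    return lap


def laplacian_penalty(x, sim):
    """Compute Tr(X^T L X), where row i of X holds the scores of graph node i."""
    lap = graph_laplacian(sim)
    n, m = len(x), len(x[0])
    return sum(
        x[i][k] * lap[i][j] * x[j][k]
        for k in range(m) for i in range(n) for j in range(n)
    )


def pairwise_penalty(x, sim):
    """Equivalent pairwise form: 0.5 * sum_ij S_ij * ||x_i - x_j||^2."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(n):
            dist2 = sum((a - b) ** 2 for a, b in zip(x[i], x[j]))
            total += 0.5 * sim[i][j] * dist2
    return total


def multi_graph_penalty(x, sims, weights):
    """Weighted sum of Laplacian penalties over several similarity graphs."""
    return sum(w * laplacian_penalty(x, s) for w, s in zip(weights, sims))


# Hypothetical toy data: 3 drugs, 2 targets. Drugs 0 and 1 are highly similar.
drug_sim = [[1.0, 0.9, 0.1],
            [0.9, 1.0, 0.2],
            [0.1, 0.2, 1.0]]
scores = [[1.0, 0.0],
          [1.0, 0.0],
          [0.0, 1.0]]

penalty = laplacian_penalty(scores, drug_sim)
```

The penalty is small when similar drugs receive similar interaction profiles, which is the sense in which such a term "preserves local geometry". Summing the term over several graphs, as in `multi_graph_penalty`, is one simple way to combine multiple similarity measures; the weights here are assumptions.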
Affiliation(s)
- Aanchal Mongia
- Dept. of Computer Science and Engineering, IIIT-Delhi, Delhi, India
| | - Angshul Majumdar
- Dept. of Electronics and Communications Engineering, IIIT-Delhi, Delhi, India
| |
|
144
|
Zhang YF, Wang X, Kaushik AC, Chu Y, Shan X, Zhao MZ, Xu Q, Wei DQ. SPVec: A Word2vec-Inspired Feature Representation Method for Drug-Target Interaction Prediction. Front Chem 2020; 7:895. [PMID: 31998687 PMCID: PMC6967417 DOI: 10.3389/fchem.2019.00895] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 10/11/2019] [Accepted: 12/12/2019] [Indexed: 11/13/2022]
Abstract
Drug discovery is an academic and commercial process of global importance. Accurate identification of drug-target interactions (DTIs) can significantly facilitate the drug discovery process. Compared to costly, labor-intensive and time-consuming experimental methods, machine learning (ML) plays an increasingly important role in effective, efficient and high-throughput identification of DTIs. However, upstream feature extraction methods require tremendous human resources and expert insight, which limits the application of ML approaches. Inspired by unsupervised representation learning methods like Word2vec, we here propose SPVec, a novel way to automatically represent raw data such as SMILES strings and protein sequences as continuous, information-rich and lower-dimensional vectors, so as to avoid the sparseness and bit collisions of cumbersome manually extracted features. Visualization of SPVec illustrated that similar compounds or proteins occupy similar vector space, which indicated that SPVec not only encodes compound substructures or protein sequences efficiently, but also implicitly reveals some important biophysical and biochemical patterns. Compared with manually designed features like MACCS fingerprints and amino acid composition (AAC), SPVec showed better performance with several state-of-the-art machine learning classifiers such as Gradient Boosting Decision Tree, Random Forest and Deep Neural Network on BindingDB. The performance and robustness of SPVec were also confirmed on independent test sets obtained from the DrugBank database. Also, based on the whole DrugBank dataset, we predicted the possibilities of all unlabeled DTIs, where two of the top five predicted novel DTIs were supported by external evidence. These results indicated that SPVec can provide an effective and efficient way to discover reliable DTIs, which would be beneficial for drug reprofiling.
Affiliation(s)
- Yu-Fang Zhang
- State Key Laboratory of Microbial Metabolism, and SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Xiangeng Wang
- State Key Laboratory of Microbial Metabolism, and SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Aman Chandra Kaushik
- State Key Laboratory of Microbial Metabolism, and SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China; Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Yanyi Chu
- State Key Laboratory of Microbial Metabolism, and SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Xiaoqi Shan
- State Key Laboratory of Microbial Metabolism, and SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Ming-Zhu Zhao
- Instrumental Analysis Center, Shanghai Jiao Tong University, Shanghai, China
| | - Qin Xu
- State Key Laboratory of Microbial Metabolism, and SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, and SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China; Peng Cheng Laboratory, Shenzhen, China
| |
|
145
|
Abstract
The network-based approach is rapidly emerging as a promising strategy to integrate and interpret different -omics datasets, including metabolomics. The first section of this chapter introduces the current progress and main concepts in multi-omics integration. The second section provides an overview of the public resources available for the creation of biological networks. The third section describes three common application scenarios: subnetwork identification, network-based enrichment analysis, and systems metabolomics. The fourth section introduces the concept of hierarchical community network analysis. The fifth section discusses different tools for network visualization. The chapter ends with a future perspective on multi-omics integration.
Affiliation(s)
- Guangyan Zhou
- Institute of Parasitology, McGill University, Montreal, QC, Canada
| | - Shuzhao Li
- Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA
| | - Jianguo Xia
- Institute of Parasitology, McGill University, Montreal, QC, Canada; Department of Animal Science, McGill University, Montreal, QC, Canada; Department of Microbiology and Immunology, McGill University, Montreal, QC, Canada; Department of Human Genetics, McGill University, Montreal, QC, Canada
| |
|
146
|
Zong N, Wong RSN, Yu Y, Wen A, Huang M, Li N. Drug-target prediction utilizing heterogeneous bio-linked network embeddings. Brief Bioinform 2019; 22:568-580. [PMID: 31885036 DOI: 10.1093/bib/bbz147] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/27/2019] [Revised: 10/11/2019] [Accepted: 10/29/2019] [Indexed: 11/12/2022]
Abstract
To enable modularization for network-based prediction, we conducted a review of known methods conducting the various subtasks corresponding to the creation of a drug-target prediction framework and associated benchmarking to determine the highest-performing approaches. Accordingly, our contributions are as follows: (i) from a network perspective, we benchmarked the association-mining performance of 32 distinct subnetwork permutations, arranged based on a comprehensive heterogeneous biomedical network derived from 12 repositories; (ii) from a methodological perspective, we identified the best prediction strategy based on a review of combinations of the components with off-the-shelf classification, inference methods and graph embedding methods. Our benchmarking strategy consisted of two series of experiments, totaling six distinct tasks from the two perspectives, to determine the best prediction. We demonstrated that the proposed method outperformed the existing network-based methods, as well as how combinatorial networks and methodologies can influence the prediction. In addition, we conducted disease-specific prediction tasks for 20 distinct diseases and showed the reliability of the strategy in predicting 75 novel drug-target associations as shown by a validation utilizing DrugBank 5.1.0. In particular, we revealed a connection of the network topology with the biological explanations for predicting the diseases 'Asthma', 'Hypertension', and 'Dementia'. The results of our benchmarking produced knowledge on a network-based prediction framework with the modularization of the feature selection and association prediction, which can be easily adapted and extended to other feature sources or machine learning algorithms, as well as a baseline to comprehensively evaluate the utility of incorporating varying data sources.
Affiliation(s)
- Nansu Zong
- Department of Health Sciences Research, Mayo Clinic, 200 First St. SW, Rochester, MN 55905, USA
| | - Rachael Sze Nga Wong
- Department of Bioengineering, UC San Diego, 9500 Gilman Drive, San Diego, CA 92093-0412, USA
| | - Yue Yu
- Department of Health Sciences Research, Mayo Clinic, 200 First St. SW, Rochester, MN 55905, USA
| | - Andrew Wen
- Department of Health Sciences Research, Mayo Clinic, 200 First St. SW, Rochester, MN 55905, USA
| | - Ming Huang
- Department of Health Sciences Research, Mayo Clinic, 200 First St. SW, Rochester, MN 55905, USA
| | - Ning Li
- Scripps Research Institute, 10550 North Torrey Pines Road, San Diego, CA, 92037, USA
| |
|
147
|
Chu Y, Kaushik AC, Wang X, Wang W, Zhang Y, Shan X, Salahub DR, Xiong Y, Wei DQ. DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform 2019; 22:451-462. [PMID: 31885041 DOI: 10.1093/bib/bbz152] [Citation(s) in RCA: 101] [Impact Index Per Article: 20.2] [Received: 06/16/2019] [Revised: 11/01/2019] [Accepted: 11/04/2019] [Indexed: 12/18/2022]
Abstract
Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, current DTI prediction approaches suffer from low precision and a high false-positive rate. In this study, we aim to develop a novel DTI prediction method that improves prediction performance, based on a cascade deep forest (CDF) model and named DTI-CDF, with multiple similarity-based features between drugs and similarity-based features between target proteins extracted from the heterogeneous graph, which contains known DTIs. In the experiments, we built five replicates of 10-fold cross-validation under three different experimental settings of data sets, namely, the corresponding DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) are missing from the training sets but present in the test sets. The experimental results demonstrate that our proposed approach DTI-CDF achieves significantly higher performance than traditional ensemble learning-based methods such as random forest and XGBoost, deep neural networks, and state-of-the-art methods such as DDR. Furthermore, there are 1352 newly predicted DTIs which are proved to be correct by the KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | | | - Xiangeng Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Wei Wang
- Mathematical Sciences, Shanghai Jiao Tong University
| | - Yufang Zhang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | | | | | - Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| |
|
148
|
Wang R, Li S, Cheng L, Wong MH, Leung KS. Predicting associations among drugs, targets and diseases by tensor decomposition for drug repositioning. BMC Bioinformatics 2019; 20:628. [PMID: 31839008 PMCID: PMC6912989 DOI: 10.1186/s12859-019-3283-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 02/07/2023]
Abstract
BACKGROUND: Development of new drugs is a time-consuming and costly process, and the cost has continued to increase in recent years. However, the number of drugs approved by the FDA every year per dollar spent on development is declining. Drug repositioning, which aims to find new uses for existing drugs, attracts the attention of pharmaceutical researchers due to its high efficiency. A variety of computational methods for drug repositioning have been proposed based on machine learning approaches, network-based approaches, matrix decomposition approaches, etc. RESULTS: We propose a novel computational method for drug repositioning. We construct and decompose three-dimensional tensors, which consist of the associations among drugs, targets and diseases, to derive latent factors reflecting the functional patterns of the three kinds of entities. The proposed method outperforms several baseline methods in recovering missing associations. Most of the top predictions are validated by literature search and computational docking. Latent factors are used to cluster the drugs, targets and diseases into functional groups. Topological Data Analysis (TDA) is applied to investigate the properties of the clusters. We find that the latent factors are able to capture the functional patterns and underlying molecular mechanisms of drugs, targets and diseases. In addition, we focus on repurposing drugs for cancer and discover not only new therapeutic uses but also adverse effects of the drugs. In the in-depth study of associations among the clusters of drugs, targets and cancer subtypes, we find that there exist strong associations between particular clusters. CONCLUSIONS: The proposed method is able to recover missing associations, discover new predictions and uncover functional clusters of drugs, targets and diseases. The clustering of drugs, targets and diseases, as well as the associations among the clusters, provides a new guiding framework for drug repositioning.
Affiliation(s)
- Ran Wang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Shuai Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Lixin Cheng
- Department of Critical Care Medicine, Shenzhen People’s Hospital, The Second Clinical Medicine College of Ji’nan University, Shenzhen, China
| | - Man Hon Wong
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Kwong Sak Leung
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| |
|
149
|
Li J, Li X, Feng X, Wang B, Zhao B, Wang L. A novel target convergence set based random walk with restart for prediction of potential LncRNA-disease associations. BMC Bioinformatics 2019; 20:626. [PMID: 31795943 PMCID: PMC6889579 DOI: 10.1186/s12859-019-3216-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/08/2019] [Accepted: 11/12/2019] [Indexed: 12/15/2022]
Abstract
BACKGROUND: In recent years, lncRNAs (long non-coding RNAs) have been shown to be closely related to the occurrence and development of many serious diseases that are harmful to human health. However, most lncRNA-disease associations have not yet been found due to the high cost and time complexity of traditional bio-experiments. Hence, it is urgent and necessary to establish efficient and reasonable computational models to predict potential associations between lncRNAs and diseases. RESULTS: In this manuscript, a novel prediction model called TCSRWRLD is proposed to predict potential lncRNA-disease associations based on an improved random walk with restart. In TCSRWRLD, a heterogeneous lncRNA-disease network is first constructed by combining the integrated similarity of lncRNAs and the integrated similarity of diseases. Then, for each lncRNA/disease node in the newly constructed heterogeneous lncRNA-disease network, a node set called the TCS (Target Convergence Set) is established, consisting of the top 100 disease/lncRNA nodes with minimum average network distance to the disease/lncRNA nodes that have known associations with it. Finally, an improved random walk with restart is implemented on the heterogeneous lncRNA-disease network to infer potential lncRNA-disease associations. The major contribution of this manuscript lies in the introduction of the concept of the TCS, based on which the convergence of TCSRWRLD can be effectively accelerated, since the walker can stop once the walking probability vectors at the nodes in the TCS, rather than at all nodes in the whole network, have reached a stable state. Simulation results show that TCSRWRLD can achieve a reliable AUC of 0.8712 in Leave-One-Out Cross Validation (LOOCV), which clearly outperforms previous state-of-the-art results. Moreover, case studies of lung cancer and leukemia demonstrate the satisfactory prediction performance of TCSRWRLD as well.
CONCLUSIONS: Both comparative results and case studies have demonstrated that TCSRWRLD can achieve excellent performance in the prediction of potential lncRNA-disease associations, which also implies that TCSRWRLD may be a good addition to bioinformatics research in the future.
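The random walk with restart underlying this abstract can be sketched as follows. This is a generic, minimal version on a toy graph, not the TCSRWRLD implementation. The optional `watch` argument is a hypothetical simplification of the TCS idea, in which convergence is checked only on a chosen subset of nodes. The function name, parameters, restart probability and graph are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10,
                             max_iter=1000, watch=None):
    """Iterate p <- (1 - restart) * W p + restart * p0 until p stabilises.

    adj   : square adjacency (or similarity) matrix as a list of lists
    seeds : indices of the nodes the walker restarts from
    watch : optional node indices on which convergence is checked
            (hypothetical stand-in for a target convergence set);
            all nodes are checked when it is None
    """
    n = len(adj)
    # Column-normalise so that W[i][j] is the probability of stepping j -> i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]

    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    check = range(n) if watch is None else watch
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        delta = sum(abs(nxt[i] - p[i]) for i in check)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path graph 0 - 1 - 2 - 3 with the walker restarting at node 0.
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
probs = random_walk_with_restart(path, seeds=[0])
```

In this toy case the stationary probabilities decrease with distance from the seed node, which is the property such methods use to rank candidate associations. Restricting the convergence check to a subset via `watch` can end the loop earlier, at the cost of less exact values on the unwatched nodes.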
Affiliation(s)
- Jiechen Li
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People's Republic of China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan, People's Republic of China
| | - Xueyong Li
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People's Republic of China
| | - Xiang Feng
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People's Republic of China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan, People's Republic of China
| | - Bing Wang
- School of Electrical and Information Engineering, Anhui University of Technology, Anhui, 243002, Maanshan, People's Republic of China
| | - Bihai Zhao
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan, People's Republic of China
| | - Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People's Republic of China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan, People's Republic of China
| |
|
150
|
Zhang W, Lin W, Zhang D, Wang S, Shi J, Niu Y. Recent Advances in the Machine Learning-Based Drug-Target Interaction Prediction. Curr Drug Metab 2019; 20:194-202. [PMID: 30129407 DOI: 10.2174/1389200219666180821094047] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 10/16/2017] [Revised: 01/18/2018] [Accepted: 03/19/2018] [Indexed: 12/28/2022]
Abstract
BACKGROUND: The identification of drug-target interactions is a crucial issue in drug discovery. In recent years, researchers have devoted great effort to drug-target interaction prediction, and have developed databases, software and computational methods. RESULTS: In this paper, we review the recent advances in machine learning-based drug-target interaction prediction. First, we briefly introduce the datasets and data, and summarize features for drugs and targets which can be extracted from different data. Since drug-drug similarity and target-target similarity are important for many machine learning prediction models, we introduce how to calculate similarities based on data or features. Different machine learning-based drug-target interaction prediction methods can be proposed by using different features or information. Thus, we summarize, analyze and compare different machine learning-based prediction methods. CONCLUSION: This study provides a guide to the development of computational methods for drug-target interaction prediction.
Affiliation(s)
- Wen Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, China
| | - Weiran Lin
- School of Computer Science, Wuhan University, Wuhan 430072, China
| | - Ding Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, China
| | - Siman Wang
- School of Computer Science, Wuhan University, Wuhan 430072, China
| | - Jingwen Shi
- School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
| | - Yanqing Niu
- School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan 430074, China
| |
|