1
Zhao W, Yu Y, Liu G, Liang Y, Xu D, Feng X, Guan R. MSI-DTI: predicting drug-target interaction based on multi-source information and multi-head self-attention. Brief Bioinform 2024; 25:bbae238. [PMID: 38762789 PMCID: PMC11102638 DOI: 10.1093/bib/bbae238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/09/2024] [Accepted: 05/03/2024] [Indexed: 05/20/2024] Open
Abstract
Identifying drug-target interactions (DTIs) is important in drug discovery and development, and plays a crucial role in areas such as virtual screening, drug repurposing and the identification of potential drug side effects. However, existing methods commonly exploit only a single type of feature from drugs and targets, and suffer from challenges such as high sparsity and cold-start problems. We propose a novel framework called MSI-DTI (Multi-Source Information-based Drug-Target Interaction Prediction) to enhance prediction performance, which obtains feature representations from different views by integrating biometric features and knowledge graph representations from multi-source information. Our approach involves constructing a Drug-Target Knowledge Graph (DTKG), obtaining multiple feature representations from diverse information sources for SMILES sequences and amino acid sequences, incorporating network features from DTKG and performing an effective multi-source information fusion. Subsequently, we employ a multi-head self-attention mechanism coupled with residual connections to capture higher-order interaction information between sparse features while preserving lower-order information. Experimental results on DTKG and two benchmark datasets demonstrate that MSI-DTI outperforms several state-of-the-art DTI prediction methods, yielding more accurate and robust predictions. The source code and datasets are publicly accessible at https://github.com/KEAML-JLU/MSI-DTI.
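Illustrative note (not taken from the cited paper or its repository): the abstract mentions multi-head self-attention combined with a residual connection. The NumPy sketch below shows the generic form of that building block only. The function name, the weight matrices `Wq`, `Wk`, `Wv`, `Wo`, and the toy dimensions are all assumptions made for illustration; the actual MSI-DTI architecture may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """Generic multi-head self-attention with a residual connection.

    X: (n_fields, d_model) matrix of feature embeddings (hypothetical input).
    Returns (X + attention output, attention weights per head).
    """
    n, d = X.shape
    d_head = d // num_heads

    def split(M):
        # (n, d) -> (num_heads, n, d_head)
        return M.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    Qh, Kh, Vh = split(X @ Wq), split(X @ Wk), split(X @ Wv)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    heads = weights @ Vh                                    # (heads, n, d_head)
    concat = heads.transpose(1, 0, 2).reshape(n, d)         # (n, d)
    # The residual connection keeps the original (lower-order) features.
    return X + concat @ Wo, weights

rng = np.random.default_rng(0)
n_fields, d_model, num_heads = 4, 8, 2
X = rng.normal(size=(n_fields, d_model))
Wq, Wk, Wv, Wo = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(4))
out, attn = multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads)
print(out.shape, attn.shape)
```

Because the attention output is added to `X`, setting `Wo` to zeros returns the input unchanged, which is one way to see how the residual path preserves the original features.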
Affiliation(s)
- Wenchuan Zhao
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
- Yufeng Yu
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
- Guosheng Liu
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
- Yanchun Liang
  - Zhuhai Laboratory of the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Zhuhai College of Science and Technology, Zhuhai 519041, China
- Dong Xu
  - Department of Computer Science, Informatics Institute, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Xiaoyue Feng
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
- Renchu Guan
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
2
B S N, P K KN, Akey KS, Sankaran S, Raman RK, Natarajan J, Selvaraj J. Vitamin D analog calcitriol for breast cancer therapy; an integrated drug discovery approach. J Biomol Struct Dyn 2023; 41:11017-11043. [PMID: 37054526 DOI: 10.1080/07391102.2023.2199866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 12/11/2022] [Indexed: 04/15/2023]
Abstract
As breast cancer remains a leading cause of cancer death globally, it is essential to develop an affordable breast cancer therapy for underdeveloped countries. Drug repurposing offers the potential to address gaps in breast cancer treatment. Molecular networking studies were performed for the drug repurposing approach using heterogeneous data. PPI networks were built to select the target genes from the EGFR overexpression signaling pathway and its associated family members. The selected genes EGFR, ErbB2, ErbB4 and ErbB3 were allowed to interact with 2637 drugs, leading to the construction of PDI networks of 78, 61, 15 and 19 drugs, respectively. As drugs approved for treating non-cancer-related diseases or disorders are clinically safe, effective, and affordable, these drugs were given considerable attention. Calcitriol showed more significant binding affinities with all four receptors than the standard, neratinib. The RMSD, RMSF, and H-bond analysis of protein-ligand complexes from molecular dynamics simulation (100 ns) confirmed the stable binding of calcitriol with ErbB2 and EGFR receptors. In addition, MMGBSA and MMPBSA also affirmed the docking results. These in-silico results were validated with in-vitro cytotoxicity studies in SK-BR-3 and Vero cells. The IC50 value of calcitriol (43.07 mg/ml) was found to be lower than that of neratinib (61.50 mg/ml) in SK-BR-3 cells. In Vero cells, the IC50 value of calcitriol (431.05 mg/ml) was higher than that of neratinib (404.95 mg/ml). This demonstrates that calcitriol suggestively downregulated SK-BR-3 cell viability in a dose-dependent manner. These results indicate that calcitriol showed better cytotoxicity than neratinib and decreased the proliferation rate of breast cancer cells more than neratinib did. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Nagaraj B S
  - Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
- Krishnan Namboori P K
  - Amrita Molecular Modeling and Synthesis (AMMAS) Research lab, Amrita Vishwavidyapeetham, Coimbatore, Tamilnadu, India
- Krishna Swaroop Akey
  - Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
- Sathianarayanan Sankaran
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Karpagam Academy of Higher Education, Coimbatore, Tamilnadu, India
- Rajesh Kumar Raman
  - Department of Pharmaceutical Biotechnology, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
- Jawahar Natarajan
  - Department of Pharmaceutics, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
- Jubie Selvaraj
  - Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education and Research, Ooty, Tamilnadu, India
3
Su Y, Hu Z, Wang F, Bin Y, Zheng C, Li H, Chen H, Zeng X. AMGDTI: drug-target interaction prediction based on adaptive meta-graph learning in heterogeneous network. Brief Bioinform 2023; 25:bbad474. [PMID: 38145949 PMCID: PMC10749791 DOI: 10.1093/bib/bbad474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 11/10/2023] [Accepted: 11/30/2023] [Indexed: 12/27/2023] Open
Abstract
Prediction of drug-target interactions (DTIs) is essential in the medical field, since it benefits the identification of molecular structures potentially interacting with drugs and facilitates the discovery and repositioning of drugs. Recently, network representation learning has attracted much attention as a way to learn rich information from heterogeneous data. Although network representation learning algorithms have achieved success in predicting DTIs, their reliance on a few manually designed meta-graphs limits the capability of extracting complex semantic information. To address the problem, we introduce an adaptive meta-graph-based method, termed AMGDTI, for DTI prediction. In the proposed AMGDTI, the semantic information is automatically aggregated from a heterogeneous network by training an adaptive meta-graph, thereby achieving efficient information integration without requiring domain knowledge. The effectiveness of the proposed AMGDTI is verified on two benchmark datasets. Experimental results demonstrate that the AMGDTI method overall outperforms eight state-of-the-art methods in predicting DTIs and achieves accurate identification of novel DTIs. It is also verified that the adaptive meta-graph exhibits flexibility and effectively captures complex fine-grained semantic information, enabling the learning of intricate heterogeneous network topology and the inference of potential drug-target relationships.
Affiliation(s)
- Yansen Su
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, China
- Zhiyang Hu
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, China
- Fei Wang
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, China
- Yannan Bin
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, China
- Chunhou Zheng
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, China
- Haitao Li
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, China
- Haowen Chen
  - College of Computer Science and Electronic Engineering, Hunan University, Hunan, 410082, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Hunan, 410082, China
4
Yan X, Liu Y. Graph-sequence attention and transformer for predicting drug-target affinity. RSC Adv 2022; 12:29525-29534. [PMID: 36320763 PMCID: PMC9562047 DOI: 10.1039/d2ra05566j] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/04/2022] [Accepted: 10/04/2022] [Indexed: 11/30/2022] Open
Abstract
Drug-target binding affinity (DTA) prediction has drawn increasing interest because of its important role in the drug discovery process. The development of new drugs is costly, time-consuming, and often accompanied by safety issues. Drug repurposing can avoid the expensive and lengthy process of drug development by finding new uses for already approved drugs. Therefore, it is of great significance to develop effective computational methods to predict DTAs. Attention mechanisms allow a computational method to focus on the most relevant parts of the input and have proven useful for various tasks. In this study, we propose a novel model based on self-attention, called GSATDTA, to predict the binding affinity between drugs and targets. For the representation of drugs, we use Bi-directional Gated Recurrent Units (BiGRU) to extract the SMILES representation from SMILES sequences, and graph neural networks to extract the graph representation of the molecular graphs. We then use an attention mechanism to fuse the two representations of the drug. For the target/protein, we use an efficient transformer to learn the representation of the protein, which can capture the long-distance relationships in the sequence of amino acids. We conduct extensive experiments to compare our model with state-of-the-art models. Experimental results show that our model outperforms the current state-of-the-art methods on two independent datasets.
Affiliation(s)
- Xiangfeng Yan
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Yong Liu
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
5
Wang H, Huang F, Xiong Z, Zhang W. A heterogeneous network-based method with attentive meta-path extraction for predicting drug-target interactions. Brief Bioinform 2022; 23:6596318. [PMID: 35641162 DOI: 10.1093/bib/bbac184] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/22/2022] [Revised: 04/09/2022] [Accepted: 04/23/2022] [Indexed: 11/13/2022] Open
Abstract
Predicting drug-target interactions (DTIs) is crucial at many phases of drug discovery and repositioning. Many computational methods based on heterogeneous networks (HNs) have proved their potential to predict DTIs by capturing extensive biological knowledge and semantic information from meta-paths. However, existing methods manually customize meta-paths, which depends heavily on specific expertise. Such a strategy heavily limits the scalability and flexibility of these models, and even affects their predictive performance. To alleviate this limitation, we propose a novel HN-based method with attentive meta-path extraction for DTI prediction, named HampDTI, which is capable of automatically extracting useful meta-paths through a learnable attention mechanism instead of pre-definition based on domain knowledge. Specifically, by scoring multi-hop connections across various relations in the HN with each relation assigned an attention weight, HampDTI constructs a new trainable graph structure, called a meta-path graph. Such a meta-path graph implicitly measures the importance of every possible meta-path between drugs and targets. To enable HampDTI to extract more diverse meta-paths, we adopt a multi-channel mechanism to generate multiple meta-path graphs. Then, a graph neural network is deployed on the generated meta-path graphs to yield the multi-channel embeddings of drugs and targets. Finally, HampDTI fuses all embeddings from different channels for predicting DTIs. The meta-path graphs are optimized along with the model training such that HampDTI can adaptively extract valuable meta-paths for DTI prediction. The experiments on benchmark datasets not only show the superiority of HampDTI in DTI prediction over several baseline methods, but also, more importantly, demonstrate the effectiveness of the model in discovering important meta-paths.
Affiliation(s)
- Hongzhun Wang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Feng Huang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Zhankun Xiong
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
6
Li J, Wang J, Lv H, Zhang Z, Wang Z. IMCHGAN: Inductive Matrix Completion With Heterogeneous Graph Attention Networks for Drug-Target Interactions Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:655-665. [PMID: 34115592 DOI: 10.1109/tcbb.2021.3088614] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 06/12/2023]
Abstract
Identification of targets among known drugs plays an important role in drug repurposing and discovery. Computational approaches for prediction of drug-target interactions (DTIs) are highly desired in comparison to traditional biological experiments as they are fast and inexpensive. Moreover, recent advances in systems biology approaches have generated large-scale heterogeneous biological information network data, which offer opportunities for machine learning-based identification of DTIs. We present a novel Inductive Matrix Completion with Heterogeneous Graph Attention Network approach (IMCHGAN) for predicting DTIs. IMCHGAN first adopts a two-level neural attention mechanism to learn drug and target latent feature representations from the DTI heterogeneous network respectively. Then, the learned latent features are fed into the Inductive Matrix Completion (IMC) prediction score model, which computes the best projection from drug space onto target space and outputs the DTI score via the inner product of projected drug and target feature representations. IMCHGAN is an end-to-end neural network learning framework where the parameters of both the prediction score model and the feature representation learning model are simultaneously optimized via backpropagation under the supervision of the observed known drug-target interaction data. We compare IMCHGAN with other state-of-the-art baselines on two real DTI experimental datasets. The results show that our method is superior to existing methods in terms of AUC and AUPR. Moreover, IMCHGAN also shows strong predictive power for novel (unknown) DTIs. All datasets and code can be obtained from https://github.com/ljatynu/IMCHGAN/.
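Illustrative note (not taken from the cited paper or its repository): the abstract describes scoring a drug-target pair as the inner product between a projected drug representation and a target representation. A minimal NumPy sketch of that scoring step follows. The function name `imc_scores`, the projection matrix `W`, and the toy feature values are assumptions for illustration; in the cited method the features are learned by attention networks and `W` is trained end to end.

```python
import numpy as np

def imc_scores(drug_feats, target_feats, W):
    """Inductive-matrix-completion style scores.

    drug_feats:   (n_drugs, d_drug) latent drug features (hypothetical values)
    target_feats: (n_targets, d_target) latent target features
    W:            (d_drug, d_target) projection from drug space to target space
    Returns an (n_drugs, n_targets) score matrix.
    """
    projected = drug_feats @ W          # drugs mapped into target space
    return projected @ target_feats.T   # inner product with each target

# Toy example: two drugs with 2-d features, two targets with 3-d features.
drug_feats = np.array([[1.0, 0.0],
                       [0.0, 1.0]])
target_feats = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 1.0]])
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.5]])
scores = imc_scores(drug_feats, target_feats, W)
print(scores)
```

With these toy values each drug scores 1 against one target and 0 against the other, which makes the role of the projection easy to check by hand.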
7
Yan XY, Yin PW, Wu XM, Han JX. Prediction of the Drug-Drug Interaction Types with the Unified Embedding Features from Drug Similarity Networks. Front Pharmacol 2022; 12:794205. [PMID: 34987405 PMCID: PMC8721167 DOI: 10.3389/fphar.2021.794205] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/13/2021] [Accepted: 11/04/2021] [Indexed: 12/12/2022] Open
Abstract
Drug combination therapies are a promising strategy to overcome drug resistance and improve the efficacy of monotherapy in cancer, and they have been shown to lead to a decrease in dose-related toxicities. Besides the synergistic reactions between drugs, some antagonistic drug-drug interactions (DDIs) exist, which are the main cause of adverse drug events. Precisely predicting the type of DDI is important for both drug development and more effective drug combination therapy applications. Recently, numerous text mining- and machine learning-based methods have been developed for predicting DDIs. All these methods implicitly utilize the features of drugs from diverse drug-related properties. However, how to integrate these features more efficiently and improve the accuracy of classification is still a challenge. In this paper, we propose a novel method (called NMDADNN) to predict the DDI types by integrating five drug-related heterogeneous information sources to extract the unified drug mapping features. NMDADNN first constructs the similarity networks by using the Jaccard coefficient and then implements the random walk with restart algorithm and positive pointwise mutual information for extracting the topological similarities. After that, five network-based similarities are unified by using a multimodel deep autoencoder. Finally, NMDADNN implements the deep neural network (DNN) on the unified drug feature to infer the types of DDIs. In comparison with other recent state-of-the-art DNN-based methods, NMDADNN achieves the best results in terms of accuracy, area under the precision-recall curve, area under the ROC curve, F1 score, precision and recall. In addition, many of the promising types of drug-drug pairs predicted by NMDADNN are also confirmed by using the interactions checker tool. These results demonstrate the effectiveness of our NMDADNN method, indicating that NMDADNN has great potential for predicting DDI types.
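Illustrative note (not taken from the cited paper): the abstract names three generic steps, namely a Jaccard similarity network, random walk with restart (RWR), and positive pointwise mutual information (PPMI). The NumPy sketch below shows one common way to chain them. The function names, the restart probability, the iteration count, and the toy binary matrix `B` are assumptions; the cited method's exact formulation and parameters are not given in the abstract.

```python
import numpy as np

def jaccard_similarity(B):
    """Pairwise Jaccard coefficient between rows of a binary matrix B."""
    B = B.astype(float)
    inter = B @ B.T                                  # |A ∩ B| per pair
    counts = B.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)

def random_walk_with_restart(S, restart=0.5, n_iter=50):
    """Diffuse over the similarity network; column j is the walk started at node j."""
    col_sums = S.sum(axis=0, keepdims=True)
    T = np.divide(S, col_sums, out=np.zeros_like(S), where=col_sums > 0)
    P0 = np.eye(S.shape[0])
    P = P0.copy()
    for _ in range(n_iter):
        P = (1 - restart) * T @ P + restart * P0
    return P

def ppmi(P):
    """Positive pointwise mutual information of a non-negative matrix."""
    total = P.sum()
    row = P.sum(axis=1, keepdims=True)
    col = P.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(P * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0                     # log(0) entries become 0
    return np.maximum(pmi, 0.0)

# Toy data: 3 drugs described by 3 binary properties (hypothetical).
B = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])
S = jaccard_similarity(B)
P = random_walk_with_restart(S)
M = ppmi(P)
print(S[0, 1])
```

Each column of `P` stays a probability distribution because the column-normalised transition matrix and the restart term both preserve column sums of one.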
Affiliation(s)
- Xiao-Ying Yan
  - College of Computer Science, Xi'an Shiyou University, Xi'an, China
- Peng-Wei Yin
  - College of Computer Science, Xi'an Shiyou University, Xi'an, China
- Xiao-Meng Wu
  - School of Electronic Engineering, Xi'an Shiyou University, Xi'an, China
- Jia-Xin Han
  - College of Computer Science, Xi'an Shiyou University, Xi'an, China
8
Drug-Target Interaction Prediction Based on Multisource Information Weighted Fusion. Contrast Media & Molecular Imaging 2021; 2021:6044256. [PMID: 34908912 PMCID: PMC8635946 DOI: 10.1155/2021/6044256] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/12/2021] [Accepted: 10/22/2021] [Indexed: 01/08/2023]
Abstract
Recently, in most existing studies, it is assumed that there are no interaction relationships between drugs and targets with unknown interactions. However, unknown interactions mean the relationships between drugs and targets have just not been confirmed. In this paper, samples for which the relationship between drugs and targets has not been determined are considered unlabeled. A weighted fusion method of multisource information is proposed to screen drug-target interactions. Firstly, some drug-target pairs which may have interactions are selected. Secondly, the selected drug-target pairs are added to the positive samples, which are regarded as known to have interaction relationships, and the original interaction relationship matrix is revised. Finally, the revised datasets are used to predict the interaction derived from the bipartite local model with neighbor-based interaction profile inferring (BLM-NII). Experiments demonstrate that the proposed method has greatly improved specificity, sensitivity, precision, and accuracy compared with the BLM-NII method. In addition, compared with several state-of-the-art methods, the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUPR) of the proposed method are excellent.
9
Guo Y, Hou L, Zhu W, Wang P. Prediction of Hormone-Binding Proteins Based on K-mer Feature Representation and Naive Bayes. Front Genet 2021; 12:797641. [PMID: 34887905 PMCID: PMC8650314 DOI: 10.3389/fgene.2021.797641] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/19/2021] [Accepted: 11/05/2021] [Indexed: 11/29/2022] Open
Abstract
Hormone binding protein (HBP) is a soluble carrier protein that interacts selectively with different types of hormones and has various effects on the body's life activities. HBPs play an important role in the growth process of organisms, but their specific role is still unclear. Therefore, correctly identifying HBPs is the first step towards understanding and studying their biological function. However, due to their high cost and long experimental period, it is difficult for traditional biochemical experiments to correctly identify HBPs from an increasing number of proteins, so the real characterization of HBPs has become a challenging task for researchers. To measure the effectiveness of HBPs, an accurate and reliable prediction model for their identification is desirable. In this paper, we construct the prediction model HBP_NB. First, HBPs data were collected from the UniProt database, and a dataset was established. Then, based on the established high-quality dataset, the k-mer (K = 3) feature representation method was used to extract features. Second, the feature selection algorithm was used to reduce the dimensionality of the extracted features and select the appropriate optimal feature set. Finally, the selected features are input into Naive Bayes to construct the prediction model, and the model is evaluated by using 10-fold cross-validation. The final results were 95.45% accuracy, 94.17% sensitivity and 96.73% specificity. These results indicate that our model is feasible and effective.
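Illustrative note (not taken from the cited paper): the abstract mentions a k-mer (K = 3) feature representation of protein sequences. The sketch below shows one common way to build such a vector using only the Python standard library. The 20-letter amino-acid alphabet, the normalisation to frequencies, and the example sequence are assumptions; the cited work may count or normalise differently, and its feature selection and Naive Bayes steps are not reproduced here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)

def all_kmers(k=3):
    """Every possible k-mer over the alphabet, in a fixed order (20**k items)."""
    return ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]

def kmer_features(sequence, k=3):
    """Frequency of each k-mer among the overlapping windows of the sequence."""
    n_windows = max(len(sequence) - k + 1, 1)
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts[kmer] / n_windows for kmer in all_kmers(k)]

# Hypothetical short sequence with 7 overlapping 3-mers; "MKV" occurs twice.
features = kmer_features("MKVLAAMKV")
mkv_index = all_kmers(3).index("MKV")
print(len(features), features[mkv_index])
```

With K = 3 the vector has 8000 dimensions, which is why a dimensionality-reduction or feature-selection step, as the abstract describes, is usually applied before training a classifier.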
Affiliation(s)
- Yuxin Guo
  - Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
  - Yangtze Delta Region Institute, University of Electronic Science and Technology of China, Quzhou, China
  - Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Liping Hou
  - Beidahuang Industry Group General Hospital, Harbin, China
- Wen Zhu
  - Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
  - Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Peng Wang
  - Key Laboratory of Computational Science and Application of Hainan Province, Haikou, China
  - Key Laboratory of Data Science and Intelligence Education, Hainan Normal University, Ministry of Education, Haikou, China
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
10
Ding P, Ouyang W, Luo J, Kwoh CK. Heterogeneous information network and its application to human health and disease. Brief Bioinform 2021; 21:1327-1346. [PMID: 31566212 DOI: 10.1093/bib/bbz091] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/06/2019] [Revised: 06/29/2019] [Accepted: 06/30/2019] [Indexed: 12/11/2022] Open
Abstract
The molecular components with functional interdependencies in a human cell form a complicated biological network. Diseases are mostly caused by perturbations of the interactions among multiple biomolecules, rather than by an abnormality of a single biomolecule. Furthermore, new biological functions and processes could be revealed by discovering novel biological entity relationships. Hence, more and more biologists focus on studying the complex biological system instead of the individual biological components. The emergence of the heterogeneous information network (HIN) offers a promising way to systematically explore complicated and heterogeneous relationships between various molecules for apparently distinct phenotypes. In this review, we first present the basic definition of HIN and the biological system considered as a complex HIN. Then, we discuss the topological properties of HIN and how these can be applied to detect network motifs and functional modules. Afterwards, methodologies for discovering relationships between diseases and biomolecules are presented. Useful insights on how HIN aids drug development and the exploration of the human interactome are provided. Finally, we analyze the challenges and opportunities for uncovering combinatorial patterns among pharmacogenomics and cell-type detection based on single-cell genomic data.
Affiliation(s)
- Pingjian Ding
  - School of Computer Science, University of South China, Hengyang, China
- Wenjue Ouyang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Chee-Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
11
Vatansever S, Schlessinger A, Wacker D, Kaniskan HÜ, Jin J, Zhou M, Zhang B. Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: State-of-the-arts and future directions. Med Res Rev 2021; 41:1427-1473. [PMID: 33295676 PMCID: PMC8043990 DOI: 10.1002/med.21764] [Citation(s) in RCA: 95] [Impact Index Per Article: 31.7] [Received: 07/23/2020] [Revised: 10/30/2020] [Accepted: 11/20/2020] [Indexed: 01/11/2023]
Abstract
Neurological disorders significantly outnumber diseases in other therapeutic areas. However, developing drugs for central nervous system (CNS) disorders remains the most challenging area in drug discovery, with long timelines and high attrition rates. With the rapid growth of biomedical data enabled by advanced experimental technologies, artificial intelligence (AI) and machine learning (ML) have emerged as indispensable tools to draw meaningful insights and improve decision making in drug discovery. Thanks to the advancements in AI and ML algorithms, AI/ML-driven solutions now have an unprecedented potential to accelerate the process of CNS drug discovery with a better success rate. In this review, we comprehensively summarize AI/ML-powered pharmaceutical discovery efforts and their implementations in the CNS area. After introducing the AI/ML models as well as the conceptualization and data preparation, we outline the applications of AI/ML technologies to several key procedures in drug discovery, including target identification, compound screening, hit/lead generation and optimization, drug response and synergy prediction, de novo drug design, and drug repurposing. We review the current state of the art of AI/ML-guided CNS drug discovery, focusing on blood-brain barrier permeability prediction and implementation into therapeutic discovery for neurological diseases. Finally, we discuss the major challenges and limitations of current approaches and possible future directions that may provide resolutions to these difficulties.
Affiliation(s)
- Sezen Vatansever
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Avner Schlessinger
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Daniel Wacker
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- H. Ümit Kaniskan
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Jian Jin
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Ming‐Ming Zhou
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Bin Zhang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
12
Wang A, Wang M. Drug-Target Interaction Prediction via Dual Laplacian Graph Regularized Logistic Matrix Factorization. BioMed Research International 2021; 2021:5599263. [PMID: 33855072 PMCID: PMC8019634 DOI: 10.1155/2021/5599263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2021] [Revised: 03/06/2021] [Accepted: 03/13/2021] [Indexed: 11/18/2022]
Abstract
Drug-target interactions provide useful information for biomedical drug discovery as well as drug development. However, it is costly and time consuming to find drug-target interactions by experimental methods. As a result, developing computational approaches for this task is necessary and has practical significance. In this study, we establish a novel dual Laplacian graph regularized logistic matrix factorization model for drug-target interaction prediction, referred to as DLGrLMF. Specifically, DLGrLMF regards the task of drug-target interaction prediction as a weighted logistic matrix factorization problem, in which the experimentally validated interactions are assigned larger weights. Meanwhile, on the assumption that drugs with similar chemical structures should interact with similar targets and targets with similar genomic sequences should in turn interact with similar drugs, the pairwise chemical structure similarities between drugs and the pairwise genomic sequence similarities between targets are exploited in the matrix factorization problem through a dual Laplacian graph regularization term. In addition, we design a gradient descent algorithm to solve the resultant optimization problem. Finally, the efficacy of DLGrLMF is validated on various benchmark datasets and the experimental results demonstrate that DLGrLMF performs better than other state-of-the-art methods. Case studies are also conducted to validate that DLGrLMF can successfully predict most of the experimentally validated drug-target interactions.
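The graph regularization idea in this abstract can be illustrated with a minimal sketch. It assumes the standard unnormalized graph Laplacian L = D - S of a symmetric similarity matrix S and shows that the penalty tr(U^T L U) equals half the similarity-weighted sum of squared distances between latent vectors, so similar drugs (or targets) are pulled towards similar latent representations. All function names are hypothetical, and this is not the DLGrLMF implementation; the paper's exact objective, weights and update rules are not given in the abstract.

```python
def laplacian(S):
    """Unnormalized graph Laplacian L = D - S for a symmetric similarity matrix S."""
    n = len(S)
    return [[(sum(S[i]) if i == j else 0.0) - S[i][j] for j in range(n)]
            for i in range(n)]


def trace_penalty(U, L):
    """tr(U^T L U), written as the sum over i, j of L[i][j] * <u_i, u_j>."""
    n = len(U)
    return sum(L[i][j] * sum(a * b for a, b in zip(U[i], U[j]))
               for i in range(n) for j in range(n))


def pairwise_penalty(U, S):
    """0.5 * sum over i, j of S[i][j] * ||u_i - u_j||^2 (the same quantity)."""
    n = len(U)
    return 0.5 * sum(S[i][j] * sum((a - b) ** 2 for a, b in zip(U[i], U[j]))
                     for i in range(n) for j in range(n))


def gradient_step(U, L, learning_rate):
    """One gradient descent step on tr(U^T L U); the gradient is 2 * L * U."""
    n, k = len(U), len(U[0])
    return [[U[i][c] - learning_rate * 2.0 * sum(L[i][j] * U[j][c] for j in range(n))
             for c in range(k)]
            for i in range(n)]


if __name__ == "__main__":
    # Toy similarity matrix for three drugs and 2-dimensional latent vectors.
    S = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
    U = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]
    L = laplacian(S)
    print(trace_penalty(U, L), pairwise_penalty(U, S))
    print(trace_penalty(gradient_step(U, L, 0.01), L))
```

In a full model of this kind, a penalty like this would be added once for the drug factors and once for the target factors (hence "dual") on top of the logistic loss.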
Affiliation(s)
- Aizhen Wang
  - Department of Pharmacy, The Affiliated Huai'an Hospital of Xuzhou Medical University and The Second People's Hospital of Huai'an, Huai'an 223002, China
- Minhui Wang
  - Department of Pharmacy, Lianshui People's Hospital Affiliated to Kangda College, Nanjing Medical University, Huai'an 223300, China

13
Wang C, Kurgan L. Survey of Similarity-Based Prediction of Drug-Protein Interactions. Curr Med Chem 2021; 27:5856-5886. [PMID: 31393241 DOI: 10.2174/0929867326666190808154841] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 11/07/2017] [Revised: 04/16/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022]
Abstract
Therapeutic activity of a significant majority of drugs is determined by their interactions with proteins. Databases of drug-protein interactions (DPIs) primarily focus on the therapeutic protein targets while the knowledge of the off-targets is fragmented and partial. One way to bridge this knowledge gap is to employ computational methods to predict protein targets for a given drug molecule, or interacting drugs for given protein targets. We survey a comprehensive set of 35 methods that were published in high-impact venues and that predict DPIs based on similarity between drugs and similarity between protein targets. We analyze the internal databases of known DPIs that these methods utilize to compute similarities, and investigate how they are linked to the 12 publicly available source databases. We discuss contents, impact and relationships between these internal and source databases, as well as the timeline of their releases and publications. The 35 predictors exploit and often combine three types of similarities that consider drug structures, drug profiles, and target sequences. We review the predictive architectures of these methods and their impact, and we explain how their internal DPI databases are linked to the source databases. We also include a detailed timeline of the development of these predictors and discuss the underlying limitations of the current resources and predictive tools. Finally, we provide several recommendations concerning the future development of the related databases and methods.
Affiliation(s)
- Chen Wang
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States

14
Abstract
Background: At present, using computational methods to predict drug-target interactions (DTIs) is a very important step in the discovery of new drugs and in drug repositioning. The potential DTIs identified by machine learning methods can provide guidance for biochemical or clinical experiments.
Objective: The goal of this article is to apply the latest network representation learning methods to drug-target prediction, improve model prediction capabilities, and promote new drug development.
Methods: We use the large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, etc., integrate features obtained from heterogeneous networks, construct binary classification samples, and use the random forest (RF) method to predict DTIs.
Results: The experiments in this paper compare the common classifiers RF, LR, and SVM, as well as the typical network representation learning methods LINE, Node2Vec, and DeepWalk. The combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016.
Conclusion: The LINE-based learning method can effectively learn hidden features of drugs, targets, diseases and other entities from the network topology. Combining features learned from multiple networks can enhance the expressive power. RF is an effective supervised learning method. Therefore, the LINE-RF combination is a widely applicable method.
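For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how the area under the ROC curve can be computed from binary labels and prediction scores. It uses the rank-based (Mann-Whitney) formulation: the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half. The function name and toy data are illustrative assumptions and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Count a win when a positive outscores a negative, half a win for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy example: two interacting pairs (label 1) and two non-interacting pairs (label 0).
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; sorting-based implementations are preferable for large datasets.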
Affiliation(s)
- Jihong Wang
  - School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Yue Shi
  - School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Xiaodan Wang
  - School of Pharmaceutical Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Zhongshan, Guangdong, China
- Huiyou Chang
  - School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, Guangdong, China

15
Huang L, Luo H, Li S, Wu FX, Wang J. Drug-drug similarity measure and its applications. Brief Bioinform 2020; 22:5956929. [PMID: 33152756 DOI: 10.1093/bib/bbaa265] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/01/2020] [Revised: 09/13/2020] [Accepted: 09/14/2020] [Indexed: 02/01/2023] Open
Abstract
Drug similarities play an important role in modern biology and medicine, as they help scientists gain deep insights into drugs' therapeutic mechanisms and conduct wet-lab experiments that may significantly improve the efficiency of drug research and development. Nowadays, a number of drug-related databases have been constructed, with which many methods have been developed for computing similarities between drugs for studying associations between drugs, human diseases, proteins (drug targets) and more. In this review, firstly, we briefly introduce the publicly available drug-related databases. Secondly, based on different drug features, interaction relationships and multimodal data, we summarize similarity calculation methods in detail. Then, we discuss the applications of drug similarities in various biological and medical areas. Finally, we evaluate drug similarity calculation methods with common evaluation metrics to illustrate the important roles of drug similarity measures in different applications.
Affiliation(s)
- Lan Huang
  - Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
- Huimin Luo
  - School of Computer and Information Engineering at Henan University, Kaifeng, China
- Suning Li
  - Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Fang-Xiang Wu
  - College of Engineering and Department of Computer Sciences, University of Saskatchewan, Saskatoon, Canada
- Jianxin Wang
  - Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China

16
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Brief Bioinform 2020; 22:5910189. [PMID: 32964234 DOI: 10.1093/bib/bbaa205] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 05/23/2020] [Revised: 08/06/2020] [Accepted: 08/10/2020] [Indexed: 12/20/2022] Open
Abstract
Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. Machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning in order to improve the predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples compared with the data set that has been widely used by most previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.
Affiliation(s)
- Yanyi Chu
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiaoqi Shan
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Tianhang Chen
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Mingming Jiang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yanjing Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Qiankun Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University

17
Wang C, Wang W, Lu K, Zhang J, Chen P, Wang B. Predicting Drug-Target Interactions with Electrotopological State Fingerprints and Amphiphilic Pseudo Amino Acid Composition. Int J Mol Sci 2020; 21:ijms21165694. [PMID: 32784497 PMCID: PMC7570185 DOI: 10.3390/ijms21165694] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/18/2020] [Revised: 08/05/2020] [Accepted: 08/06/2020] [Indexed: 12/13/2022] Open
Abstract
The task of drug-target interaction (DTI) prediction plays an important role in drug development. Experimental methods for identifying DTIs are time-consuming, expensive and challenging. To solve these problems, machine learning-based methods have been introduced, but they are limited by the need for effective feature extraction and negative sampling. In this work, electrotopological state (E-state) fingerprints for drugs and amphiphilic pseudo amino acid composition (APAAC) for target proteins are tested as features. E-state fingerprints are extracted based on both molecular electronic and topological features with the same metric. APAAC is an extension of amino acid composition (AAC), which is calculated based on hydrophilic and hydrophobic characters to construct sequence order information. Using the combination of these feature pairs, the prediction model is established by support vector machines. In order to enhance the effectiveness of the features, a distance-based negative sampling method is proposed to obtain reliable negative samples. The area under the receiver operating characteristic curve (AUC) is above 98.5% for all three datasets in this work. Comparison with state-of-the-art methods demonstrates the effectiveness and efficiency of the proposed method, which will be helpful for further drug development.
Affiliation(s)
- Cheng Wang
  - Department of Computer Science & Technology, Tongji University, Shanghai 201804, China
- Wenyan Wang
  - School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan 243002, China
  - Key Laboratory of Power Electronics and Motion Control Anhui Education Department, Ma’anshan 243032, China
- Kun Lu
  - School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan 243002, China
- Jun Zhang
  - Institutes of Physical Science and Information Technology & School of Internet, Anhui University, Hefei 230601, China
- Peng Chen
  - Institutes of Physical Science and Information Technology & School of Internet, Anhui University, Hefei 230601, China
  - Correspondence: (P.C.); (B.W.)
- Bing Wang
  - Department of Computer Science & Technology, Tongji University, Shanghai 201804, China
  - School of Electrical & Information Engineering, Anhui University of Technology, Ma’anshan 243002, China
  - Key Laboratory of Power Electronics and Motion Control Anhui Education Department, Ma’anshan 243032, China
  - Correspondence: (P.C.); (B.W.)

18
A computational drug repositioning method applied to rare diseases: Adrenocortical carcinoma. Sci Rep 2020; 10:8846. [PMID: 32483162 PMCID: PMC7264316 DOI: 10.1038/s41598-020-65658-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/11/2019] [Accepted: 05/08/2020] [Indexed: 01/12/2023] Open
Abstract
Rare or orphan diseases affect only small populations, thereby limiting the economic incentive for the drug development process, often resulting in a lack of progress towards treatment. Drug repositioning is a promising approach in these cases, due to its low cost. In this approach, one attempts to identify new purposes for existing drugs that have already been developed and approved for use. By applying the process of drug repositioning to identify novel treatments for rare diseases, we can overcome the lack of economic incentives and make concrete progress towards new therapies. Adrenocortical Carcinoma (ACC) is a rare disease with no practical and definitive therapeutic approach. We apply Heter-LP, a new method of drug repositioning, to suggest novel therapeutic avenues for ACC. Our analysis identifies innovative putative drug-disease, drug-target, and disease-target relationships for ACC, which include Cosyntropin (drug) and DHCR7, IGF1R, MC1R, MAP3K3, TOP2A (protein targets). When results are analyzed using all available information, a number of novel predicted associations related to ACC appear to be valid according to current knowledge. We expect the predicted relations will be useful for drug repositioning in ACC since the resulting ranked lists of drugs and protein targets can be used to expedite the necessary clinical processes.

19
Parisi D, Adasme MF, Sveshnikova A, Bolz SN, Moreau Y, Schroeder M. Drug repositioning or target repositioning: A structural perspective of drug-target-indication relationship for available repurposed drugs. Comput Struct Biotechnol J 2020; 18:1043-1055. [PMID: 32419905 PMCID: PMC7215100 DOI: 10.1016/j.csbj.2020.04.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 11/08/2019] [Revised: 03/31/2020] [Accepted: 04/04/2020] [Indexed: 12/18/2022] Open
Abstract
Drug repositioning aims to find new indications for existing drugs in order to reduce drug development cost and time. Currently, numerous stories of successful drug repositioning have been reported and many repurposed drugs are already available on the market. Although drug repositioning is often a product of serendipity, repositioning opportunities can be uncovered systematically. There are three systematic approaches to drug repositioning: disease-centric, target-centric and drug-centric. Disease-centric approaches identify close relationships between an old and a new indication. A target-centric approach links a known target and its established drug to a new indication. Lastly, a drug-centric approach connects a known drug to a new target and its associated indication. These three approaches differ in their potential and their limitations, but above all else, in the required start information and computing power. This raises the question of which approach prevails in current drug discovery and what that implies for future developments. To address this question, we systematically evaluated over 100 drugs, 200 target structures and over 300 indications from the Drug Repositioning Database. Each analyzed case was classified as one of the three repositioning approaches. For the majority of cases (more than 60%) the disease-centric definition was assigned. Almost 30% of the cases were classified as target-centric and less than 10% as drug-centric approaches. We concluded that, despite the use of the umbrella term “drug” repositioning, disease- and target-centric approaches have dominated the field until now. We propose the use of drug-centric approaches while discussing reasons, such as structure-based repositioning techniques, to exploit the full potential of drug-target-disease connections.
Affiliation(s)
- Melissa F Adasme
  - Biotechnology Center (BIOTEC), Technische Universität Dresden, 01307 Dresden, Germany
- Anastasia Sveshnikova
  - Biotechnology Center (BIOTEC), Technische Universität Dresden, 01307 Dresden, Germany
- Yves Moreau
  - ESAT-STADIUS, KU Leuven, B-3001 Heverlee, Belgium
- Michael Schroeder
  - Biotechnology Center (BIOTEC), Technische Universität Dresden, 01307 Dresden, Germany

20
Redkar S, Mondal S, Joseph A, Hareesha KS. A Machine Learning Approach for Drug-target Interaction Prediction using Wrapper Feature Selection and Class Balancing. Mol Inform 2020; 39:e1900062. [PMID: 32003548 DOI: 10.1002/minf.201900062] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 05/31/2019] [Accepted: 01/28/2020] [Indexed: 01/19/2023]
Abstract
Drug-target interaction (DTI) plays a crucial role in drug discovery, drug repositioning and understanding drug side effects, which helps to identify new therapeutic profiles for various diseases. However, the exponential growth in genomic and drug data makes it difficult to identify new associations between drugs and targets. Therefore, we use computational methods, as they help to accelerate the DTI identification process. Usually, available data sources consisting of known DTIs are used to train a classifier to predict new DTIs. Such datasets often face the problem of class imbalance. Therefore, in this study we address two challenges faced by such datasets, i.e., class imbalance and high dimensionality, to develop a predictive model for DTI prediction. The study is carried out on four protein classes, namely Enzyme, Ion Channel, G Protein-Coupled Receptor (GPCR) and Nuclear Receptor. We encoded the target protein sequence using the dipeptide composition and the drug with a molecular descriptor. A machine learning approach is employed to predict the DTI using wrapper feature selection and the synthetic minority oversampling technique (SMOTE). The ensemble approach achieved, at best, accuracies of 95.9%, 93.4%, 90.8% and 90.6% and precisions of 96.3%, 92.8%, 90.1% and 90.2% on the Enzyme, Ion Channel, GPCR and Nuclear Receptor datasets, respectively, when evaluated excluding SMOTE samples with 10-fold cross validation. Furthermore, our method could predict new drug-target interactions not contained in the training dataset. The features selected by wrapper feature selection may be important for understanding the DTI for the protein categories under this study. Based on our evaluation, the proposed method can be used for understanding and identifying new drug-target interactions.
We provide the readers with a standalone package available at https://github.com/shwetagithub1/predDTI which provides DTI predictions to users for new query drug-target pairs.
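The dipeptide composition encoding mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the code of the predDTI package: it maps a protein sequence to a 400-dimensional vector holding the relative frequency of each ordered pair of the 20 standard amino acids. Skipping pairs that contain non-standard residues and normalizing by the number of valid pairs are assumptions made here; the paper may handle both differently.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def dipeptide_composition(sequence):
    """Return a 400-dimensional list of dipeptide frequencies for a protein sequence."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: pairs with non-standard residues are skipped
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    vector = dipeptide_composition("MKTAYIAKQR")
    print(len(vector), sum(vector))
```

A fixed-length vector like this lets sequences of different lengths be fed to the same classifier, at the cost of discarding longer-range sequence order.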
Affiliation(s)
- Shweta Redkar
  - Department of Computer Applications, Manipal Institute of Technology, Manipal Academy of Higher Education, 576104, Manipal, Karnataka, India
- Sukanta Mondal
  - Department of Biological Sciences, Birla Institute of Technology and Science-Pilani, K.K. Birla Goa Campus, 403726, Zuarinagar, Goa, India
- Alex Joseph
  - Department of Pharmaceutical Chemistry, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, 576104, Manipal, Karnataka, India
- K S Hareesha
  - Department of Computer Applications, Manipal Institute of Technology, Manipal Academy of Higher Education, 576104, Manipal, Karnataka, India

21
Luo H, Li M, Yang M, Wu FX, Li Y, Wang J. Biomedical data and computational models for drug repositioning: a comprehensive review. Brief Bioinform 2020; 22:1604-1619. [PMID: 32043521 DOI: 10.1093/bib/bbz176] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Received: 10/23/2019] [Revised: 12/07/2019] [Accepted: 12/26/2019] [Indexed: 12/16/2022] Open
Abstract
Drug repositioning can drastically decrease the cost and duration of traditional drug research and development while avoiding the occurrence of unforeseen adverse events. With the rapid advancement of high-throughput technologies and the explosion of various biological data and medical data, computational drug repositioning methods have become appealing and powerful techniques to systematically identify potential drug-target interactions and drug-disease interactions. In this review, we first summarize the available biomedical data and public databases related to drugs, diseases and targets. Then, we discuss existing drug repositioning approaches and group them based on their underlying computational models, consisting of classical machine learning, network propagation, matrix factorization and completion, and deep learning based models. We also comprehensively analyze common standard data sets and evaluation metrics used in drug repositioning, and give a brief comparison of various prediction methods on the gold standard data sets. Finally, we conclude our review with a brief discussion of challenges in computational drug repositioning, which include reducing the noise and incompleteness of biomedical data, the ensemble of various computational drug repositioning methods, the importance of designing reliable negative sample selection methods, new techniques dealing with the data sparseness problem, the construction of large-scale and comprehensive benchmark data sets, and the analysis and explanation of the underlying mechanisms of predicted interactions.
Affiliation(s)
- Huimin Luo
  - School of Computer Science and Engineering at Central South University
- Min Li
  - School of Computer Science and Engineering at Central South University
- Mengyun Yang
  - School of Computer Science and Engineering at Central South University
- Fang-Xiang Wu
  - College of Engineering and the Department of Computer Science at University of Saskatchewan, Saskatoon, Canada
- Yaohang Li
  - Department of Computer Science at Old Dominion University, Norfolk, USA
- Jianxin Wang
  - School of Computer Science and Engineering at Central South University

22
Chu Y, Kaushik AC, Wang X, Wang W, Zhang Y, Shan X, Salahub DR, Xiong Y, Wei DQ. DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform 2019; 22:451-462. [PMID: 31885041 DOI: 10.1093/bib/bbz152] [Citation(s) in RCA: 93] [Impact Index Per Article: 18.6] [Received: 06/16/2019] [Revised: 11/01/2019] [Accepted: 11/04/2019] [Indexed: 12/18/2022] Open
Abstract
Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, the performance of current DTI prediction approaches suffers from low precision and a high false-positive rate. In this study, we aim to develop a novel DTI prediction method for improving the prediction performance based on a cascade deep forest (CDF) model, named DTI-CDF, with multiple similarity-based features between drugs and similarity-based features between target proteins extracted from the heterogeneous graph, which contains known DTIs. In the experiments, we built five replicates of 10-fold cross-validation under three different experimental settings of data sets, in which the corresponding DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) are missing from the training sets but present in the test sets. The experimental results demonstrate that our proposed approach DTI-CDF achieves a significantly higher performance than traditional ensemble learning-based methods such as random forest and XGBoost, a deep neural network, and state-of-the-art methods such as DDR. Furthermore, there are 1352 newly predicted DTIs which are shown to be correct by the KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.
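The three evaluation settings named in this abstract can be sketched with a simple hold-out split. The code below is an illustration under stated assumptions, not the DTI-CDF evaluation code: it reads SD as holding out whole drugs, ST as holding out whole targets, and SP as holding out individual drug-target pairs, and it makes a single split rather than the five replicates of 10-fold cross-validation described above. The function name and parameters are hypothetical.

```python
import random


def split_pairs(pairs, setting, test_fraction=0.2, seed=0):
    """Split (drug, target) pairs into (train, test) under setting 'SD', 'ST' or 'SP'."""
    rng = random.Random(seed)
    if setting == "SP":
        # Hold out individual pairs; drugs and targets may appear on both sides.
        shuffled = sorted(pairs)
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction))
        return shuffled[n_test:], shuffled[:n_test]
    # SD holds out whole drugs (position 0), ST holds out whole targets (position 1).
    position = {"SD": 0, "ST": 1}[setting]
    entities = sorted({pair[position] for pair in pairs})
    rng.shuffle(entities)
    n_test = max(1, int(len(entities) * test_fraction))
    held_out = set(entities[:n_test])
    train = [pair for pair in pairs if pair[position] not in held_out]
    test = [pair for pair in pairs if pair[position] in held_out]
    return train, test


if __name__ == "__main__":
    drugs = ["d1", "d2", "d3", "d4", "d5"]
    targets = ["t1", "t2", "t3", "t4"]
    all_pairs = [(d, t) for d in drugs for t in targets]
    for name in ("SD", "ST", "SP"):
        train, test = split_pairs(all_pairs, name)
        print(name, len(train), len(test))
```

Holding out whole drugs or targets is the stricter test, since the model must predict interactions for entities it never saw during training.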
Affiliation(s)
- Yanyi Chu
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiangeng Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Wei Wang
  - Mathematical Sciences, Shanghai Jiao Tong University
- Yufang Zhang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University

23
Zhang W, Lin W, Zhang D, Wang S, Shi J, Niu Y. Recent Advances in the Machine Learning-Based Drug-Target Interaction Prediction. Curr Drug Metab 2019; 20:194-202. [PMID: 30129407 DOI: 10.2174/1389200219666180821094047] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 10/16/2017] [Revised: 01/18/2018] [Accepted: 03/19/2018] [Indexed: 12/28/2022]
Abstract
BACKGROUND: The identification of drug-target interactions is a crucial issue in drug discovery. In recent years, researchers have made great efforts on drug-target interaction prediction, and developed databases, software and computational methods. RESULTS: In this paper, we review the recent advances in machine learning-based drug-target interaction prediction. First, we briefly introduce the datasets and data, and summarize features for drugs and targets which can be extracted from different data. Since drug-drug similarity and target-target similarity are important for many machine learning prediction models, we introduce how to calculate similarities based on data or features. Different machine learning-based drug-target interaction prediction methods can be proposed by using different features or information. Thus, we summarize, analyze and compare different machine learning-based prediction methods. CONCLUSION: This study provides a guide to the development of computational methods for drug-target interaction prediction.
Affiliation(s)
- Wen Zhang
  - School of Computer Science, Wuhan University, Wuhan 430072, China
- Weiran Lin
  - School of Computer Science, Wuhan University, Wuhan 430072, China
- Ding Zhang
  - School of Computer Science, Wuhan University, Wuhan 430072, China
- Siman Wang
  - School of Computer Science, Wuhan University, Wuhan 430072, China
- Jingwen Shi
  - School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
- Yanqing Niu
  - School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan 430074, China

24
Kumar R, Harilal S, Gupta SV, Jose J, Thomas Parambi DG, Uddin MS, Shah MA, Mathew B. Exploring the new horizons of drug repurposing: A vital tool for turning hard work into smart work. Eur J Med Chem 2019; 182:111602. [PMID: 31421629 PMCID: PMC7127402 DOI: 10.1016/j.ejmech.2019.111602] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 06/25/2019] [Revised: 08/07/2019] [Accepted: 08/07/2019] [Indexed: 02/07/2023]
Abstract
Drug discovery and development are long and financially taxing processes. On average, it takes 12-15 years and costs 1.2 billion USD for successful drug discovery and approval for clinical use. Many lead molecules are not developed further and their potential is not tapped to the fullest due to lack of resources or time constraints. In order for a drug to be approved by the FDA for clinical use, it must have excellent therapeutic potential in the desired area of target with minimal toxicities, as supported by both pre-clinical and clinical studies. The targeted clinical evaluations fail to explore other potential therapeutic applications of the candidate drug. Drug repurposing or repositioning is a fast and relatively cheap alternative to the lengthy and expensive de novo drug discovery and development. Drug repositioning utilizes the already available clinical trial data on toxicity and adverse effects while exploring the drug's therapeutic potential for a different disease. This review addresses recent developments and the future scope of the drug repositioning strategy.
Affiliation(s)
- Rajesh Kumar: Department of Pharmacy, Kerala University of Health Sciences, Thrissur, Kerala, India
- Seetha Harilal: Department of Pharmacy, Kerala University of Health Sciences, Thrissur, Kerala, India
- Sheeba Varghese Gupta: Department of Pharmaceutical Sciences, College of Pharmacy, University of South Florida, Tampa, FL, 33612, USA
- Jobin Jose: Department of Pharmaceutics, NGSM Institute of Pharmaceutical Science, NITTE Deemed to be University, Mangalore, 575018, India
- Della Grace Thomas Parambi: Department of Pharmaceutical Chemistry, College of Pharmacy, Jouf University, Sakaka, Al Jouf, 2014, Saudi Arabia
- Md Sahab Uddin: Department of Pharmacy, Southeast University, Dhaka, Bangladesh; Pharmakon Neuroscience Research Network, Dhaka, Bangladesh
- Muhammad Ajmal Shah: Department of Pharmacognosy, Faculty of Pharmaceutical Sciences, Government College University, Faisalabad, Pakistan
- Bijo Mathew: Division of Drug Design and Medicinal Chemistry Research Lab, Department of Pharmaceutical Chemistry, Ahalia School of Pharmacy, Palakkad, 678557, Kerala, India
|
25
|
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 340] [Impact Index Per Article: 68.0] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state-of-the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang: State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yifei Wang: State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Ryan Byrne: ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Gisbert Schneider: ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Shengyong Yang: State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
|
26
|
Lin C, Ni S, Liang Y, Zeng X, Liu X. Learning to Predict Drug Target Interaction From Missing Not at Random Labels. IEEE Trans Nanobioscience 2019; 18:353-359. [DOI: 10.1109/tnb.2019.2909293] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]
|
27
|
Abstract
Background: Revealing the subcellular location of a newly discovered protein can bring insight into their function and guide research at the cellular level. The experimental methods currently used to identify the protein subcellular locations are both time-consuming and expensive. Thus, it is highly desired to develop computational methods for efficiently and effectively identifying the protein subcellular locations. Especially, the rapidly increasing number of protein sequences entering the genome databases has called for the development of automated analysis methods.
Methods: In this review, we will describe the recent advances in predicting the protein subcellular locations with machine learning from the following aspects: i) Protein subcellular location benchmark dataset construction, ii) Protein feature representation and feature descriptors, iii) Common machine learning algorithms, iv) Cross-validation test methods and assessment metrics, v) Web servers.
Result & Conclusion: Concomitant with a large number of protein sequences generated by high-throughput technologies, four future directions for predicting protein subcellular locations with machine learning should be paid attention. One direction is the selection of novel and effective features (e.g., statistics, physical-chemical, evolutional) from the sequences and structures of proteins. Another is the feature fusion strategy. The third is the design of a powerful predictor and the fourth one is the protein multiple location sites prediction.
Affiliation(s)
- Ting-He Zhang: School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
- Shao-Wu Zhang: School of Automation, Northwestern Polytechnical University, Xi'an, 710072, China
|
28
|
Lotfi Shahreza M, Ghadiri N, Mousavi SR, Varshosaz J, Green JR. A review of network-based approaches to drug repositioning. Brief Bioinform 2019; 19:878-892. [PMID: 28334136 DOI: 10.1093/bib/bbx017] [Citation(s) in RCA: 161] [Impact Index Per Article: 32.2] [Received: 11/15/2016] [Indexed: 01/17/2023] Open
Abstract
Experimental drug development is time-consuming, expensive and limited to a relatively small number of targets. However, recent studies show that repositioning of existing drugs can function more efficiently than de novo experimental drug development to minimize costs and risks. Previous studies have proven that network analysis is a versatile platform for this purpose, as the biological networks are used to model interactions between many different biological concepts. The present study is an attempt to review network-based methods in predicting drug targets for drug repositioning. For each method, the preferred type of data set is described, and their advantages and limitations are discussed. For each method, we seek to provide a brief description, as well as an evaluation based on its performance metrics. We conclude that integrating distinct and complementary data should be used because each type of data set reveals a unique aspect of information about an organism. We also suggest that applying a standard set of evaluation metrics and data sets would be essential in this fast-growing research domain.
Affiliation(s)
- Maryam Lotfi Shahreza: Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
- Nasser Ghadiri: Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
- Jaleh Varshosaz: Drug Delivery Systems Research Center of Isfahan University of Medical Sciences
- James R Green: Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
|
29
|
Prediction of drug-target interaction by integrating diverse heterogeneous information source with multiple kernel learning and clustering methods. Comput Biol Chem 2019; 78:460-467. [DOI: 10.1016/j.compbiolchem.2018.11.028] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 10/23/2018] [Revised: 11/30/2018] [Accepted: 11/30/2018] [Indexed: 02/08/2023]
|
30
|
Lotfi Shahreza M, Ghadiri N, Green JR. Heter-LP: A Heterogeneous Label Propagation Method for Drug Repositioning. Methods Mol Biol 2019; 1903:291-316. [PMID: 30547450 DOI: 10.1007/978-1-4939-8955-3_18] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/09/2023]
Abstract
Using existing drugs for diseases which are not developed for their treating (drug repositioning) provides a new approach to developing drugs at a lower cost, faster, and more secured. We proposed a method for drug repositioning which can predict simple and complex relationships between drugs, drug targets, and diseases. Since biological networks typically present a suitable model for relationships between different biological concepts, our primary approach is to analyze graphs and complex networks in the study of drugs and their therapeutic effects. Given the nature of existing data, the use of semi-supervised learning methods is crucial. So, in our research, we have developed a label propagation method to predict drug-target, drug-disease, and disease-target interactions (Heter-LP), which integrates various data sources at different levels. The predicted interactions are the most prominent relationships among the millions of relationships suggested to the related researchers for further investigation. The main advantages of Heter-LP are the effective integration of input data, eliminating the need for negative samples, and the use of local and global features together. The main steps of this research are as follows. The first step is the construction of a heterogeneous network as a data modeling task, in which data are collected and prepared. The second step is predicting potential interactions. We present a new label propagation algorithm for heterogeneous networks, which consists of two parts, one mapping and the other an iterative method for determining the final labels of the entire network vertices. Finally, for evaluation, we calculated the AUC and AUPR with tenfold cross-validation and compared the results with the best available methods for label propagation in heterogeneous networks and drug repositioning. Also, a series of experimental evaluations and some specific case studies have been presented. The result of the AUC and AUPR for Heter-LP was much higher than the average of the best available methods.
Affiliation(s)
- Maryam Lotfi Shahreza: Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
- Nasser Ghadiri: Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
- James R Green: Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
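The iterative label propagation step mentioned in the abstract above can be illustrated with a generic sketch. This is not the Heter-LP algorithm itself: it is a plain single-network propagation of the form F ← αSF + (1 − α)Y on a symmetrically normalized adjacency matrix, and the toy graph, the `alpha` value, and the function names `normalize` and `propagate` are all assumptions made for illustration.

```python
from math import sqrt


def normalize(adj):
    """Symmetric normalization S = D^(-1/2) A D^(-1/2) of a weighted adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [
        [
            adj[i][j] / sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def propagate(adj, seeds, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y until the scores stop changing.

    adj   : square list-of-lists adjacency matrix (hypothetical toy input)
    seeds : initial labels Y, e.g. 1.0 for a known association, 0.0 otherwise
    alpha : assumed weight given to neighbours versus the initial labels
    """
    s = normalize(adj)
    n = len(adj)
    scores = list(seeds)
    for _ in range(max_iter):
        updated = [
            alpha * sum(s[i][j] * scores[j] for j in range(n)) + (1 - alpha) * seeds[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Toy path graph 0 - 1 - 2 - 3 with a single labelled node (node 0).
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_scores = propagate(toy_adj, seeds=[1.0, 0.0, 0.0, 0.0])
```

In this toy example the score decays with distance from the labelled node, which is the behaviour that lets unlabelled vertices be ranked without explicit negative samples.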
|
31
|
Alberga D, Trisciuzzi D, Montaruli M, Leonetti F, Mangiatordi GF, Nicolotti O. A New Approach for Drug Target and Bioactivity Prediction: The Multifingerprint Similarity Search Algorithm (MuSSeL). J Chem Inf Model 2018; 59:586-596. [PMID: 30485097 DOI: 10.1021/acs.jcim.8b00698] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Indexed: 01/13/2023]
Abstract
We present MuSSeL, a multifingerprint similarity search algorithm, able to predict putative drug targets for a given query small molecule as well as to return a quantitative assessment of its bioactivity in terms of Ki or IC50 values. Predictions are automatically made exploiting a large collection of high quality experimental bioactivity data available from ChEMBL (version 22.1) combining, in a consensus-like approach, predictions resulting from a similarity search performed using 13 different fingerprint definitions. Importantly, the herein proposed algorithm is also effective in detecting and handling activity cliffs. A calibration set including small molecules present in the last updated version of ChEMBL (version 23) was employed to properly tune the algorithm parameters. Three randomly built external sets were instead challenged for model performances. The potential use of MuSSeL was also challenged by a prospective exercise for the prediction of five bioactive compounds taken from articles published in the Journal of Medicinal Chemistry just few months ago. The paper emphasizes the importance of implementing multifingerprint consensus strategies to increase the confidence in prediction of similarity search algorithms and provides a fast and easy-to-run tool for drug target and bioactivity prediction.
Affiliation(s)
- Domenico Alberga: Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Daniela Trisciuzzi: Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Michele Montaruli: Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Francesco Leonetti: Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Giuseppe Felice Mangiatordi: Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
- Orazio Nicolotti: Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
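The consensus idea in the MuSSeL abstract above, combining similarity searches run with several fingerprint definitions, can be sketched roughly as follows. This is only an assumed illustration: the abstract does not specify how MuSSeL combines its 13 fingerprints, so the simple averaging of Tanimoto similarities, the set-based fingerprints, the names `fpA` and `fpB`, and the toy library are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def consensus_similarity(query_fps, ref_fps):
    """Average the Tanimoto similarity over every fingerprint type both molecules have.

    Plain averaging is an assumption here, not the published consensus rule.
    """
    shared = query_fps.keys() & ref_fps.keys()
    if not shared:
        return 0.0
    return sum(tanimoto(query_fps[k], ref_fps[k]) for k in shared) / len(shared)


def predict_targets(query_fps, library, top_k=1):
    """Rank reference molecules by consensus similarity and return their targets."""
    scored = [(entry["target"], consensus_similarity(query_fps, entry["fps"])) for entry in library]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical reference library: each molecule has two fingerprint types.
toy_library = [
    {"target": "T1", "fps": {"fpA": {1, 2, 3, 4}, "fpB": {10, 11}}},
    {"target": "T2", "fps": {"fpA": {7, 8, 9}, "fpB": {10, 12, 13}}},
]
toy_query = {"fpA": {1, 2, 3}, "fpB": {10, 11, 12}}
toy_prediction = predict_targets(toy_query, toy_library)
```

A real workflow would compute the fingerprints with a cheminformatics toolkit and would also carry the neighbours' measured Ki or IC50 values forward, which this sketch omits.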
|
32
|
Zhang X, Yin J, Zhang X. A Semi-Supervised Learning Algorithm for Predicting Four Types MiRNA-Disease Associations by Mutual Information in a Heterogeneous Network. Genes (Basel) 2018; 9:genes9030139. [PMID: 29498680 PMCID: PMC5867860 DOI: 10.3390/genes9030139] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 12/31/2017] [Revised: 02/20/2018] [Accepted: 02/22/2018] [Indexed: 01/05/2023] Open
Abstract
Increasing evidence suggests that dysregulation of microRNAs (miRNAs) may lead to a variety of diseases. Therefore, identifying disease-related miRNAs is a crucial problem. Currently, many computational approaches have been proposed to predict binary miRNA-disease associations. In this study, in order to predict underlying miRNA-disease association types, a semi-supervised model called the network-based label propagation algorithm is proposed to infer multiple types of miRNA-disease associations (NLPMMDA) by mutual information derived from the heterogeneous network. The NLPMMDA method integrates disease semantic similarity, miRNA functional similarity, and Gaussian interaction profile kernel similarity information of miRNAs and diseases to construct a heterogeneous network. NLPMMDA is a semi-supervised model which does not require verified negative samples. Leave-one-out cross validation (LOOCV) was implemented for four known types of miRNA-disease associations and demonstrated the reliable performance of our method. Moreover, case studies of lung cancer and breast cancer confirmed effective performance of NLPMMDA to predict novel miRNA-disease associations and their association types.
Affiliation(s)
- Xiaotian Zhang: School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
- Jian Yin: School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
- Xu Zhang: School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
|
33
|
Lotfi Shahreza M, Ghadiri N, Mousavi SR, Varshosaz J, Green JR. Heter-LP: A heterogeneous label propagation algorithm and its application in drug repositioning. J Biomed Inform 2017; 68:167-183. [DOI: 10.1016/j.jbi.2017.03.006] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/06/2016] [Revised: 02/09/2017] [Accepted: 03/10/2017] [Indexed: 12/14/2022]
|