1
Choudhury C, Arul Murugan N, Deva Priyakumar U. Structure-based drug repurposing: traditional and advanced AI/ML-aided methods. Drug Discov Today 2022; 27:1847-1861. [PMID: 35301148 PMCID: PMC8920090 DOI: 10.1016/j.drudis.2022.03.006] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 10/04/2021] [Revised: 02/16/2022] [Accepted: 03/10/2022] [Indexed: 02/08/2023]
Abstract
The current global health emergency in the form of the coronavirus disease 2019 (COVID-19) pandemic has highlighted the need for fast, accurate, and efficient drug discovery pipelines. Traditional drug discovery projects relying on in vitro high-throughput screening (HTS) involve large investments and sophisticated experimental set-ups, affordable only to big biopharmaceutical companies. In this scenario, application of efficient state-of-the-art computational methods and modern artificial intelligence (AI)-based algorithms for rapid screening of repurposable chemical space [approved drugs and natural products (NPs) with proven pharmacokinetic profiles] to identify the initial leads is a powerful option to save resources and time. Structure-based drug repurposing is a popular in silico repurposing approach. In this review, we discuss traditional and modern AI-based computational methods and tools applied at various stages of structure-based drug discovery (SBDD) pipelines. Additionally, we highlight the role of generative models in generating molecules with scaffolds from repurposable chemical space. Teaser: This review highlights the importance of repurposable chemical space, and the contributions of conventional in silico approaches and modern machine-learning algorithms for rapid structure-based drug repurposing.
Affiliation(s)
- Chinmayee Choudhury
  - Department of Experimental Medicine and Biotechnology, Postgraduate Institute of Medical Education and Research, Sector-12, Chandigarh 160012, India
- N Arul Murugan
  - Department of Computer Science, School of Electrical Engineering and Computer Sciences, KTH Royal Institute of Technology, S-100 44, Stockholm, Sweden; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi 110020, India.
- U Deva Priyakumar
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
2
Sun J, Lu Y, Cui L, Fu Q, Wu H, Chen J. A Method of Optimizing Weight Allocation in Data Integration Based on Q-Learning for Drug-Target Interaction Prediction. Front Cell Dev Biol 2022; 10:794413. [PMID: 35356288 PMCID: PMC8959213 DOI: 10.3389/fcell.2022.794413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Accepted: 02/14/2022] [Indexed: 11/26/2022]
Abstract
Calculating and predicting drug-target interactions (DTIs) is a crucial step in novel drug discovery. Many models have improved DTI prediction performance by fusing heterogeneous information, such as drug chemical structure and target protein sequence. However, allocating the weights of the heterogeneous information reasonably during fusion remains a major challenge. In this paper, we propose a model based on the Q-learning algorithm and Neighborhood Regularized Logistic Matrix Factorization (QLNRLMF) to predict DTIs. First, we obtain three different drug-drug similarity matrices and three different target-target similarity matrices by using different similarity calculation methods based on heterogeneous data, including drug chemical structure, target protein sequence and drug-target interactions. Then, we initialize a set of weights for the drug-drug similarity matrices and the target-target similarity matrices respectively, and optimize them through the Q-learning algorithm. When the optimal weights are obtained, a new drug-drug similarity matrix and a new target-target similarity matrix are obtained by linear combination. Finally, the drug-target interaction matrix, the new drug-drug similarity matrix and the new target-target similarity matrix are used as inputs to the Neighborhood Regularized Logistic Matrix Factorization (NRLMF) model for DTI prediction. Compared with six existing methods (NetLapRLS, BLM-NII, WNN-GIP, KBMF2K, CMF, and NRLMF), our proposed method achieves better performance on the four benchmark datasets: enzymes (E), nuclear receptors (NR), ion channels (IC) and G protein coupled receptors (GPCR).
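The linear-combination step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `combine_similarities` is hypothetical, the weights are passed in as fixed numbers rather than learned by Q-learning, and normalizing the weights to sum to 1 is an assumption made here for illustration.

```python
def combine_similarities(matrices, weights):
    """Fuse same-shaped square similarity matrices by a weighted linear combination.

    Assumption: weights are non-negative and are normalized to sum to 1 here;
    in the abstract they would come from a Q-learning search, which this
    sketch does not implement.
    """
    if len(matrices) != len(weights):
        raise ValueError("need exactly one weight per similarity matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    normalized = [w / total for w in weights]

    n = len(matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(matrices, normalized):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * matrix[i][j]
    return combined


# Hypothetical toy input: two 2x2 drug-drug similarity matrices.
structure_sim = [[1.0, 0.0], [0.0, 1.0]]
interaction_sim = [[1.0, 1.0], [1.0, 1.0]]
fused = combine_similarities([structure_sim, interaction_sim], [1, 3])
```

The same function would be called once for the drug-drug matrices and once for the target-target matrices, and the two fused matrices would then be handed to the downstream factorization model.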
Affiliation(s)
- Jiacheng Sun
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
  - Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, China
  - Suzhou Key Laboratory of Mobile Network Technology and Application, Suzhou University of Science and Technology, Suzhou, China
- You Lu
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
  - Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, China
  - Suzhou Key Laboratory of Mobile Network Technology and Application, Suzhou University of Science and Technology, Suzhou, China
  - *Correspondence: You Lu; Jianping Chen
- Linqian Cui
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
  - Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, China
  - Suzhou Key Laboratory of Mobile Network Technology and Application, Suzhou University of Science and Technology, Suzhou, China
- Qiming Fu
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
  - Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, China
  - Suzhou Key Laboratory of Mobile Network Technology and Application, Suzhou University of Science and Technology, Suzhou, China
- Hongjie Wu
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jianping Chen
  - Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, China
  - School of Architecture and Urban Planning, Suzhou University of Science and Technology, Suzhou, China
  - *Correspondence: You Lu; Jianping Chen
3
Zhang P, Wei Z, Che C, Jin B. DeepMGT-DTI: Transformer network incorporating multilayer graph information for Drug-Target interaction prediction. Comput Biol Med 2022; 142:105214. [PMID: 35030496 DOI: 10.1016/j.compbiomed.2022.105214] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 09/01/2021] [Revised: 12/26/2021] [Accepted: 01/02/2022] [Indexed: 12/29/2022]
Abstract
Drug-target interaction (DTI) prediction reduces the cost and time of drug development, and plays a vital role in drug discovery. However, most research does not fully explore the molecular structures of drug compounds in DTI prediction. To this end, we propose a deep learning model to capture the molecular structure information of drug compounds for DTI prediction. This model utilizes a transformer network incorporating multilayer graph information, which captures the features of a drug's molecular structure so that the interactions between atoms of drug compounds can be explored more deeply. At the same time, a convolutional neural network is employed to capture the local residue information in the target sequence and effectively extract the feature information of the target. Experiments on the DrugBank dataset showed that the proposed model outperformed previous models based on the structure of target sequences. The results indicate that the improved transformer network fuses the feature information between layers in the graph convolutional neural network and extracts the interaction data for the molecular structure. Drug repositioning experiments on COVID-19 and Alzheimer's disease demonstrated the proposed model's ability to find therapeutic drugs in drug discovery. The code of our model is available at https://github.com/zhangpl109/DeepMGT-DTI.
Affiliation(s)
- Peiliang Zhang
  - Key Laboratory of Advanced Design and Intelligent Computing (Dalian University), Ministry of Education, Dalian, 116622, China.
- Ziqi Wei
  - School of Software, Tsinghua University, Beijing, 100084, China.
- Chao Che
  - Key Laboratory of Advanced Design and Intelligent Computing (Dalian University), Ministry of Education, Dalian, 116622, China.
- Bo Jin
  - School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian, 116024, China.
4
Identification of drug-target interactions via multi-view graph regularized link propagation model. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.05.100] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 02/06/2023]
5
Najm M, Azencott CA, Playe B, Stoven V. Drug Target Identification with Machine Learning: How to Choose Negative Examples. Int J Mol Sci 2021; 22:ijms22105118. [PMID: 34066072 PMCID: PMC8151112 DOI: 10.3390/ijms22105118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2021] [Revised: 04/30/2021] [Accepted: 05/07/2021] [Indexed: 11/24/2022]
Abstract
Identification of the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search, limiting the number of required experiments. However, the drug-target interaction databases used for training present high statistical bias, leading to a high number of false positives and thus increasing the time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, so that each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs, and more globally for 200 approved drugs. For the three detailed drug examples, and for the larger set of 200 drugs, training with the proposed scheme for the choice of negative examples improved target prediction results: the average number of false positives among the top-ranked predicted targets decreased and, overall, the rank of the true targets improved. Our method corrects the databases' statistical bias and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken.
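The balancing constraint stated in this abstract (each drug and each protein appearing equally often among positives and negatives) can be sketched with a simple greedy sampler. This is an illustrative approximation, not the scheme from the paper: the function name `balanced_negatives` is hypothetical, every pair absent from the positive list is assumed to be an admissible negative, and a greedy pass may stop short of exact balance on some inputs.

```python
from collections import Counter


def balanced_negatives(positives):
    """Greedily pick negative (drug, protein) pairs that mirror positive counts.

    Assumption: any pair absent from `positives` may serve as a negative.
    Each drug and protein gets a quota equal to its number of positive pairs;
    the loop stops when all quotas are met or no admissible pair remains.
    """
    known = set(positives)
    drug_quota = Counter(d for d, _ in known)
    prot_quota = Counter(p for _, p in known)
    negatives, chosen = [], set()

    while True:
        # Visit drugs with the largest unmet quota first (ties broken by name).
        drugs = sorted((d for d in drug_quota if drug_quota[d] > 0),
                       key=lambda d: (-drug_quota[d], d))
        placed = False
        for d in drugs:
            candidates = sorted(
                (p for p in prot_quota
                 if prot_quota[p] > 0 and (d, p) not in known and (d, p) not in chosen),
                key=lambda p: (-prot_quota[p], p))
            if candidates:
                p = candidates[0]
                negatives.append((d, p))
                chosen.add((d, p))
                drug_quota[d] -= 1
                prot_quota[p] -= 1
                placed = True
                break
        if not placed:
            return negatives


# Hypothetical toy input with two known interactions.
toy_positives = [("d1", "p1"), ("d2", "p2")]
toy_negatives = balanced_negatives(toy_positives)
```

On the toy input, each drug and each protein ends up in exactly one positive and one negative pair, which is the balance property the abstract describes.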
Affiliation(s)
- Matthieu Najm
  - Center for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
  - Institut Curie, 75248 Paris, France
  - INSERM U900, 75428 Paris, France
  - Correspondence:
- Chloé-Agathe Azencott
  - Center for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
  - Institut Curie, 75248 Paris, France
  - INSERM U900, 75428 Paris, France
- Benoit Playe
  - Center for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
  - Institut Curie, 75248 Paris, France
  - INSERM U900, 75428 Paris, France
- Véronique Stoven
  - Center for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
  - Institut Curie, 75248 Paris, France
  - INSERM U900, 75428 Paris, France
6
Vatansever S, Schlessinger A, Wacker D, Kaniskan HÜ, Jin J, Zhou M, Zhang B. Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: State-of-the-arts and future directions. Med Res Rev 2021; 41:1427-1473. [PMID: 33295676 PMCID: PMC8043990 DOI: 10.1002/med.21764] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Received: 07/23/2020] [Revised: 10/30/2020] [Accepted: 11/20/2020] [Indexed: 01/11/2023]
Abstract
Neurological disorders significantly outnumber diseases in other therapeutic areas. However, developing drugs for central nervous system (CNS) disorders remains the most challenging area in drug discovery, accompanied with the long timelines and high attrition rates. With the rapid growth of biomedical data enabled by advanced experimental technologies, artificial intelligence (AI) and machine learning (ML) have emerged as an indispensable tool to draw meaningful insights and improve decision making in drug discovery. Thanks to the advancements in AI and ML algorithms, now the AI/ML-driven solutions have an unprecedented potential to accelerate the process of CNS drug discovery with better success rate. In this review, we comprehensively summarize AI/ML-powered pharmaceutical discovery efforts and their implementations in the CNS area. After introducing the AI/ML models as well as the conceptualization and data preparation, we outline the applications of AI/ML technologies to several key procedures in drug discovery, including target identification, compound screening, hit/lead generation and optimization, drug response and synergy prediction, de novo drug design, and drug repurposing. We review the current state-of-the-art of AI/ML-guided CNS drug discovery, focusing on blood-brain barrier permeability prediction and implementation into therapeutic discovery for neurological diseases. Finally, we discuss the major challenges and limitations of current approaches and possible future directions that may provide resolutions to these difficulties.
Affiliation(s)
- Sezen Vatansever
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Avner Schlessinger
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Daniel Wacker
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- H. Ümit Kaniskan
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Jian Jin
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Ming-Ming Zhou
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Bin Zhang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
7
Wang Y, Yang Y, Chen S, Wang J. DeepDRK: a deep learning framework for drug repurposing through kernel-based multi-omics integration. Brief Bioinform 2021; 22:6210072. [PMID: 33822890 DOI: 10.1093/bib/bbab048] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 08/28/2020] [Revised: 01/16/2021] [Accepted: 01/30/2021] [Indexed: 12/11/2022]
Abstract
Recent pharmacogenomic studies that generate sequencing data coupled with pharmacological characteristics for patient-derived cancer cell lines have led to large amounts of multi-omics data for precision cancer medicine. Among the various obstacles hindering clinical translation, the lack of effective methods for multimodal and multisource data integration is becoming a bottleneck. Here we propose DeepDRK, a machine learning framework for deciphering drug response through kernel-based data integration. To transfer information among different drugs and cancer types, we trained deep neural networks on more than 20 000 pan-cancer cell line-anticancer drug pairs. These pairs were characterized by kernel-based similarity matrices integrating multisource and multi-omics data including genomics, transcriptomics, epigenomics, chemical properties of compounds and known drug-target interactions. Applied to benchmark cancer cell line datasets, our model surpassed previous approaches with higher accuracy and better robustness. We then applied our model to newly established patient-derived cancer cell lines and achieved satisfactory performance, with an AUC of 0.84 and an AUPRC of 0.77. Moreover, DeepDRK was used to predict the clinical response of cancer patients. Notably, the predictions of DeepDRK correlated well with the clinical outcome of patients and revealed multiple drug repurposing candidates. In sum, DeepDRK provides a computational method to predict the drug response of cancer cells by integrating pharmacogenomic datasets, offering an alternative way to prioritize repurposing drugs in precision cancer treatment. DeepDRK is freely available via https://github.com/wangyc82/DeepDRK.
Affiliation(s)
- Yongcui Wang
  - Key Laboratory of Adaptation and Evolution of Plateau Biota at Northwest Institute of Plateau Biology, Chinese Academy of Sciences, China
- Yingxi Yang
  - Department of Chemical and Biological Engineering at The Hong Kong University of Science and Technology, China
- Shilong Chen
  - Key Laboratory of Adaptation and Evolution of Plateau Biota at Institute of Sanjiangyuan National Park, Chinese Academy of Sciences, China
- Jiguang Wang
  - Division of Life Science, Department of Chemical and Biological Engineering, and State Key Laboratory of Molecular Neuroscience at The Hong Kong University of Science and Technology, China
8
Patel L, Shukla T, Huang X, Ussery DW, Wang S. Machine Learning Methods in Drug Discovery. Molecules 2020; 25:E5277. [PMID: 33198233 PMCID: PMC7696134 DOI: 10.3390/molecules25225277] [Citation(s) in RCA: 118] [Impact Index Per Article: 29.5] [Received: 10/15/2020] [Revised: 11/04/2020] [Accepted: 11/09/2020] [Indexed: 12/30/2022]
Abstract
Advances in information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In drug discovery and development, machine learning techniques have been used to develop novel drug candidates. Methods for designing drug targets and discovering novel drugs now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of the outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high-throughput computational analysis of databases used for both lead and target discovery, have increased the reliability of techniques that incorporate machine learning and deep learning. The use of virtual screening and of the associated online information has also been highlighted in developing lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques will be discussed. The applications and methods that produce promising results will be reviewed.
Affiliation(s)
- Lauv Patel
  - Chemistry Department, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
- Tripti Shukla
  - Chemistry Department, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
- Xiuzhen Huang
  - Department of Computer Science, Arkansas State University, Jonesboro, AR 72467, USA
- David W. Ussery
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
- Shanzhi Wang
  - Chemistry Department, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
9
Identification of Drug–Target Interactions via Dual Laplacian Regularized Least Squares with Multiple Kernel Fusion. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106254] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Indexed: 12/29/2022]
10
Cao DS, Jiang SL, Guan YD, Chen XS, Zhang LX, Zhang Y, Chen AF, Yang JM, Cheng Y. A multi-scale systems pharmacology approach uncovers the anti-cancer molecular mechanism of Ixabepilone. Eur J Med Chem 2020; 199:112421. [PMID: 32428794 DOI: 10.1016/j.ejmech.2020.112421] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/29/2020] [Revised: 04/29/2020] [Accepted: 05/03/2020] [Indexed: 12/21/2022]
Abstract
It has been realized that FDA-approved drugs may have more molecular targets than is commonly thought. Thus, finding the exact drug-target interactions (DTIs) is of great significance for exploring new molecular mechanisms of drugs. Here, we developed a multi-scale system pharmacology (MSSP) method for the large-scale prediction of DTIs. We used MSSP to integrate drug-related and target-related data from multiple levels, together with the network structural data formed by known drug-target relationships, to predict likely unknown DTIs. Prediction results revealed that Ixabepilone, an epothilone B analog for treating breast cancer patients, may target Bcl-2, an oncogene that contributes to tumor progression and therapy resistance by inhibiting apoptosis. Furthermore, we demonstrated that Ixabepilone could bind to Bcl-2 and decrease its protein expression in breast cancer cells. The down-regulation of Bcl-2 by Ixabepilone results from promoting its degradation by affecting p-Bcl-2. We further found that Ixabepilone could induce autophagy by releasing Beclin1 from the Beclin1/Bcl-2 complex. Inhibition of autophagy by knockdown of Beclin1 or by a pharmacological inhibitor augmented apoptosis, thus enhancing the antitumor efficacy of Ixabepilone against breast cancer cells in vitro and in vivo. In addition, Ixabepilone also decreases Bcl-2 protein expression and induces cytoprotective autophagy in human hepatic carcinoma and glioma cells. In conclusion, this study not only provides a feasible and alternative way of exploring new molecular mechanisms of drugs by combining computational DTI prediction, but also reveals an effective strategy to reinforce the antitumor efficacy of Ixabepilone.
Affiliation(s)
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410008, China.
- Shi-Long Jiang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410008, China; Department of Pharmacy, The Second Xiangya Hospital, Central South University, Changsha, Hunan, 410011, China
- Yi-Di Guan
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410008, China; Department of Pharmacy, The Second Xiangya Hospital, Central South University, Changsha, Hunan, 410011, China
- Xi-Sha Chen
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410008, China; Department of Pharmacy, The Second Xiangya Hospital, Central South University, Changsha, Hunan, 410011, China
- Liu-Xia Zhang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410008, China
- Yi Zhang
  - Department of Pharmacology, College of Pharmaceutical Sciences, Soochow University, Suzhou, 215000, China
- Alex F Chen
  - Center for Vascular Disease and Translational Medicine, The Third Xiangya Hospital of Central South University, Changsha, 410013, PR China
- Jin-Ming Yang
  - Department of Cancer Biology and Toxicology, College of Medicine, Markey Cancer Center, University of Kentucky, Lexington, KY, 40536, USA
- Yan Cheng
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410008, China; Department of Pharmacy, The Second Xiangya Hospital, Central South University, Changsha, Hunan, 410011, China.
11
Wang YB, You ZH, Yang S, Yi HC, Chen ZH, Zheng K. A deep learning-based method for drug-target interaction prediction based on long short-term memory neural network. BMC Med Inform Decis Mak 2020; 20:49. [PMID: 32183788 PMCID: PMC7079345 DOI: 10.1186/s12911-020-1052-0] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 01/07/2023]
Abstract
Background: The key to modern drug discovery is to find, identify and prepare drug molecular targets. However, because of limits on throughput, precision and cost, traditional experimental methods are difficult to apply widely to infer potential Drug-Target Interactions (DTIs). Therefore, it is urgent to develop effective computational methods to validate the interaction between drugs and targets. Methods: We developed a deep learning-based model for DTI prediction. Protein evolutionary features are extracted via the Position Specific Scoring Matrix (PSSM) and Legendre Moment (LM) and combined with drug molecular substructure fingerprints to form feature vectors of drug-target pairs. We then used Sparse Principal Component Analysis (SPCA) to compress the features of drugs and proteins into a uniform vector space. Lastly, a deep long short-term memory (DeepLSTM) network was constructed to carry out prediction. Results: The experimental results show a significant improvement in DTI prediction performance, with AUCs of 0.9951, 0.9705, 0.9951 and 0.9206, respectively, on four important classes of drug-target datasets. Further experiments provide preliminary evidence that the proposed characterization scheme has a great advantage in feature expression and recognition. We have also shown that the proposed method can work well with small datasets. Conclusion: The results demonstrate that the proposed approach has a great advantage over state-of-the-art drug-target predictors. To the best of our knowledge, this study is the first to test the potential of deep learning methods with memory and Turing completeness in DTI prediction.
Affiliation(s)
- Yan-Bin Wang
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China; Department of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhu-Hong You
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China.
- Shan Yang
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- Hai-Cheng Yi
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China; Department of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhan-Heng Chen
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China; Department of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Kai Zheng
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
12
Luo H, Li M, Yang M, Wu FX, Li Y, Wang J. Biomedical data and computational models for drug repositioning: a comprehensive review. Brief Bioinform 2020; 22:1604-1619. [PMID: 32043521 DOI: 10.1093/bib/bbz176] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Received: 10/23/2019] [Revised: 12/07/2019] [Accepted: 12/26/2019] [Indexed: 12/16/2022]
Abstract
Drug repositioning can drastically decrease the cost and duration taken by traditional drug research and development while avoiding the occurrence of unforeseen adverse events. With the rapid advancement of high-throughput technologies and the explosion of various biological data and medical data, computational drug repositioning methods have been appealing and powerful techniques to systematically identify potential drug-target interactions and drug-disease interactions. In this review, we first summarize the available biomedical data and public databases related to drugs, diseases and targets. Then, we discuss existing drug repositioning approaches and group them based on their underlying computational models consisting of classical machine learning, network propagation, matrix factorization and completion, and deep learning based models. We also comprehensively analyze common standard data sets and evaluation metrics used in drug repositioning, and give a brief comparison of various prediction methods on the gold standard data sets. Finally, we conclude our review with a brief discussion on challenges in computational drug repositioning, which includes the problem of reducing the noise and incompleteness of biomedical data, the ensemble of various computation drug repositioning methods, the importance of designing reliable negative samples selection methods, new techniques dealing with the data sparseness problem, the construction of large-scale and comprehensive benchmark data sets and the analysis and explanation of the underlying mechanisms of predicted interactions.
Affiliation(s)
- Huimin Luo
  - School of Computer Science and Engineering at Central South University
- Min Li
  - School of Computer Science and Engineering at Central South University
- Mengyun Yang
  - School of Computer Science and Engineering at Central South University
- Fang-Xiang Wu
  - College of Engineering and the Department of Computer Science at University of Saskatchewan, Saskatoon, Canada
- Yaohang Li
  - Department of Computer Science at Old Dominion University, Norfolk, USA
- Jianxin Wang
  - School of Computer Science and Engineering at Central South University
13
Chu Y, Kaushik AC, Wang X, Wang W, Zhang Y, Shan X, Salahub DR, Xiong Y, Wei DQ. DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform 2019; 22:451-462. [PMID: 31885041 DOI: 10.1093/bib/bbz152] [Citation(s) in RCA: 100] [Impact Index Per Article: 20.0] [Received: 06/16/2019] [Revised: 11/01/2019] [Accepted: 11/04/2019] [Indexed: 12/18/2022]
Abstract
Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, current DTI prediction approaches suffer from low precision and a high false-positive rate. In this study, we aim to develop a novel DTI prediction method, named DTI-CDF, that improves prediction performance using a cascade deep forest (CDF) model with multiple similarity-based features between drugs and similarity-based features between target proteins, extracted from the heterogeneous graph that contains known DTIs. In the experiments, we built five replicates of 10-fold cross-validation under three different experimental settings, in which the DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) are missing from the training sets but present in the test sets. The experimental results demonstrate that DTI-CDF achieves significantly higher performance than traditional ensemble learning-based methods such as random forest and XGBoost, deep neural networks, and state-of-the-art methods such as DDR. Furthermore, 1352 newly predicted DTIs are confirmed by the KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiangeng Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Wei Wang
- Mathematical Sciences, Shanghai Jiao Tong University
- Yufang Zhang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
|
14
|
Ding Y, Tang J, Guo F. Identification of Drug-Side Effect Association via Semisupervised Model and Multiple Kernel Learning. IEEE J Biomed Health Inform 2019; 23:2619-2632. [DOI: 10.1109/jbhi.2018.2883834] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Indexed: 11/09/2022]
|
15
|
Ding Y, Tang J, Guo F. Identification of drug–target interactions via fuzzy bipartite local model. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04569-z] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Indexed: 12/31/2022]
|
16
|
Nascimento ACA, Prudêncio RBC, Costa IG. A Drug-Target Network-Based Supervised Machine Learning Repurposing Method Allowing the Use of Multiple Heterogeneous Information Sources. Methods Mol Biol 2019; 1903:281-289. [PMID: 30547449 DOI: 10.1007/978-1-4939-8955-3_17] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 01/21/2023]
Abstract
Drug-target networks have an important role in pharmaceutical innovation, drug lead discovery, and recent drug repositioning tasks. Many different in silico approaches for the identification of new drug-target interactions have been proposed, many of them based on a particular class of machine learning algorithms called kernel methods. These pattern classification algorithms are able to incorporate previous knowledge in the form of similarity functions, i.e., a kernel, and they have been successful in a wide range of supervised learning problems. The selection of the right kernel function and its respective parameters can have a large influence on the performance of the classifier. Recently, multiple kernel learning algorithms have been introduced to address this problem, enabling one to combine multiple kernels into large drug-target interaction spaces in order to integrate multiple sources of biological information simultaneously. The Kronecker regularized least squares with multiple kernel learning (KronRLS-MKL) is a machine learning algorithm that aims at integrating heterogeneous information sources into a single chemogenomic space to predict new drug-target interactions. This chapter describes how to obtain data from heterogeneous sources and how to implement and use KronRLS-MKL to predict new interactions.
Affiliation(s)
- Ivan G Costa
- Institute for Computational Genomics, Centre of Medical Technology (MTZ), RWTH Aachen University Medical School, Aachen, Germany
|
17
|
Identification of drug-side effect association via multiple information integration with centered kernel alignment. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.10.028] [Citation(s) in RCA: 148] [Impact Index Per Article: 29.6] [Indexed: 01/01/2023]
|
18
|
Sharma A, Rani R. BE-DTI': Ensemble framework for drug target interaction prediction using dimensionality reduction and active learning. Comput Methods Programs Biomed 2018; 165:151-162. [PMID: 30337070 DOI: 10.1016/j.cmpb.2018.08.011] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/12/2018] [Revised: 08/03/2018] [Accepted: 08/17/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND OBJECTIVE Drug-target interaction prediction plays an intrinsic role in the drug discovery process. Prediction of novel drugs and targets helps in identifying optimal drug therapies for various stringent diseases. Computational prediction of drug-target interactions can help to identify potential drug-target pairs and speed up the process of drug repositioning. In our present work, we have focused on machine learning algorithms for predicting drug-target interactions from the pool of existing drug-target data. The key idea is to train the classifier using existing DTIs so as to predict new or unknown DTIs. However, there are various challenges, such as class imbalance and the high-dimensional nature of the data, that need to be addressed before developing an optimal drug-target interaction model. METHODS In this paper, we propose a bagging-based ensemble framework named BE-DTI' for drug-target interaction prediction using dimensionality reduction and active learning to deal with class-imbalanced data. Active learning helps to improve under-sampling bagging-based ensembles. Dimensionality reduction is used to deal with high-dimensional data. RESULTS Results show that the proposed technique outperforms the other five competing methods in 10-fold cross-validation experiments, with AUC=0.927, Sensitivity=0.886, Specificity=0.864, and G-mean=0.874. CONCLUSION Missing interactions and new interactions are predicted using the proposed framework. Some of the known interactions are removed from the original dataset and their interactions are recalculated to check the accuracy of the proposed framework. Moreover, validation of the proposed approach is performed using the external dataset. All these results show that structurally similar drugs tend to interact with similar targets.
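The abstract above reports sensitivity, specificity, and G-mean without defining them. As a minimal sketch, assuming the usual definition of G-mean as the geometric mean of sensitivity and specificity (the abstract does not state which definition or averaging scheme the authors used), the three metrics can be computed from binary labels as follows. The function name and the toy labels are illustrative only.

```python
import math


def sensitivity_specificity_gmean(y_true, y_pred):
    """Return (sensitivity, specificity, G-mean) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    # Assumed definition: geometric mean of the two class-wise recalls.
    return sens, spec, math.sqrt(sens * spec)


# Toy, made-up labels: 4 interacting pairs and 6 non-interacting pairs.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(sensitivity_specificity_gmean(y_true, y_pred))
```

Because G-mean drops to zero when either class is entirely misclassified, it is a common choice for class-imbalanced problems like the one the abstract describes.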
Affiliation(s)
- Aman Sharma
- Computer Science and Engineering Department, Thapar Institute of Engineering & Technology, Punjab, Patiala, India.
- Rinkle Rani
- Computer Science and Engineering Department, Thapar Institute of Engineering & Technology, Punjab, Patiala, India.
|
19
|
Huang G, Li J, Zhao C. Computational Prediction and Analysis of Associations between Small Molecules and Binding-Associated S-Nitrosylation Sites. Molecules 2018; 23:954. [PMID: 29671802 PMCID: PMC6017196 DOI: 10.3390/molecules23040954] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/24/2018] [Revised: 03/30/2018] [Accepted: 04/09/2018] [Indexed: 01/12/2023]
Abstract
Interactions between drugs and proteins occupy a central position during the process of drug discovery and development. Numerous methods have recently been developed for identifying drug–target interactions, but few have been devoted to finding interactions between post-translationally modified proteins and drugs. We presented a machine learning-based method for identifying associations between small molecules and binding-associated S-nitrosylated (SNO-) proteins. Namely, small molecules were encoded by molecular fingerprint, SNO-proteins were encoded by the information entropy-based method, and the random forest was used to train a classifier. Ten-fold and leave-one-out cross validations achieved, respectively, 0.7235 and 0.7490 of the area under a receiver operating characteristic curve. Computational analysis of similarity suggested that SNO-proteins associated with the same drug shared statistically significant similarity, and vice versa. This method and finding are useful to identify drug–SNO associations and further facilitate the discovery and development of SNO-associated drugs.
Affiliation(s)
- Guohua Huang
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
- Jincheng Li
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
- Chenglin Zhao
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
|
20
|
Ding Y, Tang J, Guo F. Identification of drug-target interactions via multiple information integration. Inf Sci (N Y) 2017. [DOI: 10.1016/j.ins.2017.08.045] [Citation(s) in RCA: 138] [Impact Index Per Article: 19.7] [Indexed: 01/12/2023]
|
21
|
DrugECs: An Ensemble System with Feature Subspaces for Accurate Drug-Target Interaction Prediction. Biomed Res Int 2017; 2017:6340316. [PMID: 28744468 PMCID: PMC5514335 DOI: 10.1155/2017/6340316] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/04/2016] [Revised: 04/10/2017] [Accepted: 05/03/2017] [Indexed: 12/21/2022]
Abstract
Background Drug-target interaction is key in drug discovery, especially in the design of new lead compounds. However, finding a new lead compound for a specific target is complicated, hard, and prone to mistakes. Therefore, computational techniques are commonly adopted in drug design, which can save time and costs to a significant extent. Results To address the issue, a new prediction system is proposed in this work to identify drug-target interactions. First, drug-target pairs are encoded with a fragment technique and the software “PaDEL-Descriptor.” The fragment technique encodes target proteins: it divides each protein sequence into several fragments in order and encodes each fragment with several physicochemical properties of amino acids. The software “PaDEL-Descriptor” creates encoding vectors for drug molecules. Second, the dataset of drug-target pairs is resampled into several overlapping subsets, which are then input into a kNN (k-Nearest Neighbor) classifier to build an ensemble system. Conclusion Experimental results on the drug-target dataset showed that our method performs better and runs faster than the state-of-the-art predictors.
|
22
|
Zhang J, Zhu M, Chen P, Wang B. DrugRPE: Random projection ensemble approach to drug-target interaction prediction. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.10.039] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 10/20/2022]
|
23
|
SELF-BLM: Prediction of drug-target interactions via self-training SVM. PLoS One 2017; 12:e0171839. [PMID: 28192537 PMCID: PMC5305209 DOI: 10.1371/journal.pone.0171839] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 09/29/2016] [Accepted: 01/26/2017] [Indexed: 01/08/2023]
Abstract
Predicting drug-target interactions is important for the development of novel drugs and the repositioning of drugs. To predict such interactions, there are a number of methods based on drug and target protein similarity. Although these methods, such as the bipartite local model (BLM), show promise, they often categorize unknown interactions as negative interactions. Therefore, these methods are not ideal for finding potential drug-target interactions that have not yet been validated as positive interactions. Thus, here we propose a method that integrates machine learning techniques, such as self-training support vector machine (SVM) and BLM, to develop a self-training bipartite local model (SELF-BLM) that facilitates the identification of potential interactions. The method first categorizes unlabeled interactions and negative interactions among unknown interactions using a clustering method. Then, using the BLM method and self-training SVM, the unlabeled interactions are self-trained and final local classification models are constructed. When applied to four classes of proteins that include enzymes, G-protein coupled receptors (GPCRs), ion channels, and nuclear receptors, SELF-BLM showed the best performance for predicting not only known interactions but also potential interactions in three protein classes compared with other related studies. The implemented software and supporting data are available at https://github.com/GIST-CSBL/SELF-BLM.
|
24
|
Wang C, Liu J, Luo F, Hu QN. Multi-fields model for predicting target–ligand interaction. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.03.079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
25
|
Nascimento ACA, Prudêncio RBC, Costa IG. A multiple kernel learning algorithm for drug-target interaction prediction. BMC Bioinformatics 2016; 17:46. [PMID: 26801218 PMCID: PMC4722636 DOI: 10.1186/s12859-016-0890-3] [Citation(s) in RCA: 111] [Impact Index Per Article: 13.9] [Received: 07/23/2015] [Accepted: 01/05/2016] [Indexed: 12/19/2022]
Abstract
Background Drug-target networks have received a lot of attention in recent years, given their relevance for pharmaceutical innovation and drug lead discovery. Different in silico approaches have been proposed for the identification of new drug-target interactions, many of which are based on kernel methods. Despite technical advances in recent years, these methods are not able to cope with large drug-target interaction spaces or to integrate multiple sources of biological information. Results We propose KronRLS-MKL, which models the drug-target interaction problem as a link prediction task on bipartite networks. This method allows the integration of multiple heterogeneous information sources for the identification of new interactions, and can also work with networks of arbitrary size. Moreover, it automatically selects the more relevant kernels by returning weights indicating their importance in the drug-target prediction at hand. Empirical analysis on four data sets using twenty distinct kernels indicates that our method has predictive performance higher than or comparable to that of 18 competing methods in all prediction tasks. Moreover, the predicted weights reflect the predictive quality of each kernel on exhaustive pairwise experiments, which indicates the success of the method in automatically revealing relevant biological sources. Conclusions Our analysis shows that the proposed data integration strategy is able to improve the quality of the predicted interactions, and can speed up the identification of new drug-target interactions as well as identify relevant information for the task. Availability The source code and data sets are available at www.cin.ufpe.br/~acan/kronrlsmkl/.
Affiliation(s)
- André C A Nascimento
- Center of Informatics, UFPE, Recife, Brazil; Department of Statistics and Informatics, UFRPE, Recife, Brazil; IZKF Computational Biology Research Group, Institute for Biomedical Engineering, RWTH Aachen University Medical School, Aachen, Germany.
- Ivan G Costa
- Center of Informatics, UFPE, Recife, Brazil; IZKF Computational Biology Research Group, Institute for Biomedical Engineering, RWTH Aachen University Medical School, Aachen, Germany; Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Aachen, Germany.
|
26
|
Yan XY, Zhang SW, Zhang SY. Prediction of drug–target interaction by label propagation with mutual interaction information derived from heterogeneous network. Mol Biosyst 2016; 12:520-531. [DOI: 10.1039/c5mb00615e] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 12/29/2022]
Abstract
By implementing label propagation on drug/target similarity network with mutual interaction information derived from drug–target heterogeneous network, LPMIHN algorithm identifies potential drug–target interactions.
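The summary above names label propagation but gives no details. The following is a generic sketch of label propagation on a similarity network and is not the LPMIHN algorithm itself. The update rule, the damping factor `alpha`, the iteration count, and the toy similarity matrix are all assumptions made for illustration.

```python
def label_propagation(similarity, initial_labels, alpha=0.5, iterations=100):
    """Spread known interaction labels over a similarity network.

    similarity:     square matrix (list of lists) of non-negative similarities
    initial_labels: 1.0 for nodes known to interact with the target, else 0.0
    alpha:          assumed weight given to neighbours versus the initial labels
    """
    n = len(similarity)
    # Row-normalise so that each node averages over its neighbours.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([v / total if total else 0.0 for v in row])
    scores = list(initial_labels)
    for _ in range(iterations):
        scores = [
            alpha * sum(transition[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * initial_labels[i]
            for i in range(n)
        ]
    return scores


# Toy network of three drugs: drug 1 is very similar to drug 0, drug 2 is not.
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
# Only drug 0 is known to interact with the target of interest.
scores = label_propagation(similarity, [1.0, 0.0, 0.0])
print(scores)
```

In this toy case the unlabeled drug that is most similar to the known interactor ends up with the higher score, which is the behaviour such methods rely on when ranking candidate interactions.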
Affiliation(s)
- Xiao-Ying Yan
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
- Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
- Song-Yao Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
|
27
|
Peng L, Liao B, Zhu W, Li Z, Li K. Predicting Drug-Target Interactions With Multi-Information Fusion. IEEE J Biomed Health Inform 2015; 21:561-572. [PMID: 26731781 DOI: 10.1109/jbhi.2015.2513200] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Indexed: 11/09/2022]
Abstract
Identifying potential associations between drugs and targets is a critical prerequisite for modern drug discovery and repurposing. However, predicting these associations is difficult because of the limitations of existing computational methods. Most models only consider chemical structures and protein sequences, and other models are oversimplified. Moreover, datasets used for analysis contain only true-positive interactions, and experimentally validated negative samples are unavailable. To overcome these limitations, we developed a semi-supervised learning framework called NormMulInf, based on collaborative filtering theory, that uses labeled and unlabeled interaction information. The proposed method initially determines similarity measures, such as similarities among samples and local correlations among the labels of the samples, by integrating biological information. The similarity information is then integrated into a robust principal component analysis model, which is solved using augmented Lagrange multipliers. Experimental results on four classes of drug-target interaction networks suggest that the proposed approach can accurately classify and predict drug-target interactions. Some of the predicted interactions are reported in public databases. The proposed method can also predict possible targets for new drugs and can be used to determine whether atropine may interact with alpha1B- and beta1-adrenergic receptors. Furthermore, the developed technique identifies potential drugs for new targets and can be used to assess whether olanzapine and propiomazine may target 5HT2B. Finally, the proposed method can potentially address limitations on studies of multitarget drugs and multidrug targets.
|
28
|
Chen FS, Jiang ZR. Prediction of drug’s Anatomical Therapeutic Chemical (ATC) code by integrating drug–domain network. J Biomed Inform 2015; 58:80-88. [DOI: 10.1016/j.jbi.2015.09.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/19/2015] [Revised: 09/14/2015] [Accepted: 09/22/2015] [Indexed: 10/22/2022]
|
29
|
Drug-target interaction prediction from PSSM based evolutionary information. J Pharmacol Toxicol Methods 2015; 78:42-51. [PMID: 26592807 DOI: 10.1016/j.vascn.2015.11.002] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Received: 06/03/2015] [Revised: 11/01/2015] [Accepted: 11/08/2015] [Indexed: 01/01/2023]
Abstract
The labor-intensive and expensive experimental process of drug-target interaction prediction has motivated many researchers to focus on in silico prediction, which provides helpful information to support the experimental interaction data. Therefore, they have proposed several computational approaches for discovering new drug-target interactions. Several learning-based methods have been developed, which can be categorized into two main groups: similarity-based and feature-based. In this paper, we firstly use the bi-gram features extracted from the Position Specific Scoring Matrix (PSSM) of proteins in predicting drug-target interactions. Our results demonstrate the high-confidence prediction ability of the Bigram-PSSM model in terms of several performance indicators, specifically for enzymes and ion channels. Moreover, we investigate the impact of the negative selection strategy on the performance of the prediction, which is not widely taken into account in other relevant studies. This is important, as the number of non-interacting drug-target pairs is usually extremely large in comparison with the number of interacting ones in existing drug-target interaction data. An interesting observation is that different levels of performance reduction were observed for the four datasets when we changed the sampling method from random sampling to balanced sampling.
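The abstract above mentions bi-gram features extracted from a PSSM without giving a formula. A minimal sketch follows, assuming the commonly used definition in which feature (m, n) sums the product of column m at position i and column n at position i+1 over the sequence. Whether the authors normalise the PSSM first is not stated here, so the function simply takes whatever values it is given. A real PSSM has 20 columns and yields 400 features; the toy matrix below uses 2 columns for readability.

```python
def bigram_pssm(pssm):
    """Flatten a PSSM (L rows x C columns) into C*C bi-gram features.

    Assumed definition: feature[m][n] = sum over i of pssm[i][m] * pssm[i+1][n].
    """
    n_cols = len(pssm[0])
    features = [[0.0] * n_cols for _ in range(n_cols)]
    for i in range(len(pssm) - 1):
        current_row, next_row = pssm[i], pssm[i + 1]
        for m in range(n_cols):
            for n in range(n_cols):
                features[m][n] += current_row[m] * next_row[n]
    # Row-major flattening gives a fixed-length vector for any sequence length.
    return [value for row in features for value in row]


# Toy 3-residue "PSSM" with 2 columns instead of 20.
toy_pssm = [
    [1, 0],
    [0, 1],
    [1, 1],
]
print(bigram_pssm(toy_pssm))  # -> [0.0, 1.0, 1.0, 1.0]
```

The useful property is that proteins of any length map to a vector of the same size, which is what lets a standard classifier consume them.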
|
30
|
Wang YC, Chen SL, Deng NY, Wang Y. Computational probing protein-protein interactions targeting small molecules. Bioinformatics 2015; 32:226-234. [PMID: 26415726 DOI: 10.1093/bioinformatics/btv528] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/02/2015] [Accepted: 08/31/2015] [Indexed: 01/05/2023]
Abstract
MOTIVATION With the boom in interactome studies, many interactions can be measured in a high-throughput way and large-scale datasets are available. It is becoming apparent that many different types of interactions can be potential drug targets. Compared with inhibition of a single protein, inhibition of a protein-protein interaction (PPI) promises to improve specificity with fewer adverse side effects. It also greatly broadens the drug target search space, which makes drug target discovery difficult. Computational methods are highly desired to efficiently provide candidates for further experiments and hold the promise to greatly accelerate the discovery of novel drug targets. RESULTS Here, we propose a machine learning method to predict PPI targets on a genome-wide scale. Specifically, we develop a computational method, named PrePPItar, to Predict PPIs as drug targets by uncovering the potential associations between drugs and PPIs. First, we survey the databases and manually construct a gold-standard positive dataset for drug and PPI interactions. This effort leads to a dataset with 227 associations among 63 PPIs and 113 FDA-approved drugs and allows us to build models to learn the association rules from the data. Second, we characterize drugs by profiling in chemical structure, drug ATC-code annotation, and side-effect space and represent PPI similarity by a symmetrical S-kernel based on protein amino acid sequence. Then the drugs and PPIs are correlated by a Kronecker product kernel. Finally, a support vector machine (SVM) is trained to predict novel associations between drugs and PPIs. We validate our PrePPItar method on the well-established gold-standard dataset by cross-validation. We find that chemical structure, drug ATC-code, and side-effect information are all predictive of PPI targets. Moreover, we can increase the PPI target prediction coverage by integrating multiple data sources. Follow-up database search and pathway analysis indicate that our new predictions are worthy of future experimental validation. CONCLUSION In conclusion, PrePPItar can serve as a useful tool for PPI target discovery and provides a general heterogeneous data integrative framework. AVAILABILITY AND IMPLEMENTATION PrePPItar is available at http://doc.aporc.org/wiki/PrePPItar. CONTACT ycwang@nwipb.cas.cn or ywang@amss.ac.cn SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
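The abstract above combines a drug kernel and a PPI kernel through a Kronecker product kernel. As a rough sketch of that idea, and not of the PrePPItar implementation, the similarity between two (drug, PPI) pairs can be taken as the product of the drug-drug and PPI-PPI similarities. The function name, the pair ordering, and the toy kernel values are assumptions.

```python
def kronecker_product_kernel(drug_kernel, ppi_kernel):
    """Build a pairwise kernel over (drug, PPI) pairs.

    Pair (i, j) is stored at index i * n_ppi + j, so that
    K[(i, j), (k, l)] = drug_kernel[i][k] * ppi_kernel[j][l].
    """
    n_drug, n_ppi = len(drug_kernel), len(ppi_kernel)
    size = n_drug * n_ppi
    pair_kernel = [[0.0] * size for _ in range(size)]
    for i in range(n_drug):
        for j in range(n_ppi):
            for k in range(n_drug):
                for l in range(n_ppi):
                    pair_kernel[i * n_ppi + j][k * n_ppi + l] = (
                        drug_kernel[i][k] * ppi_kernel[j][l]
                    )
    return pair_kernel


# Made-up similarity values for two drugs and two PPIs.
drug_kernel = [[1.0, 0.5], [0.5, 1.0]]
ppi_kernel = [[1.0, 0.2], [0.2, 1.0]]
pair_kernel = kronecker_product_kernel(drug_kernel, ppi_kernel)
for row in pair_kernel:
    print(row)
```

The resulting matrix can be passed to any kernel method that accepts a precomputed kernel. Its size grows with the square of the number of pairs, so this explicit construction is only practical for small datasets such as the few hundred associations the abstract mentions.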
Affiliation(s)
- Yong-Cui Wang
- Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining 810001, China
- Shi-Long Chen
- Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining 810001, China
- Nai-Yang Deng
- College of Science, China Agricultural University, Beijing 100083, China
- Yong Wang
- National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
|
31
|
Liu H, Sun J, Guan J, Zheng J, Zhou S. Improving compound-protein interaction prediction by building up highly credible negative samples. Bioinformatics 2015; 31:i221-i229. [PMID: 26072486 PMCID: PMC4765858 DOI: 10.1093/bioinformatics/btv256] [Citation(s) in RCA: 151] [Impact Index Per Article: 16.8] [Indexed: 01/09/2023]
Abstract
MOTIVATION Computational prediction of compound-protein interactions (CPIs) is of great importance for drug design and development, as genome-scale experimental validation of CPIs is not only time-consuming but also prohibitively expensive. With the availability of an increasing number of validated interactions, the performance of computational prediction approaches is severely impeded by the lack of reliable negative CPI samples. A systematic method of screening reliable negative samples has become critical to improving the performance of in silico prediction methods. RESULTS This article aims at building up a set of highly credible negative samples of CPIs via an in silico screening method. As most existing computational models assume that similar compounds are likely to interact with similar target proteins and achieve remarkable performance, it is rational to identify potential negative samples based on the converse negative proposition that the proteins dissimilar to every known/predicted target of a compound are not very likely to be targeted by the compound, and vice versa. We integrated various resources, including chemical structures, chemical expression profiles and side effects of compounds, amino acid sequences, protein-protein interaction network and functional annotations of proteins, into a systematic screening framework. We first tested the screened negative samples on six classical classifiers, and all these classifiers achieved remarkably higher performance on our negative samples than on randomly generated negative samples for both human and Caenorhabditis elegans. We then verified the negative samples on three existing prediction models, including bipartite local model, Gaussian kernel profile and Bayesian matrix factorization, and found that the performances of these models are also significantly improved on the screened negative samples. Moreover, we validated the screened negative samples on a drug bioactivity dataset. Finally, we derived two sets of new interactions by training a support vector machine classifier on the positive interactions annotated in DrugBank and our screened negative interactions. The screened negative samples and the predicted interactions provide the research community with a useful resource for identifying new drug targets and a helpful supplement to the current curated compound-protein databases. AVAILABILITY Supplementary files are available at: http://admis.fudan.edu.cn/negative-cpi/.
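The screening rule stated in the abstract above (a protein dissimilar to every known target of a compound is unlikely to be targeted by it) can be illustrated with a small sketch. This covers only the protein side of the rule with a single similarity measure, whereas the paper integrates several resources and also applies the converse rule for compounds. The function name, the data layout, and the `threshold` value are hypothetical and not taken from the paper.

```python
def screen_negative_pairs(known_targets, protein_similarity, threshold=0.3):
    """Return (compound, protein) pairs proposed as credible negatives.

    known_targets:      dict mapping compound -> set of known target proteins
    protein_similarity: dict mapping protein -> dict of protein -> similarity
    threshold:          hypothetical cut-off below which proteins count as dissimilar
    """
    negatives = []
    for compound in sorted(known_targets):
        targets = known_targets[compound]
        if not targets:
            continue  # nothing to compare against, so make no claim
        for protein in sorted(protein_similarity):
            if protein in targets:
                continue  # known positives are never proposed as negatives
            # Keep the pair only if the protein is far from all known targets.
            if all(protein_similarity[protein][t] < threshold for t in targets):
                negatives.append((compound, protein))
    return negatives


# Made-up similarities between three proteins.
protein_similarity = {
    "P1": {"P1": 1.0, "P2": 0.8, "P3": 0.1},
    "P2": {"P1": 0.8, "P2": 1.0, "P3": 0.2},
    "P3": {"P1": 0.1, "P2": 0.2, "P3": 1.0},
}
known_targets = {"C1": {"P1"}}
print(screen_negative_pairs(known_targets, protein_similarity))
```

Here P2 is too similar to the known target P1 to be a safe negative, so only the pair of C1 with P3 is proposed.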
Collapse
Affiliation(s)
- Hui Liu
- Lab of Information Management, Changzhou University, Jiangsu 213164, China, School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore, Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China and Department of Computer Science and Technology, Tongji University, Shanghai 201804, China Lab of Information Management, Changzhou University, Jiangsu 213164, China, School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore, Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China and Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
| | - Jianjiang Sun
- Lab of Information Management, Changzhou University, Jiangsu 213164, China, School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore, Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China and Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
| | - Jihong Guan
- Lab of Information Management, Changzhou University, Jiangsu 213164, China, School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore, Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China and Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
| | - Jie Zheng
- Lab of Information Management, Changzhou University, Jiangsu 213164, China, School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore, Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China and Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
| | - Shuigeng Zhou
- Lab of Information Management, Changzhou University, Jiangsu 213164, China, School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore, Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China and Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
|
32
|
Cereto-Massagué A, Ojeda MJ, Valls C, Mulero M, Pujadas G, Garcia-Vallve S. Tools for in silico target fishing. Methods 2015; 71:98-103. [DOI: 10.1016/j.ymeth.2014.09.006] [Citation(s) in RCA: 92] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2014] [Revised: 09/18/2014] [Accepted: 09/19/2014] [Indexed: 12/17/2022] Open
|
33
|
Korkmaz S, Zararsiz G, Goksuluk D. Drug/nondrug classification using Support Vector Machines with various feature selection strategies. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2014; 117:51-60. [PMID: 25224081 DOI: 10.1016/j.cmpb.2014.08.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/02/2014] [Revised: 08/15/2014] [Accepted: 08/27/2014] [Indexed: 06/03/2023]
Abstract
With advances in computer technology, virtual screening of small molecules has come into use in drug discovery. Because the early phase of drug discovery involves thousands of compounds, a fast classification method that can distinguish active from inactive molecules can be used to screen large compound collections. In this study, we used Support Vector Machines (SVM) for this type of classification task. SVM is a powerful classification tool that is becoming increasingly popular in various machine-learning applications. The data sets consist of 631 compounds for the training set and 216 compounds for a separate test set. In the data pre-processing step, Pearson's correlation coefficient was used as a filter to eliminate redundant features. After application of the correlation filter, a single SVM was applied to this reduced data set. Moreover, we investigated the performance of SVM with different feature selection strategies, including SVM-Recursive Feature Elimination, Wrapper Method and Subset Selection. All feature selection methods generally perform better than a single SVM, and Subset Selection outperforms the other feature selection methods. We tested SVM as a classification tool in a real-life drug discovery problem, and our results revealed that it could be a useful method for classification tasks in the early phase of drug discovery.
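The correlation filter mentioned in this abstract can be illustrated with a short, dependency-free Python sketch. The abstract does not state the cutoff or the order in which features are examined, so the greedy first-come ordering, the `threshold=0.95` default and the descriptor names below are assumptions made only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return cov / (sd_x * sd_y)


def correlation_filter(columns, threshold=0.95):
    """Greedily keep features whose |r| with every kept feature is <= threshold.

    columns: dict mapping feature name -> list of values (one per compound).
    Returns the kept feature names in their original order.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical descriptors for four compounds (invented numbers).
features = {
    "desc_a": [1.0, 2.0, 3.0, 4.0],
    "desc_a_x2": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with desc_a
    "desc_b": [1.0, -1.0, 1.0, -1.0],
}
print(correlation_filter(features))
```

The reduced feature set would then be passed to the classifier; the SVM itself and the wrapper-style selection strategies named in the abstract are outside the scope of this sketch.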
Affiliation(s)
- Selcuk Korkmaz
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey.
| | - Gokmen Zararsiz
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey
| | - Dincer Goksuluk
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey
|
34
|
Sugaya N. Ligand efficiency-based support vector regression models for predicting bioactivities of ligands to drug target proteins. J Chem Inf Model 2014; 54:2751-63. [PMID: 25220713 DOI: 10.1021/ci5003262] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023]
Abstract
The concept of ligand efficiency (LE) indices is widely accepted throughout the drug design community and is frequently used in a retrospective manner in the process of drug development. For example, LE indices are used to investigate LE optimization processes of already-approved drugs and to re-evaluate hit compounds obtained from structure-based virtual screening methods and/or high-throughput experimental assays. However, LE indices could also be applied in a prospective manner to explore drug candidates. Here, we describe the construction of machine learning-based regression models in which LE indices are adopted as an end point and show that LE-based regression models can outperform regression models based on pIC50 values. In addition to pIC50 values traditionally used in machine learning studies based on chemogenomics data, three representative LE indices (ligand lipophilicity efficiency (LLE), binding efficiency index (BEI), and surface efficiency index (SEI)) were adopted, then used to create four types of training data. We constructed regression models by applying a support vector regression (SVR) method to the training data. In cross-validation tests of the SVR models, the LE-based SVR models showed higher correlations between the observed and predicted values than the pIC50-based models. Application tests to new data displayed that, generally, the predictive performance of SVR models follows the order SEI > BEI > LLE > pIC50. Close examination of the distributions of the activity values (pIC50, LLE, BEI, and SEI) in the training and validation data implied that the performance order of the SVR models may be ascribed to the much higher diversity of the LE-based training and validation data. In the application tests, the LE-based SVR models can offer better predictive performance of compound-protein pairs with a wider range of ligand potencies than the pIC50-based models. 
This finding strongly suggests that LE-based SVR models are better than pIC50-based models at predicting bioactivities of compounds that could exhibit a much higher (or lower) potency.
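For readers unfamiliar with the three indices, the sketch below computes them from a potency value using the definitions commonly given in the ligand-efficiency literature: BEI = pIC50 / (molecular weight in kDa), SEI = pIC50 / (polar surface area / 100 Å²), and LLE = pIC50 - logP. The abstract does not spell out its formulas, so treating these as the definitions used in the paper is an assumption, and the function names and example values are hypothetical.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


def ligand_efficiency_indices(pic50, mol_weight_da, psa_a2, logp):
    """Return BEI, SEI and LLE for one compound-target measurement.

    pic50:         potency as pIC50
    mol_weight_da: molecular weight in daltons
    psa_a2:        polar surface area in square angstroms
    logp:          (calculated) octanol-water partition coefficient
    """
    return {
        "BEI": pic50 / (mol_weight_da / 1000.0),  # potency per kDa
        "SEI": pic50 / (psa_a2 / 100.0),          # potency per 100 A^2 of PSA
        "LLE": pic50 - logp,                      # potency corrected for lipophilicity
    }


# Invented example: IC50 = 100 nM, MW = 350 Da, PSA = 70 A^2, logP = 3.0
example = ligand_efficiency_indices(pic50_from_ic50_nm(100.0), 350.0, 70.0, 3.0)
print(example)
```

Any of the three values can replace pIC50 as the regression end point, which is the substitution the abstract evaluates.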
Affiliation(s)
- Nobuyoshi Sugaya
- Drug Discovery Department, Research & Development Division, PharmaDesign, Inc., Hatchobori 2-19-8, Chuo-ku, Tokyo 104-0032, Japan
|
35
|
Mousavian Z, Masoudi-Nejad A. Drug-target interaction prediction via chemogenomic space: learning-based methods. Expert Opin Drug Metab Toxicol 2014; 10:1273-87. [PMID: 25112457 DOI: 10.1517/17425255.2014.950222] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
INTRODUCTION Identification of the interactions between drugs and target proteins is a crucial task in genomic drug discovery. In silico prediction is an appropriate alternative to the laborious and costly experimental identification of drug-target interactions. The development of a variety of computational methods opens a new direction in analyzing and detecting new drug-target pairs. AREAS COVERED In this review, we focus on chemogenomic methods that have established a learning framework for predicting drug-target interactions. Learning-based methods are classified into supervised and semi-supervised, and the supervised learning methods are studied in two separate parts: similarity-based methods and feature-based methods. EXPERT OPINION In spite of the many improvements that learning-based methods have brought to pharmacology applications, predictive models are often constructed under oversimplified settings that may lead to over-optimistic results in drug-target interaction prediction.
Affiliation(s)
- Zaynab Mousavian
- University of Tehran, Institute of Biochemistry and Biophysics, Laboratory of Systems Biology and Bioinformatics (LBB), Tehran, Iran
|
36
|
An Y, Li Q, Chen J, Gao X, Chen H, Xiao C, Bian L, Zheng J, Zhao X, Zheng X. Binding of caffeic acid to human serum albumin by the retention data and frontal analysis. Biomed Chromatogr 2014; 28:1881-6. [DOI: 10.1002/bmc.3238] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2013] [Revised: 03/26/2014] [Accepted: 04/09/2014] [Indexed: 12/31/2022]
Affiliation(s)
- Yuxin An
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences; Northwest University; Xi'an 710069 China
| | - Qian Li
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences; Northwest University; Xi'an 710069 China
| | - Jiejun Chen
- China National Center for Biotechnology Development; Beijing 100036 China
| | - Xiaokang Gao
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences; Northwest University; Xi'an 710069 China
| | - Hongwei Chen
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences; Northwest University; Xi'an 710069 China
| | - Chaoni Xiao
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences; Northwest University; Xi'an 710069 China
| | - Liujiao Bian
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences; Northwest University; Xi'an 710069 China
| | - Jianbin Zheng
- Institute of Analytical Science; Northwest University; Xi'an 710069 China
| | - Xinfeng Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences; Northwest University; Xi'an 710069 China
| | - Xiaohui Zheng
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences; Northwest University; Xi'an 710069 China
|
37
|
Pahikkala T, Airola A, Pietilä S, Shakyawar S, Szwajda A, Tang J, Aittokallio T. Toward more realistic drug-target interaction predictions. Brief Bioinform 2014; 16:325-37. [PMID: 24723570 PMCID: PMC4364066 DOI: 10.1093/bib/bbu010] [Citation(s) in RCA: 245] [Impact Index Per Article: 24.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open
Abstract
A number of supervised machine learning models have recently been introduced for the prediction of drug-target interactions based on chemical structure and genomic sequence information. Although these models could offer improved means for many network pharmacology applications, such as repositioning of drugs for new therapeutic uses, the prediction models are often being constructed and evaluated under overly simplified settings that do not reflect the real-life problem in practical applications. Using quantitative drug-target bioactivity assays for kinase inhibitors, as well as a popular benchmarking data set of binary drug-target interactions for enzyme, ion channel, nuclear receptor and G protein-coupled receptor targets, we illustrate here the effects of four factors that may lead to dramatic differences in the prediction results: (i) problem formulation (standard binary classification or more realistic regression formulation), (ii) evaluation data set (drug and target families in the application use case), (iii) evaluation procedure (simple or nested cross-validation) and (iv) experimental setting (whether training and test sets share common drugs and targets, only drugs or targets or neither). Each of these factors should be taken into consideration to avoid reporting overoptimistic drug-target interaction prediction results. We also suggest guidelines on how to make the supervised drug-target interaction prediction studies more realistic in terms of such model formulations and evaluation setups that better address the inherent complexity of the prediction task in the practical applications, as well as novel benchmarking data sets that capture the continuous nature of the drug-target interactions for kinase inhibitors.
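Factor (iv) in this abstract, whether training and test sets share drugs, targets, or neither, is easy to get wrong in practice. The sketch below partitions drug-target pairs by holding out a set of drugs and a set of targets. It is a minimal Python illustration, and the group labels (`train`, `new_drug`, `new_target`, `both_new`) and the function name are assumptions introduced here, not terminology taken from the abstract.

```python
def split_by_setting(pairs, held_out_drugs, held_out_targets):
    """Partition (drug, target) pairs by what each pair shares with training.

    train:      drug and target both appear in training data
    new_drug:   drug is held out, target is seen in training
    new_target: target is held out, drug is seen in training
    both_new:   neither the drug nor the target is seen in training
    """
    groups = {"train": [], "new_drug": [], "new_target": [], "both_new": []}
    for drug, target in pairs:
        drug_new = drug in held_out_drugs
        target_new = target in held_out_targets
        if drug_new and target_new:
            groups["both_new"].append((drug, target))
        elif drug_new:
            groups["new_drug"].append((drug, target))
        elif target_new:
            groups["new_target"].append((drug, target))
        else:
            groups["train"].append((drug, target))
    return groups


# Toy example with two drugs and two targets (identifiers are invented).
all_pairs = [("d1", "t1"), ("d1", "t2"), ("d2", "t1"), ("d2", "t2")]
groups = split_by_setting(all_pairs, held_out_drugs={"d2"},
                          held_out_targets={"t2"})
print(groups)
```

The setting in which test pairs share both drugs and targets with training corresponds to a random split within the `train` group; scoring a model separately on each group makes the gap between the settings visible.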
|
38
|
Wang YC, Deng N, Chen S, Wang Y. Computational Study of Drugs by Integrating Omics Data with Kernel Methods. Mol Inform 2013; 32:930-41. [PMID: 27481139 DOI: 10.1002/minf.201300090] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2013] [Accepted: 11/13/2013] [Indexed: 01/02/2023]
Abstract
With the rapid development of genomic and chemogenomic techniques, many omics data sources for drugs have become publicly available. These data sources illustrate a drug's biological function in the living cell at different levels and from different aspects. One straightforward idea is to learn understandable rules via computational models and algorithms that mine and integrate these data sources. Here, we review our recent efforts on developing kernel-based methods to integrate drug-related omics data sources. Three promising applications of our framework are shown: predicting drug targets, assigning a drug's ATC-code annotation, and revealing drug repositioning. We demonstrate that data integration does provide more information and improves accuracy by recovering more experimentally observed target proteins, ATC-codes, and drug repositioning. Importantly, data integration can indicate novel predictions that are supported by database search and functional annotation analysis and are worthy of further experimental validation. In conclusion, kernel methods can efficiently integrate heterogeneous data sources to computationally study drugs, and will promote further low-cost research in drug discovery.
Affiliation(s)
- Yongcui C Wang
- Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, No. 23, Xinning Road, Xining, Qinghai Province, P. R. China
| | - Naiyang Deng
- College of Science, China Agriculture University, No. 17, Qinghua East Road, Beijing, P. R. China
| | - Shilong Chen
- Key Laboratory of Adaptation and Evolution of Plateau Biota, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, No. 23, Xinning Road, Xining, Qinghai Province, P. R. China.
| | - Yong Wang
- National Centre for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, No. 55, Zhongguancun East Road, Beijing, P. R. China
- Molecular Profiling Research Center for Drug Discovery, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
|
39
|
|
40
|
Kim S, Jin D, Lee H. Predicting drug-target interactions using drug-drug interactions. PLoS One 2013; 8:e80129. [PMID: 24278248 PMCID: PMC3836969 DOI: 10.1371/journal.pone.0080129] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2013] [Accepted: 09/30/2013] [Indexed: 11/18/2022] Open
Abstract
Computational methods for predicting drug-target interactions have become important in drug research because they can help to reduce the time, cost, and failure rates of developing new drugs. Recently, with the accumulation of data sets on drug side effects and pharmacology, it has become possible to predict potential drug-target interactions. In this study, we focus on drug-drug interactions (DDI), their adverse effects ([Formula: see text]) and pharmacological information ([Formula: see text]), and investigate the relationship among chemical structures, side effects, and DDIs from several data sources. [Formula: see text] data from the STITCH database, [Formula: see text] from drugs.com, and drug-target pairs from ChEMBL and SIDER were first collected. Then, by applying two machine learning approaches, a support vector machine (SVM) and a kernel-based L1-norm regularized logistic regression (KL1LR), we showed that DDI is a promising feature in predicting drug-target interactions. Next, the accuracies of predicting drug-target interactions using DDI were compared to those obtained using the chemical structure and side effects based on the SVM and KL1LR approaches, showing that DDI was the data source contributing the most to predicting drug-target interactions.
Affiliation(s)
- Shinhyuk Kim
- School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea
| | - Daeyong Jin
- School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea
| | - Hyunju Lee
- School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea
|
41
|
Pressor mechanism evaluation for phytochemical compounds using in silico compound–protein interaction prediction. Regul Toxicol Pharmacol 2013; 67:115-24. [DOI: 10.1016/j.yrtph.2013.07.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2012] [Revised: 07/20/2013] [Accepted: 07/22/2013] [Indexed: 01/30/2023]
|
42
|
Sugaya N. Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. J Chem Inf Model 2013; 53:2525-37. [PMID: 24020509 DOI: 10.1021/ci400240u] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Machine learning methods based on ligand-protein interaction data in bioactivity databases are one of the current strategies for efficiently finding novel lead compounds as the first step in the drug discovery process. Although previous machine learning studies have succeeded in predicting novel ligand-protein interactions with high performance, all of the previous studies to date have been heavily dependent on the simple use of raw bioactivity data of ligand potencies measured by IC50, EC50, K(i), and K(d) deposited in databases. ChEMBL provides us with a unique opportunity to investigate whether a machine-learning-based classifier created by reflecting ligand efficiency other than the IC50, EC50, K(i), and Kd values can also offer high predictive performance. Here we report that classifiers created from training data based on ligand efficiency show higher performance than those from data based on IC50 or K(i) values. Utilizing GPCRSARfari and KinaseSARfari databases in ChEMBL, we created IC50- or K(i)-based training data and binding efficiency index (BEI) based training data then constructed classifiers using support vector machines (SVMs). The SVM classifiers from the BEI-based training data showed slightly higher area under curve (AUC), accuracy, sensitivity, and specificity in the cross-validation tests. Application of the classifiers to the validation data demonstrated that the AUCs and specificities of the BEI-based classifiers dramatically increased in comparison with the IC50- or K(i)-based classifiers. The improvement of the predictive power by the BEI-based classifiers can be attributed to (i) the more separated distributions of positives and negatives, (ii) the higher diversity of negatives in the BEI-based training data in a feature space of SVMs, and (iii) a more balanced number of positives and negatives in the BEI-based training data. 
These results strongly suggest that training data based on ligand efficiency as well as data based on classical IC50, EC50, K(d), and K(i) values are important when creating a classifier using a machine learning approach based on bioactivity data.
Affiliation(s)
- Nobuyoshi Sugaya
- Drug Discovery Department, Research & Development Division, PharmaDesign, Inc., Hatchobori 2-19-8, Chuo-ku, Tokyo 104-0032, Japan
|
43
|
Xu H, Tao Y, Lu P, Wang P, Zhang F, Yuan Y, Wang S, Xiao X, Yang H, Huang L. A computational drug-target network for yuanhu zhitong prescription. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE : ECAM 2013; 2013:658531. [PMID: 23762151 PMCID: PMC3665234 DOI: 10.1155/2013/658531] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/20/2012] [Accepted: 03/10/2013] [Indexed: 12/16/2022]
Abstract
Yuanhu Zhitong prescription (YZP) is a typical and relatively simple traditional Chinese medicine (TCM), widely used in the clinical treatment of headache, gastralgia, and dysmenorrhea. However, the underlying molecular mechanism of action of YZP is not clear. In this study, based on the previous chemical and metabolite analysis, a complex approach including the prediction of the structure of metabolite, high-throughput in silico screening, and network reconstruction and analysis was developed to obtain a computational drug-target network for YZP. This was followed by a functional and pathway analysis by ClueGO to determine some of the pharmacologic activities. Further, two new pharmacologic actions, antidepressant and antianxiety, of YZP were validated by animal experiments using zebrafish and mice models. The forced swimming test and the tail suspension test demonstrated that YZP at the doses of 4 mg/kg and 8 mg/kg had better antidepressive activity when compared with the control group. The anxiolytic activity experiment showed that YZP at the doses of 100 mg/L, 150 mg/L, and 200 mg/L had significant decrease in diving compared to controls. These results not only shed light on the better understanding of the molecular mechanisms of YZP for curing diseases, but also provide some evidence for exploring the classic TCM formulas for new clinical application.
Affiliation(s)
- Haiyu Xu
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, No. 16 Nanxiaojie, Dongzhimennei, Beijing 100700, China
| | - Ye Tao
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, No. 16 Nanxiaojie, Dongzhimennei, Beijing 100700, China
- Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Peng Lu
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, No. 16 Nanxiaojie, Dongzhimennei, Beijing 100700, China
- Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
| | - Peng Wang
- Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Fangbo Zhang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, No. 16 Nanxiaojie, Dongzhimennei, Beijing 100700, China
| | - Yuan Yuan
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, No. 16 Nanxiaojie, Dongzhimennei, Beijing 100700, China
| | | | - Xuefeng Xiao
- Tianjin University of Traditional Chinese Medicine, Tianjin 300193, China
| | - Hongjun Yang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, No. 16 Nanxiaojie, Dongzhimennei, Beijing 100700, China
| | - Luqi Huang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, No. 16 Nanxiaojie, Dongzhimennei, Beijing 100700, China
|
44
|
Wang YC, Chen SL, Deng NY, Wang Y. Network predicting drug’s anatomical therapeutic chemical code. Bioinformatics 2013; 29:1317-24. [DOI: 10.1093/bioinformatics/btt158] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
|
45
|
Wu Z, Wang Y, Chen L. Network-based drug repositioning. MOLECULAR BIOSYSTEMS 2013; 9:1268-81. [PMID: 23493874 DOI: 10.1039/c3mb25382a] [Citation(s) in RCA: 100] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
Network-based computational biology, with its emphasis on biomolecular interactions and omics-data integration, has had success in drug development and has created new directions such as drug repositioning and drug combination. Drug repositioning, i.e., revealing a drug's new roles, is attracting increasing attention from the pharmaceutical community as a way to tackle the high failure rate and long development time of drug discovery. Drug combination, or drug cocktails, i.e., combining multiple drugs against diseases, mainly aims to alleviate the recurrent emergence of drug resistance and to reveal synergistic effects. In this paper, we unify the two topics to reveal new roles of drug interactions from a network perspective by treating drug combination as another form of drug repositioning. In particular, first, we emphasize that rational large-scale drug repositioning is driven by the accumulation of various high-throughput genome-wide data. These data can be utilized to capture the interplay among targets and biological molecules, uncover the resulting network structures, and further bridge molecular profiles and phenotypes. This motivates many network-based computational methods on these topics. Second, we organize these existing methods into two categories, i.e., single drug repositioning and drug combination, and further depict their main features by three data sources. Finally, we discuss the merits and shortcomings of these methods and pinpoint some future topics in this promising field.
Affiliation(s)
- Zikai Wu
- Business School, University of Shanghai for Science and Technology, Shanghai, China
|
46
|
Wang J, Zhang Y, Marian C, Ressom HW. Identification of aberrant pathways and network activities from high-throughput data. Brief Bioinform 2012; 13:406-19. [PMID: 22287794 PMCID: PMC3404398 DOI: 10.1093/bib/bbs001] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2011] [Revised: 01/03/2012] [Indexed: 02/06/2023] Open
Abstract
Many complex diseases such as cancer are associated with changes in biological pathways and molecular networks rather than being caused by single gene alterations. A major challenge in the diagnosis and treatment of such diseases is to identify characteristic aberrancies in the biological pathways and molecular network activities and elucidate their relationship to the disease. This review presents recent progress in using high-throughput biological assays to decipher aberrant pathways and network activities. In particular, this review provides specific examples in which high-throughput data have been applied to identify relationships between diseases and aberrant pathways and network activities. The achievements in this field have been remarkable, but many challenges have yet to be addressed.
|