201
Yu L, Zhao J, Gao L. Predicting Potential Drugs for Breast Cancer based on miRNA and Tissue Specificity. Int J Biol Sci 2018; 14:971-982. [PMID: 29989066 PMCID: PMC6036744 DOI: 10.7150/ijbs.23350] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 10/16/2017] [Accepted: 12/14/2017] [Indexed: 02/01/2023]
Abstract
Network-based computational methods, with their emphasis on biomolecular interactions and biological data integration, have succeeded in drug development and created new directions, such as drug repositioning and drug combination. Drug repositioning, that is, finding new uses for existing drugs to treat more patients, offers time, cost and efficiency benefits in drug development, especially when in silico techniques are used. MicroRNAs (miRNAs) play important roles in multiple biological processes and have attracted much scientific attention recently. Moreover, cumulative studies demonstrate that mature miRNAs as well as their precursors can be targeted by small-molecule drugs. At the same time, human diseases result from the disordered interplay of tissue- and cell lineage-specific processes. However, few computational studies predict potential drug-disease relationships based on miRNA data and tissue specificity. Therefore, based on miRNA data and the tissue specificity of diseases, we propose a new method named miTS to predict potential treatments for diseases. First, based on miRNA data, target genes and information on FDA (Food and Drug Administration) approved drugs, we evaluate the relationships between miRNAs and drugs in the tissue-specific protein-protein interaction (PPI) network. Then, we construct a tripartite drug-miRNA-disease network. Finally, we obtain the potential drug-disease associations based on the tripartite network. In this paper, we take breast cancer as a case study and focus on the top 30 predicted drugs. Of these, 25 (83.3%) have known connections with breast cancer in the CTD (Comparative Toxicogenomics Database) benchmark, and the other 5 are potential drugs for breast cancer. We further evaluate the 5 newly predicted drugs using clinical records, literature mining, KEGG pathway enrichment analysis and overlapping genes between enriched pathways. For each of the 5 new drugs, strong supporting evidence can be found in three or more of these aspects. In particular, Regorafenib (DB08896) has 15 overlapping KEGG pathways with breast cancer, and their p-values are all very small. In addition, in both literature curation and clinical validation, Regorafenib has a strong correlation with breast cancer. All these facts show that Regorafenib is likely to be a truly effective drug, worthy of further study. It follows that miTS is effective and practical for predicting new drug indications, which will provide potential value for the treatment of complex diseases.
Affiliation(s)
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, P.R. China
- Jin Zhao
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, P.R. China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, 710071, P.R. China
202
Huang G, Li J, Zhao C. Computational Prediction and Analysis of Associations between Small Molecules and Binding-Associated S-Nitrosylation Sites. Molecules 2018; 23:molecules23040954. [PMID: 29671802 PMCID: PMC6017196 DOI: 10.3390/molecules23040954] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/24/2018] [Revised: 03/30/2018] [Accepted: 04/09/2018] [Indexed: 01/12/2023]
Abstract
Interactions between drugs and proteins occupy a central position during the process of drug discovery and development. Numerous methods have recently been developed for identifying drug–target interactions, but few have been devoted to finding interactions between post-translationally modified proteins and drugs. We presented a machine learning-based method for identifying associations between small molecules and binding-associated S-nitrosylated (SNO-) proteins. Namely, small molecules were encoded by molecular fingerprint, SNO-proteins were encoded by the information entropy-based method, and the random forest was used to train a classifier. Ten-fold and leave-one-out cross validations achieved, respectively, 0.7235 and 0.7490 of the area under a receiver operating characteristic curve. Computational analysis of similarity suggested that SNO-proteins associated with the same drug shared statistically significant similarity, and vice versa. This method and finding are useful to identify drug–SNO associations and further facilitate the discovery and development of SNO-associated drugs.
Affiliation(s)
- Guohua Huang
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
- Jincheng Li
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
- Chenglin Zhao
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
203
Zong N, Kim H, Ngo V, Harismendy O. Deep mining heterogeneous networks of biomedical linked data to predict novel drug-target associations. Bioinformatics 2018; 33:2337-2344. [PMID: 28430977 DOI: 10.1093/bioinformatics/btx160] [Citation(s) in RCA: 98] [Impact Index Per Article: 16.3] [Received: 12/23/2016] [Accepted: 03/21/2017] [Indexed: 12/20/2022]
Abstract
Motivation: A heterogeneous network topology possessing abundant interactions between biomedical entities has yet to be utilized in similarity-based methods for predicting drug-target associations based on the array of varying features of drugs and their targets. Deep learning reveals features of vertices of a large network that can be adapted in accommodating the similarity-based solutions to provide a flexible method of drug-target prediction. Results: We propose a similarity-based drug-target prediction method that enhances existing association discovery methods by using a topology-based similarity measure. DeepWalk, a deep learning method, is adopted in this study to calculate the similarities within the Linked Tripartite Network (LTN), a heterogeneous network generated from biomedical linked datasets. The proposed method shows promising results for drug-target association prediction: a 98.96% AUC ROC score with 10-fold cross-validation and a 99.25% AUC ROC score with Monte Carlo cross-validation on the LTN. By utilizing DeepWalk, we demonstrate that: (i) this method outperforms other existing topology-based similarity computation methods, (ii) the performance is better for tripartite than for bipartite networks and (iii) the measure of similarity using network topology outperforms the ones derived from chemical structure (drugs) or genomic sequence (targets). Our proposed methodology proves to be capable of providing a promising solution for drug-target prediction based on topological similarity with a heterogeneous network, and may be readily re-purposed and adapted in existing similarity-based methodologies. Availability and Implementation: The proposed method has been developed in Java and is available, along with the data, at the following URL: https://github.com/zongnansu1982/drug-target-prediction. Contact: nazong@ucsd.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nansu Zong
- Department of Biomedical Informatics, School of Medicine, UC, San Diego, CA 92093, USA
- Hyeoneui Kim
- Department of Biomedical Informatics, School of Medicine, UC, San Diego, CA 92093, USA
- Victoria Ngo
- Betty Irene Moore School of Nursing, UC Davis, Sacramento, CA 95817, USA
- Olivier Harismendy
- Department of Biomedical Informatics, School of Medicine, UC, San Diego, CA 92093, USA; Moores Cancer Center, UC, San Diego, CA 92093, USA
204
Brown N, Cambruzzi J, Cox PJ, Davies M, Dunbar J, Plumbley D, Sellwood MA, Sim A, Williams-Jones BI, Zwierzyna M, Sheppard DW. Big Data in Drug Discovery. PROGRESS IN MEDICINAL CHEMISTRY 2018; 57:277-356. [PMID: 29680150 DOI: 10.1016/bs.pmch.2017.12.003] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 12/02/2022]
Abstract
Interpretation of Big Data in the drug discovery community should enhance project timelines and reduce clinical attrition through improved early decision making. The issues we encounter start with the sheer volume of data and how we first ingest it before building an infrastructure to house it to make use of the data in an efficient and productive way. There are many problems associated with the data itself including general reproducibility, but often, it is the context surrounding an experiment that is critical to success. Help, in the form of artificial intelligence (AI), is required to understand and translate the context. On the back of natural language processing pipelines, AI is also used to prospectively generate new hypotheses by linking data together. We explain Big Data from the context of biology, chemistry and clinical trials, showcasing some of the impressive public domain sources and initiatives now available for interrogation.
Affiliation(s)
- Aaron Sim
- BenevolentAI, London, United Kingdom
- Magdalena Zwierzyna
- BenevolentAI, London, United Kingdom; Institute of Cardiovascular Science, University College London, London, United Kingdom
205
Chen X, You ZH, Yan GY, Gong DW. IRWRLDA: improved random walk with restart for lncRNA-disease association prediction. Oncotarget 2018; 7:57919-57931. [PMID: 27517318 PMCID: PMC5295400 DOI: 10.18632/oncotarget.11141] [Citation(s) in RCA: 146] [Impact Index Per Article: 24.3] [Received: 01/20/2016] [Accepted: 07/06/2016] [Indexed: 12/11/2022]
Abstract
In recent years, accumulating evidence has shown that the dysregulation of lncRNAs is associated with a wide range of human diseases. It is necessary and feasible to analyze known lncRNA-disease associations, predict potential lncRNA-disease associations, and provide the most probable lncRNA-disease pairs for experimental validation. Considering the limitations of traditional Random Walk with Restart (RWR), the model of Improved Random Walk with Restart for LncRNA-Disease Association prediction (IRWRLDA) was developed to predict novel lncRNA-disease associations by integrating known lncRNA-disease associations, disease semantic similarity, and various lncRNA similarity measures. The novelty of IRWRLDA lies in the incorporation of lncRNA expression similarity and disease semantic similarity to set the initial probability vector of the RWR. Therefore, IRWRLDA could be applied to diseases without any known related lncRNAs. IRWRLDA significantly improved on previous classical models, with reliable AUCs of 0.7242 and 0.7872 on two known lncRNA-disease association datasets downloaded from the lncRNADisease database, respectively. Further case studies of colon cancer and leukemia were implemented for IRWRLDA, and 60% of lncRNAs in the top 10 prediction lists have been confirmed by recent experimental reports.
Affiliation(s)
- Xing Chen
- School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Zhu-Hong You
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
- Gui-Ying Yan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China; National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing, 100190, China
- Dun-Wei Gong
- School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, 221116, China
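The IRWRLDA abstract above builds on random walk with restart (RWR). As a rough illustration of the plain RWR iteration only, the sketch below applies the usual update p(t+1) = (1 - r) W p(t) + r p(0) on a column-normalized adjacency matrix. The function name, the toy four-node path graph, and the restart probability of 0.5 are assumptions chosen for illustration; the similarity-based initial vector that the abstract describes as IRWRLDA's novelty is not reproduced here.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Plain RWR on a dense adjacency matrix (list of lists).

    adj     -- symmetric adjacency matrix of the network
    seeds   -- indices of the seed nodes (uniform initial probability)
    restart -- probability of jumping back to the seeds at each step
    """
    n = len(adj)
    # Column-normalize so each column of W sums to 1 (or 0 for isolated nodes).
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]

    # Initial probability vector: uniform over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3 with node 0 as the only seed (hypothetical data).
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seeds=[0])
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

In this toy case the steady-state scores fall off with distance from the seed, which is the property candidate-ranking methods of this kind rely on.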
206
A novel heterogeneous network-based method for drug response prediction in cancer cell lines. Sci Rep 2018; 8:3355. [PMID: 29463808 PMCID: PMC5820329 DOI: 10.1038/s41598-018-21622-4] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 10/04/2017] [Accepted: 02/06/2018] [Indexed: 02/01/2023]
Abstract
An enduring challenge in personalized medicine lies in selecting a suitable drug for each individual patient. Here we concentrate on predicting drug responses based on a cohort of genomic, chemical structure, and target information. A recent study such as GDSC has provided an unprecedented opportunity to infer the potential relationships between cell lines and drugs. While existing approaches rely primarily on regression, classification or multiple kernel learning to predict drug responses, synthetic approaches indicate that drug targets and protein-protein interactions have the potential to improve the prediction performance of drug response. In this study, we propose a novel heterogeneous network-based method, named HNMDRP, to accurately predict cell line-drug associations by incorporating heterogeneous relationships among cell lines, drugs and targets. Compared with previous studies, HNMDRP can make good use of the above heterogeneous information to predict drug responses. The validity of our method is verified not only by plotting the ROC curve, but also by predicting novel cell line-drug sensitive associations that have dependable literature evidence. This allows us to suggest potential sensitive associations among cell lines and drugs. Matlab and R codes of HNMDRP can be found at https://github.com/USTC-HIlab/HNMDRP.
207
Ezzat A, Wu M, Li XL, Kwoh CK. Computational prediction of drug–target interactions using chemogenomic approaches: an empirical survey. Brief Bioinform 2018; 20:1337-1357. [DOI: 10.1093/bib/bby002] [Citation(s) in RCA: 117] [Impact Index Per Article: 19.5] [Received: 10/06/2017] [Revised: 12/21/2017] [Indexed: 01/18/2023]
Abstract
Computational prediction of drug–target interactions (DTIs) has become an essential task in the drug discovery process. It narrows down the search space for interactions by suggesting potential interaction candidates for validation via wet-lab experiments that are well known to be expensive and time-consuming. In this article, we aim to provide a comprehensive overview and empirical evaluation on the computational DTI prediction techniques, to act as a guide and reference for our fellow researchers. Specifically, we first describe the data used in such computational DTI prediction efforts. We then categorize and elaborate the state-of-the-art methods for predicting DTIs. Next, an empirical comparison is performed to demonstrate the prediction performance of some representative methods under different scenarios. We also present interesting findings from our evaluation study, discussing the advantages and disadvantages of each method. Finally, we highlight potential avenues for further enhancement of DTI prediction performance as well as related research directions.
208
Abstract
In the post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. To this end, gene regulatory networks (GRNs) are constructed to show the relationships between biological molecules, in which the vertices of the network denote biological molecules and the edges represent connections between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). By interpreting GRNs, biologists can understand not only the function of biological molecules but also the organization of the components of living cells, since a gene regulatory network is a comprehensive physiological map of living cells and reflects the influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we review the inference methods for GRN reconstruction and the analysis approaches for network structure. As the network method is a powerful tool for studying complex diseases and biological processes, its applications in pathway analysis and disease gene identification are also introduced.
209
210
Jiang J, Xing F, Wang C, Zeng X. Identification and Analysis of Rice Yield-Related Candidate Genes by Walking on the Functional Network. FRONTIERS IN PLANT SCIENCE 2018; 9:1685. [PMID: 30524460 PMCID: PMC6262309 DOI: 10.3389/fpls.2018.01685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2018] [Accepted: 10/30/2018] [Indexed: 05/04/2023]
Abstract
Rice (Oryza sativa L.) is one of the most important staple foods in the world. It is possible to identify candidate genes associated with rice yield using the model of random walk with restart on a functional similarity network. We demonstrated the high performance of this approach by a five-fold cross-validation experiment, as well as its robustness to the parameter r. We also assessed the strength of associations between known seeds and candidate genes in light of the resulting scores. The candidates ranking at the top of the results list were considered to be the most relevant rice yield-related genes. This study provides a valuable alternative for rice breeding and biology research. The relevant dataset and script can be downloaded at the website: http://lab.malab.cn/jj/rice.htm.
Affiliation(s)
- Jing Jiang
- School of Aerospace Engineering, Xiamen University, Xiamen, China
- Fei Xing
- School of Aerospace Engineering, Xiamen University, Xiamen, China
- Chunyu Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- *Correspondence: Chunyu Wang, Xiangxiang Zeng
- Xiangxiang Zeng
- School of Information Science and Engineering, Xiamen University, Xiamen, China
- *Correspondence: Chunyu Wang, Xiangxiang Zeng
211
Drug-Target Interaction Prediction in Drug Repositioning Based on Deep Semi-Supervised Learning. COMPUTATIONAL INTELLIGENCE AND ITS APPLICATIONS 2018. [DOI: 10.1007/978-3-319-89743-1_27] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/30/2022]
212
Deciphering the Relationship between Obesity and Various Diseases from a Network Perspective. Genes (Basel) 2017; 8:genes8120392. [PMID: 29258237 PMCID: PMC5748710 DOI: 10.3390/genes8120392] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/20/2017] [Revised: 12/02/2017] [Accepted: 12/13/2017] [Indexed: 12/14/2022]
Abstract
The number of obesity cases is rapidly increasing in developed and developing countries, thereby causing significant health problems worldwide. The pathologic factors of obesity at the molecular level are not fully characterized, although the imbalance between energy intake and consumption is widely recognized as the main reason for fat accumulation. Previous studies reported that obesity can be caused by the dysfunction of genes associated with other diseases, such as myocardial infarction, hence providing new insights into dissecting the pathogenesis of obesity by investigating its associations with other diseases. In this study, we investigated the relationship between obesity and diseases from Online Mendelian Inheritance in Man (OMIM) databases on the protein–protein interaction (PPI) network. The obesity genes and genes of one OMIM disease were mapped onto the network, and the interaction scores between the two gene sets were investigated on the basis of the PPI of individual gene pairs, thereby inferring the relationship between obesity and this disease. Results suggested that diseases related to nutrition and endocrine are the top two diseases that are closely associated with obesity. This finding is consistent with our general knowledge and indicates the reliability of our obtained results. Moreover, we inferred that diseases related to psychiatric factors and bone may also be highly related to obesity because the two diseases followed the diseases related to nutrition and endocrine according to our results. Numerous obesity–disease associations were identified in the literature to confirm the relationships between obesity and the aforementioned four diseases. These new results may help understand the underlying molecular mechanisms of obesity–disease co-occurrence and provide useful insights for disease prevention and intervention.
213
iDTI-ESBoost: Identification of Drug Target Interaction Using Evolutionary and Structural Features with Boosting. Sci Rep 2017; 7:17731. [PMID: 29255285 PMCID: PMC5735173 DOI: 10.1038/s41598-017-18025-2] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 07/21/2017] [Accepted: 12/05/2017] [Indexed: 02/07/2023]
Abstract
Prediction of new drug-target interactions is critically important as it can lead researchers to find new uses for old drugs and to disclose their therapeutic profiles or side effects. However, experimental prediction of drug-target interactions is expensive and time-consuming. As a result, computational methods for predicting new drug-target interactions have gained tremendous interest in recent times. Here we present iDTI-ESBoost, a prediction model for identification of drug-target interactions using evolutionary and structural features. Our proposed method uses a novel data balancing and boosting technique to predict drug-target interactions. On four benchmark datasets taken from gold standard data, iDTI-ESBoost outperforms the state-of-the-art methods in terms of area under the receiver operating characteristic (auROC) curve. iDTI-ESBoost also outperforms the latest and best-performing method found in the literature in terms of area under the precision recall (auPR) curve. This is significant as auPR curves are argued to be a suitable metric for comparison on imbalanced datasets similar to the one studied here. Our reported results show the effectiveness of the classifier, the balancing methods and the novel features incorporated in iDTI-ESBoost. iDTI-ESBoost is a novel prediction method that has for the first time exploited structural features along with evolutionary features to predict drug-protein interactions. We believe the excellent performance of iDTI-ESBoost in terms of both auROC and auPR will motivate researchers and practitioners to use it to predict drug-target interactions. To facilitate that, iDTI-ESBoost is implemented and made publicly available at: http://farshidrayhan.pythonanywhere.com/iDTI-ESBoost/.
214
Chen X, Huang YA, You ZH, Yan GY, Wang XS. A novel approach based on KATZ measure to predict associations of human microbiota with non-infectious diseases. Bioinformatics 2017; 33:733-739. [PMID: 28025197 DOI: 10.1093/bioinformatics/btw715] [Citation(s) in RCA: 88] [Impact Index Per Article: 12.6] [Received: 07/12/2016] [Accepted: 11/09/2016] [Indexed: 12/19/2022]
Abstract
Motivation: Accumulating clinical observations have indicated that microbes living in the human body are closely associated with a wide range of human noninfectious diseases, which provides promising insights into understanding complex disease mechanisms. Predicting microbe-disease associations could not only boost human disease diagnosis and prognosis, but also improve new drug development. However, few efforts have been made to understand and predict human microbe-disease associations on a large scale until now. Results: In this work, we constructed a microbe-human disease association network and further developed a novel computational model of KATZ measure for Human Microbe-Disease Association prediction (KATZHMDA) based on the assumption that functionally similar microbes tend to have similar interaction and non-interaction patterns with noninfectious diseases, and vice versa. To our knowledge, KATZHMDA is the first tool for microbe-disease association prediction. The reliable prediction performance could be attributed to the use of the KATZ measure and the introduction of Gaussian interaction profile kernel similarity for microbes and diseases. LOOCV and k-fold cross validation were implemented to evaluate the effectiveness of this novel computational model based on known microbe-disease associations obtained from the HMDAD database. As a result, KATZHMDA achieved reliable performance with average AUCs of 0.8130 ± 0.0054, 0.8301 ± 0.0033 and 0.8382 in 2-fold cross validation, 5-fold cross validation and LOOCV, respectively. It is anticipated that KATZHMDA could be used to obtain more novel microbes associated with important noninfectious human diseases and therefore benefit drug discovery and human medical improvement. Availability and Implementation: Matlab codes and dataset explored in this work are available at http://dwz.cn/4oX5mS. Contacts: xingchen@amss.ac.cn or zhuhongyou@gmail.com or wangxuesongcumt@163.com. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xing Chen
- School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Yu-An Huang
- Department of Computing, Hong Kong Polytechnic University, Hong Kong
- Zhu-Hong You
- Chinese Academy of Science, Xinjiang Technical Institute of Physics and Chemistry, Ürümqi 830011, China
- Gui-Ying Yan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Xue-Song Wang
- School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116, China
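The KATZ measure named in the KATZHMDA abstract scores a pair of nodes by counting the walks of every length between them, with each extra step damped by a factor beta. A minimal sketch of a truncated version follows, computing the sum of beta^l * A^l for l = 1..max_len on a small adjacency matrix. The function names, the damping value 0.1, the cut-off length, and the three-node graph are all illustrative assumptions; the integrated microbe-disease network with Gaussian interaction profile kernel similarities described in the abstract is not reproduced.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def truncated_katz(adj, beta=0.1, max_len=3):
    """Truncated Katz scores: sum over l = 1..max_len of beta**l * adj**l.

    adj[i][j] ** l (matrix power) counts walks of length l from i to j,
    so longer walks contribute less when 0 < beta < 1.
    """
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]          # adj ** 1
    for length in range(1, max_len + 1):
        weight = beta ** length
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        power = matmul(power, adj)           # adj ** (length + 1)
    return scores


# Toy path graph 0 - 1 - 2 (hypothetical data): nodes 0 and 2 are not
# directly linked but are joined by one walk of length 2.
toy = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
katz = truncated_katz(toy, beta=0.1, max_len=2)
```

With this truncation, the unlinked pair (0, 2) still receives a nonzero score from its length-2 walk, which is how walk-counting measures propose unobserved associations.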
215
Zhang W, Chen Y, Li D. Drug-Target Interaction Prediction through Label Propagation with Linear Neighborhood Information. Molecules 2017; 22:molecules22122056. [PMID: 29186828 PMCID: PMC6149680 DOI: 10.3390/molecules22122056] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 10/12/2017] [Revised: 11/19/2017] [Accepted: 11/20/2017] [Indexed: 11/16/2022]
Abstract
Interactions between drugs and target proteins provide important information for drug discovery. Currently, experiments have identified only a small number of drug-target interactions. Therefore, the development of computational methods for drug-target interaction prediction is an urgent task of theoretical interest and practical significance. In this paper, we propose a label propagation method with linear neighborhood information (LPLNI) for predicting unobserved drug-target interactions. First, we calculate drug-drug linear neighborhood similarity in the feature spaces by considering how to reconstruct data points from their neighbors. Then, we take the similarities as the manifold of drugs and assume that the manifold is unchanged in the interaction space. Finally, we predict unobserved interactions between known drugs and targets by using drug-drug linear neighborhood similarity and known drug-target interactions. The experiments show that LPLNI can utilize only known drug-target interactions to make high-accuracy predictions on four benchmark datasets. Furthermore, we consider incorporating chemical structures into LPLNI models. Experimental results demonstrate that the model with integrated information (LPLNI-II) can produce improved performance, better than other state-of-the-art methods. The known drug-target interactions are an important information source for computational predictions. The usefulness of the proposed method is demonstrated by cross validation and the case study.
Affiliation(s)
- Wen Zhang
- School of Computer, Wuhan University, Wuhan 430072, China.
- Yanlin Chen
- School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China.
- Dingfang Li
- School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China.
216
Wu Z, Cheng F, Li J, Li W, Liu G, Tang Y. SDTNBI: an integrated network and chemoinformatics tool for systematic prediction of drug-target interactions and drug repositioning. Brief Bioinform 2017; 18:333-347. [PMID: 26944082 DOI: 10.1093/bib/bbw012] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Received: 10/30/2015] [Indexed: 01/11/2023]
Abstract
Computational prediction of drug-target interactions (DTIs) and drug repositioning provides a low-cost and high-efficiency approach for drug discovery and development. The traditional social network-derived methods based on the naïve DTI topology information cannot predict potential targets for new chemical entities or failed drugs in clinical trials. There are currently millions of commercially available molecules with biologically relevant representations in chemical databases. It is urgent to develop novel computational approaches to predict targets for new chemical entities and failed drugs on a large scale. In this study, we developed a useful tool, namely substructure-drug-target network-based inference (SDTNBI), to prioritize potential targets for old drugs, failed drugs and new chemical entities. SDTNBI incorporates network and chemoinformatics to bridge the gap between new chemical entities and known DTI network. High performance was yielded in 10-fold and leave-one-out cross validations using four benchmark data sets, covering G protein-coupled receptors, kinases, ion channels and nuclear receptors. Furthermore, the highest areas under the receiver operating characteristic curve were 0.797 and 0.863 for two external validation sets, respectively. Finally, we identified thousands of new potential DTIs via implementing SDTNBI on a global network. As a proof-of-principle, we showcased the use of SDTNBI to identify novel anticancer indications for nonsteroidal anti-inflammatory drugs by inhibiting AKR1C3, CA9 or CA12. In summary, SDTNBI is a powerful network-based approach that predicts potential targets for new chemical entities on a large scale and will provide a new tool for DTI prediction and drug repositioning. The program and predicted DTIs are available on request.
Affiliation(s)
- Zengrui Wu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, China
- Feixiong Cheng
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Jie Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, China
- Weihua Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, China
- Guixia Liu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Yun Tang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, China
|
217
|
Chen H, Zhang Z, Peng W. miRDDCR: a miRNA-based method to comprehensively infer drug-disease causal relationships. Sci Rep 2017; 7:15921. [PMID: 29162848 PMCID: PMC5698443 DOI: 10.1038/s41598-017-15716-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/25/2017] [Accepted: 10/31/2017] [Indexed: 01/10/2023] Open
Abstract
Revealing the cause-and-effect mechanism behind drug-disease relationships remains a challenging task. Recent studies suggested that drugs can target microRNAs (miRNAs) and alter their expression levels. In the meanwhile, the inappropriate expression of miRNAs will lead to various diseases. Therefore, targeting specific miRNAs by small-molecule drugs to modulate their activities provides a promising approach to human disease treatment. However, few studies attempt to discover drug-disease causal relationships through the molecular level of miRNAs. Here, we developed a miRNA-based inference method miRDDCR to comprehensively predict drug-disease causal relationships. We first constructed a three-layer drug-miRNA-disease heterogeneous network by combining similarity measurements, existing drug-miRNA associations and miRNA-disease associations. Then, we extended the algorithm of Random Walk to the three-layer heterogeneous network and ranked the potential indications for drugs. Leave-one-out cross-validations and case studies demonstrated that our method miRDDCR can achieve excellent prediction power. Compared with related methods, our causality discovery-based algorithm showed superior prediction ability and highlighted the molecular basis miRNAs, which can be used to assist in the experimental design for drug development and disease treatment. Finally, comprehensively inferred drug-disease causal relationships were released for further studies.
Affiliation(s)
- Hailin Chen
  - School of Software, East China Jiaotong University, Nanchang, China
- Zuping Zhang
  - School of Information Science and Engineering, Central South University, Changsha, China
- Wei Peng
  - Computer Center of Kunming University of Science and Technology, Kunming, China
|
218
|
Peng C, Li A, Wang M. Discovery of Bladder Cancer-related Genes Using Integrative Heterogeneous Network Modeling of Multi-omics Data. Sci Rep 2017; 7:15639. [PMID: 29142286 PMCID: PMC5688092 DOI: 10.1038/s41598-017-15890-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/09/2017] [Accepted: 11/02/2017] [Indexed: 02/06/2023] Open
Abstract
In human health, a fundamental challenge is the identification of disease-related genes. Bladder cancer (BC) is a worldwide malignant tumor, which has resulted in 170,000 deaths in 2010 up from 114,000 in 1990. Moreover, with the emergence of multi-omics data, more comprehensive analysis of human diseases become possible. In this study, we propose a multi-step approach for the identification of BC-related genes by using integrative Heterogeneous Network Modeling of Multi-Omics data (iHNMMO). The heterogeneous network model properly and comprehensively reflects the multiple kinds of relationships between genes in the multi-omics data of BC, including general relationships, unique relationships under BC condition, correlational relationships within each omics and regulatory relationships between different omics. Besides, a network-based propagation algorithm with resistance is utilized to quantize the relationships between genes and BC precisely. The results of comprehensive performance evaluation suggest that iHNMMO significantly outperforms other approaches. Moreover, further analysis suggests that the top ranked genes may be functionally implicated in BC, which also confirms the superiority of iHNMMO. In summary, this study shows that disease-related genes can be better identified through reasonable integration of multi-omics data.
Affiliation(s)
- Chen Peng
  - School of Information Science and Technology, University of Science and Technology of China, Hefei, AH230027, China
  - Institute of Machine Learning and Systems Biology, College of Electronics and Information Engineering, Tongji University, Shanghai, 201804, P.R. China
- Ao Li
  - School of Information Science and Technology, University of Science and Technology of China, Hefei, AH230027, China
  - Centers for Biomedical Engineering, University of Science and Technology of China, Hefei, AH230037, China
- Minghui Wang
  - School of Information Science and Technology, University of Science and Technology of China, Hefei, AH230027, China
  - Centers for Biomedical Engineering, University of Science and Technology of China, Hefei, AH230037, China
|
219
|
Le DH, Verbeke L, Son LH, Chu DT, Pham VH. Random walks on mutual microRNA-target gene interaction network improve the prediction of disease-associated microRNAs. BMC Bioinformatics 2017; 18:479. [PMID: 29137601 PMCID: PMC5686822 DOI: 10.1186/s12859-017-1924-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/13/2017] [Accepted: 11/06/2017] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND MicroRNAs (miRNAs) have been shown to play an important role in pathological initiation, progression and maintenance. Because identification in the laboratory of disease-related miRNAs is not straightforward, numerous network-based methods have been developed to predict novel miRNAs in silico. Homogeneous networks (in which every node is a miRNA) based on the targets shared between miRNAs have been widely used to predict their role in disease phenotypes. Although such homogeneous networks can predict potential disease-associated miRNAs, they do not consider the roles of the target genes of the miRNAs. Here, we introduce a novel method based on a heterogeneous network that not only considers miRNAs but also the corresponding target genes in the network model. RESULTS Instead of constructing homogeneous miRNA networks, we built heterogeneous miRNA networks consisting of both miRNAs and their target genes, using databases of known miRNA-target gene interactions. In addition, as recent studies demonstrated reciprocal regulatory relations between miRNAs and their target genes, we considered these heterogeneous miRNA networks to be undirected, assuming mutual miRNA-target interactions. Next, we introduced a novel method (RWRMTN) operating on these mutual heterogeneous miRNA networks to rank candidate disease-related miRNAs using a random walk with restart (RWR) based algorithm. Using both known disease-associated miRNAs and their target genes as seed nodes, the method can identify additional miRNAs involved in the disease phenotype. Experiments indicated that RWRMTN outperformed two existing state-of-the-art methods: RWRMDA, a network-based method that also uses a RWR on homogeneous (rather than heterogeneous) miRNA networks, and RLSMDA, a machine learning-based method. Interestingly, we could relate this performance gain to the emergence of "disease modules" in the heterogeneous miRNA networks used as input for the algorithm. 
Moreover, we could demonstrate that RWRMTN is stable, performing well when using both experimentally validated and predicted miRNA-target gene interaction data for network construction. Finally, using RWRMTN, we identified 76 novel miRNAs associated with 23 disease phenotypes which were present in a recent database of known disease-miRNA associations. CONCLUSIONS Summarizing, using random walks on mutual miRNA-target networks improves the prediction of novel disease-associated miRNAs because of the existence of "disease modules" in these networks.
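Note: the abstract above ranks candidate miRNAs with a random walk with restart (RWR) on an undirected miRNA-target gene network, seeded with known disease miRNAs and their target genes. The following Python sketch illustrates generic RWR on such a network. It is not the RWRMTN implementation; the restart probability of 0.5, the function name, and the toy node names are assumptions made for illustration only.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Score nodes of an undirected graph by proximity to the seed nodes.

    adjacency: dict mapping node -> set of neighbours (must be symmetric).
    seeds: iterable of seed nodes, e.g. known disease miRNAs and their targets.
    restart: probability of jumping back to the seeds at each step (assumed value).
    """
    nodes = list(adjacency)
    seeds = set(seeds)
    # Initial distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            if adjacency[n]:
                # Spread the non-restart mass equally over the neighbours.
                share = (1.0 - restart) * p[n] / len(adjacency[n])
                for m in adjacency[n]:
                    nxt[m] += share
            # Isolated nodes pass nothing on in this simplified sketch.
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy heterogeneous network (hypothetical names): miRNAs linked to target genes.
toy_network = {
    "miR-a": {"gene1"},
    "miR-b": {"gene1", "gene2"},
    "miR-c": {"gene3"},
    "gene1": {"miR-a", "miR-b"},
    "gene2": {"miR-b"},
    "gene3": {"miR-c"},
}
rwr_scores = random_walk_with_restart(toy_network, seeds=["miR-a", "gene1"])
# Rank the non-seed miRNAs by their steady-state score.
ranking = sorted((n for n in toy_network if n.startswith("miR") and n != "miR-a"),
                 key=rwr_scores.get, reverse=True)
```

Here `miR-b` shares a target gene with the seed miRNA and so ranks above `miR-c`, which sits in a component the walker never reaches.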
Affiliation(s)
- Duc-Hau Le
  - Vinmec Research Institute of Stem Cell and Gene Technology, 458 Minh Khai, Hai Ba Trung, Hanoi, Vietnam
- Lieven Verbeke
  - Department of Information Technology, Ghent University - imec, Ghent, Belgium
- Le Hoang Son
  - VNU University of Science, Vietnam National University, Hanoi, Vietnam
- Dinh-Toi Chu
  - Faculty of Biology, Hanoi National University of Education, Hanoi, Vietnam
  - Institute of Research and Development, Duy Tan University, 03 Quang Trung, Da Nang, Vietnam
- Van-Huy Pham
  - Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
|
220
|
El-Hachem N, Ba-Alawi W, Smith I, Mer AS, Haibe-Kains B. Integrative cancer pharmacogenomics to establish drug mechanism of action: drug repurposing. Pharmacogenomics 2017; 18:1469-1472. [PMID: 29057710 DOI: 10.2217/pgs-2017-0132] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022] Open
Affiliation(s)
- Nehme El-Hachem
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Wail Ba-Alawi
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Ian Smith
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Arvind Singh Mer
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Benjamin Haibe-Kains
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Ontario Institute of Cancer Research, Toronto, Ontario, Canada
|
221
|
You ZH, Wang LP, Chen X, Zhang S, Li XF, Yan GY, Li ZW. PRMDA: personalized recommendation-based MiRNA-disease association prediction. Oncotarget 2017; 8:85568-85583. [PMID: 29156742 PMCID: PMC5689632 DOI: 10.18632/oncotarget.20996] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 02/15/2017] [Accepted: 08/29/2017] [Indexed: 12/23/2022] Open
Abstract
Recently, researchers have been increasingly focusing on microRNAs (miRNAs) with accumulating evidence indicating that miRNAs serve as a vital role in various biological processes and dysfunctions of miRNAs are closely related with human complex diseases. Predicting potential associations between miRNAs and diseases is attached considerable significance in the domains of biology, medicine, and bioinformatics. In this study, we developed a computational model of Personalized Recommendation-based MiRNA-Disease Association prediction (PRMDA) to predict potential related miRNA for all diseases by implementing personalized recommendation-based algorithm based on integrated similarity for diseases and miRNAs. PRMDA is a global method capable of prioritizing candidate miRNAs for all diseases simultaneously. Moreover, the model could be applied to diseases without any known associated miRNAs. PRMDA obtained AUC of 0.8315 based on leave-one-out cross validation, which demonstrated that PRMDA could be regarded as a reliable tool for miRNA-disease association prediction. Besides, we implemented PRMDA on the HMDD V1.0 and HMDD V2.0 databases for three kinds of case studies about five important human cancers in order to test the performance of the model from different perspectives. As a result, 92%, 94%, 88%, 96% and 88% out of the top 50 candidate miRNAs predicted by PRMDA for Colon Neoplasms, Esophageal Neoplasms, Lymphoma, Lung Neoplasms and Breast Neoplasms, respectively, were confirmed by experimental reports.
Affiliation(s)
- Zhu-Hong You
  - Department of Information Engineering, Xijing University, Xi’an, China
- Luo-Pin Wang
  - International Software School, Wuhan University, Wuhan, China
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Shanwen Zhang
  - Department of Information Engineering, Xijing University, Xi’an, China
- Xiao-Fang Li
  - Department of Information Engineering, Xijing University, Xi’an, China
- Gui-Ying Yan
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Zheng-Wei Li
  - School of Computer Science and Technology, Hefei, China
|
222
|
Buza K, Peška L. Drug–target interaction prediction with Bipartite Local Models and hubness-aware regression. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.04.055] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 02/03/2023]
|
223
|
Drug-target interaction prediction using ensemble learning and dimensionality reduction. Methods 2017; 129:81-88. [DOI: 10.1016/j.ymeth.2017.05.016] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 01/16/2017] [Revised: 04/03/2017] [Accepted: 05/18/2017] [Indexed: 11/23/2022] Open
|
224
|
Li XZ, Zhang SN, Yang XY. Combination of cheminformatics and bioinformatics to explore the chemical basis of the rhizomes and aerial parts of Dioscorea nipponica Makino. J Pharm Pharmacol 2017; 69:1846-1857. [PMID: 28940203 DOI: 10.1111/jphp.12825] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/24/2017] [Accepted: 08/26/2017] [Indexed: 01/09/2023]
Abstract
OBJECTIVES This study aimed to explore the chemical basis of the rhizomes and aerial parts of Dioscorea nipponica Makino (DN). METHODS The pharmacokinetic profiles of the compounds from DN were calculated via the ACD/I-Lab and PreADMET programs. Their potential therapeutic and toxicity targets were screened through the DrugBank's or T3DB's ChemQuery structure search. KEY FINDINGS Eleven of 48 compounds in the rhizomes and over half of the compounds in the aerial parts had moderate or good human oral bioavailability. Twenty-three of 48 compounds in the rhizomes and 40/43 compounds from the aerial parts had moderate or good permeability to intestinal cells. Forty-three of 48 compounds from the rhizomes and 18/43 compounds in the aerial parts bound weakly to the plasma proteins. Eleven of 48 compounds in the rhizomes and 36/43 compounds of the aerial parts might pass across the blood-brain barrier. Forty-three of 48 compounds in the rhizomes and 18/43 compounds from the aerial parts showed low renal excretion ability. The compounds in the rhizomes possessed 391 potential therapeutic targets and 216 potential toxicity targets. Additionally, the compounds from the aerial parts possessed 101 potential therapeutic targets and 183 potential toxicity targets. CONCLUSIONS These findings indicated that combination of cheminformatics and bioinformatics may facilitate achieving the objectives of this study.
Affiliation(s)
- Xu-Zhao Li
  - Pharmacy School, Guiyang University of Chinese Medicine, Guiyang, China
- Shuai-Nan Zhang
  - Pharmacy School, Guiyang University of Chinese Medicine, Guiyang, China
- Xu-Yan Yang
  - First Affiliated Hospital, Heilongjiang University of Chinese Medicine, Harbin, China
|
225
|
Luo Y, Zhao X, Zhou J, Yang J, Zhang Y, Kuang W, Peng J, Chen L, Zeng J. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun 2017; 8:573. [PMID: 28924171 PMCID: PMC5603535 DOI: 10.1038/s41467-017-00680-8] [Citation(s) in RCA: 386] [Impact Index Per Article: 55.1] [Received: 10/13/2016] [Accepted: 07/19/2017] [Indexed: 02/05/2023] Open
Abstract
The emergence of large-scale genomic, chemical and pharmacological data provides new opportunities for drug discovery and repositioning. In this work, we develop a computational pipeline, called DTINet, to predict novel drug-target interactions from a constructed heterogeneous network, which integrates diverse drug-related information. DTINet focuses on learning a low-dimensional vector representation of features, which accurately explains the topological properties of individual nodes in the heterogeneous network, and then makes prediction based on these representations via a vector space projection scheme. DTINet achieves substantial performance improvement over other state-of-the-art methods for drug-target interaction prediction. Moreover, we experimentally validate the novel interactions between three drugs and the cyclooxygenase proteins predicted by DTINet, and demonstrate the new potential applications of these identified cyclooxygenase inhibitors in preventing inflammatory diseases. These results indicate that DTINet can provide a practically useful tool for integrating heterogeneous information to predict new drug-target interactions and repurpose existing drugs. Network-based data integration for drug-target prediction is a promising avenue for drug repositioning, but performance is wanting. Here, the authors introduce DTINet, whose performance is enhanced in the face of noisy, incomplete and high-dimensional biological data by learning low-dimensional vector representations.
Affiliation(s)
- Yunan Luo
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Xinbin Zhao
  - School of Pharmaceutical Sciences, Tsinghua University, Beijing, 100084, China
- Jingtian Zhou
  - School of Pharmaceutical Sciences, Tsinghua University, Beijing, 100084, China
- Jinglin Yang
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
- Yanqing Zhang
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
- Wenhua Kuang
  - School of Pharmaceutical Sciences, Tsinghua University, Beijing, 100084, China
- Jian Peng
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Ligong Chen
  - School of Pharmaceutical Sciences, Tsinghua University, Beijing, 100084, China
  - Collaborative Innovation Center for Biotherapy, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Chengdu, 610041, China
- Jianyang Zeng
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084, China
|
226
|
Wang L, You ZH, Chen X, Xia SX, Liu F, Yan X, Zhou Y, Song KJ. A Computational-Based Method for Predicting Drug-Target Interactions by Using Stacked Autoencoder Deep Neural Network. J Comput Biol 2017; 25:361-373. [PMID: 28891684 DOI: 10.1089/cmb.2017.0135] [Citation(s) in RCA: 103] [Impact Index Per Article: 14.7] [Indexed: 12/18/2022] Open
Abstract
Identifying the interaction between drugs and target proteins is an important area of drug research, which provides a broad prospect for low-risk and faster drug development. However, due to the limitations of traditional experiments when revealing drug-protein interactions (DTIs), the screening of targets not only takes a lot of time and money but also has high false-positive and false-negative rates. Therefore, it is imperative to develop effective automatic computational methods to accurately predict DTIs in the postgenome era. In this article, we propose a new computational method for predicting DTIs from drug molecular structure and protein sequence by using the stacked autoencoder of deep learning, which can adequately extract the raw data information. The proposed method has the advantage that it can automatically mine the hidden information from protein sequences and generate highly representative features through iterations of multiple layers. The feature descriptors are then constructed by combining the molecular substructure fingerprint information, and fed into the rotation forest for accurate prediction. The experimental results of fivefold cross-validation indicate that the proposed method achieves superior performance on gold standard data sets (enzymes, ion channels, GPCRs [G-protein-coupled receptors], and nuclear receptors) with accuracy of 0.9414, 0.9116, 0.8669, and 0.8056, respectively. We further comprehensively explore the performance of the proposed method by comparing it with other feature extraction algorithms, state-of-the-art classifiers, and other excellent methods on the same data set. The excellent comparison results demonstrate that the proposed method is highly competitive when predicting drug-target interactions.
Affiliation(s)
- Lei Wang
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang, China
- Zhu-Hong You
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Science, Urumqi, China
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Shi-Xiong Xia
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
- Feng Liu
  - China National Coal Association, Beijing, China
- Xin Yan
  - School of Foreign Languages, Zaozhuang University, Zaozhuang, China
- Yong Zhou
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
- Ke-Jian Song
  - School of Information Engineering, JiangXi University of Science and Technology, Ganzhou, China
|
227
|
In silico prediction of drug-target interaction networks based on drug chemical structure and protein sequences. Sci Rep 2017; 7:11174. [PMID: 28894115 PMCID: PMC5593914 DOI: 10.1038/s41598-017-10724-0] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 03/23/2017] [Accepted: 08/14/2017] [Indexed: 01/09/2023] Open
Abstract
Analysis of drug–target interactions (DTIs) is of great importance in developing new drug candidates for known protein targets or discovering new targets for old drugs. However, the experimental approaches for identifying DTIs are expensive, laborious and challenging. In this study, we report a novel computational method for predicting DTIs using the highly discriminative information of drug-target interactions and our newly developed discriminative vector machine (DVM) classifier. More specifically, each target protein sequence is transformed as the position-specific scoring matrix (PSSM), in which the evolutionary information is retained; then the local binary pattern (LBP) operator is used to calculate the LBP histogram descriptor. For a drug molecule, a novel fingerprint representation is utilized to describe its chemical structure information representing existence of certain functional groups or fragments. When applying the proposed method to the four datasets (Enzyme, GPCR, Ion Channel and Nuclear Receptor) for predicting DTIs, we obtained good average accuracies of 93.16%, 89.37%, 91.73% and 92.22%, respectively. Furthermore, we compared the performance of the proposed model with that of the state-of-the-art SVM model and other previous methods. The achieved results demonstrate that our method is effective and robust and can be taken as a useful tool for predicting DTIs.
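Note: the abstract above applies a local binary pattern (LBP) operator to a position-specific scoring matrix and uses the resulting histogram as the protein descriptor. The Python sketch below shows the basic 3x3 LBP operator on a small integer matrix. The abstract does not say which LBP variant, radius, or bin count the authors used, so the 8-neighbour, 256-bin form here is an assumption, and the toy matrix is made up rather than a real PSSM.

```python
def lbp_histogram(matrix):
    """Return the 256-bin histogram of basic 3x3 local binary pattern codes.

    matrix: list of equal-length rows of numbers (e.g. a PSSM, one row per residue).
    Border cells are skipped because they lack a full 3x3 neighbourhood.
    """
    rows, cols = len(matrix), len(matrix[0])
    # Neighbour offsets, clockwise starting from the top-left cell.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = matrix[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as large as the centre.
                if matrix[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    return hist


# Made-up 3x4 matrix standing in for a (much larger) PSSM.
toy_pssm = [
    [1, 2, 3, 4],
    [0, 5, 1, 2],
    [3, 1, 4, 0],
]
descriptor = lbp_histogram(toy_pssm)
```

Because the histogram always has 256 bins, proteins of different lengths map to descriptors of the same size, which is what makes the histogram usable as a fixed-length classifier input.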
|
228
|
Zou S, Zhang J, Zhang Z. A novel approach for predicting microbe-disease associations by bi-random walk on the heterogeneous network. PLoS One 2017; 12:e0184394. [PMID: 28880967 PMCID: PMC5589230 DOI: 10.1371/journal.pone.0184394] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 04/24/2017] [Accepted: 08/23/2017] [Indexed: 02/07/2023] Open
Abstract
Since the microbiome has a significant impact on human health and disease, microbe-disease associations can be utilized as a valuable resource for understanding disease pathogenesis and promoting disease diagnosis and prognosis. Accordingly, it is necessary for researchers to achieve a comprehensive and deep understanding of the associations between microbes and diseases. Nevertheless, to date, little work has been achieved in implementing novel human microbe-disease association prediction models. In this paper, we develop a novel computational model to predict potential microbe-disease associations by bi-random walk on the heterogeneous network (BiRWHMDA). The heterogeneous network was constructed by connecting the microbe similarity network and the disease similarity network via known microbe-disease associations. Microbe similarity and disease similarity were calculated by the Gaussian interaction profile kernel similarity measure; moreover, a logistic function was applied to regulate disease similarity. Additionally, leave-one-out cross validation and 5-fold cross validation were implemented to evaluate the predictive performance of our method; both cross validation methods performed well. The leave-one-out cross validation experiment results illustrate that our method outperforms other previously proposed methods. Furthermore, case studies on asthma and inflammatory bowel disease prove the favorable performance of our method. In conclusion, our method can be considered as an effective computational model for predicting novel microbe-disease associations.
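Note: the abstract above computes microbe and disease similarity with the Gaussian interaction profile kernel. The Python sketch below shows the usual form of that kernel, in which the bandwidth is normalised by the mean squared norm of the interaction profiles. The bandwidth parameter `gamma_prime = 1.0` and the toy association matrix are assumptions, and the logistic adjustment of disease similarity mentioned in the abstract is not included.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a 0/1 matrix.

    profiles: list of rows; row i is the interaction profile of entity i.
    gamma_prime: bandwidth parameter before normalisation (assumed value).
    """
    n = len(profiles)
    # Normalise the bandwidth by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix contains no known associations")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist2)
    return kernel


# Toy association matrix: rows are microbes, columns are diseases.
assoc = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
microbe_sim = gip_kernel(assoc)
# Disease similarity uses the columns as profiles.
disease_sim = gip_kernel([list(col) for col in zip(*assoc)])
```

Two microbes with identical association profiles get similarity 1, and similarity decays as their profiles diverge.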
Affiliation(s)
- Shuai Zou
  - School of Information Science and Engineering, Central South University, Changsha, Hunan, China
- Jingpu Zhang
  - School of Information Science and Engineering, Central South University, Changsha, Hunan, China
- Zuping Zhang
  - School of Information Science and Engineering, Central South University, Changsha, Hunan, China
|
229
|
Peng L, Zhu W, Liao B, Duan Y, Chen M, Chen Y, Yang J. Screening drug-target interactions with positive-unlabeled learning. Sci Rep 2017; 7:8087. [PMID: 28808275 PMCID: PMC5556112 DOI: 10.1038/s41598-017-08079-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 03/10/2017] [Accepted: 07/04/2017] [Indexed: 02/03/2023] Open
Abstract
Identifying drug-target interaction (DTI) candidates is crucial for drug repositioning. However, usually only positive DTIs are deposited in known databases, which challenges computational methods to predict novel DTIs due to the lack of negative samples. To overcome this dilemma, researchers usually randomly select negative samples from unlabeled drug-target pairs, which introduces a lot of false-positives. In this study, a negative sample extraction method named NDTISE is first developed to screen strong negative DTI examples based on positive-unlabeled learning. A novel DTI screening framework, PUDTI, is then designed to infer new drug repositioning candidates by integrating NDTISE, probabilities that remaining ambiguous samples belong to the positive and negative classes, and an SVM-based optimization model. We investigated the effectiveness of NDTISE on a DTI data provided by NCPIS. NDTISE is much better than random selection and slightly outperforms NCPIS. We then compared PUDTI with 6 state-of-the-art methods on 4 classes of DTI datasets from human enzymes, ion channels, GPCRs and nuclear receptors. PUDTI achieved the highest AUC among the 7 methods on all 4 datasets. Finally, we validated a few top predicted DTIs through mining independent drug databases and literatures. In conclusion, PUDTI provides an effective pre-filtering method for new drug design.
Affiliation(s)
- Lihong Peng
  - Key Laboratory for Embedded and Network Computing of Hunan Province, College of Information Science and Engineering, Hunan University, Changsha Hunan, 410082, China
  - College of Information Engineering, Changsha Medical University, Changsha Hunan, 410219, China
- Wen Zhu
  - Key Laboratory for Embedded and Network Computing of Hunan Province, College of Information Science and Engineering, Hunan University, Changsha Hunan, 410082, China
- Bo Liao
  - Key Laboratory for Embedded and Network Computing of Hunan Province, College of Information Science and Engineering, Hunan University, Changsha Hunan, 410082, China
- Yu Duan
  - Hunan University of Commerce, Changsha Hunan, 410205, China
- Min Chen
  - Key Laboratory for Embedded and Network Computing of Hunan Province, College of Information Science and Engineering, Hunan University, Changsha Hunan, 410082, China
- Yi Chen
  - College of Drug, Changsha Medical University, Changsha Hunan, 410219, China
- Jialiang Yang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, NY, 10029, USA
|
230
|
Zhang W, Chien J, Yong J, Kuang R. Network-based machine learning and graph theory algorithms for precision oncology. NPJ Precis Oncol 2017; 1:25. [PMID: 29872707 PMCID: PMC5871915 DOI: 10.1038/s41698-017-0029-7] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 02/02/2017] [Revised: 06/28/2017] [Accepted: 06/29/2017] [Indexed: 01/07/2023] Open
Abstract
Network-based analytics plays an increasingly important role in precision oncology. Growing evidence in recent studies suggests that cancer can be better understood through mutated or dysregulated pathways or networks rather than individual mutations and that the efficacy of repositioned drugs can be inferred from disease modules in molecular networks. This article reviews network-based machine learning and graph theory algorithms for integrative analysis of personal genomic data and biomedical knowledge bases to identify tumor-specific molecular mechanisms, candidate targets and repositioned drugs for personalized treatment. The review focuses on the algorithmic design and mathematical formulation of these methods to facilitate applications and implementations of network-based analysis in the practice of precision oncology. We review the methods applied in three scenarios to integrate genomic data and network models in different analysis pipelines, and we examine three categories of network-based approaches for repositioning drugs in drug-disease-gene networks. In addition, we perform a comprehensive subnetwork/pathway analysis of mutations in 31 cancer genome projects in the Cancer Genome Atlas and present a detailed case study on ovarian cancer. Finally, we discuss interesting observations, potential pitfalls and future directions in network-based precision oncology.
Affiliation(s)
- Wei Zhang: Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, USA
- Jeremy Chien: Department of Cancer Biology, University of Kansas Medical Center, Kansas City, KS, USA
- Jeongsik Yong: Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN, USA
- Rui Kuang: Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, USA
|
231
|
Genome-wide predicting disease-related protein complexes by walking on the heterogeneous network based on data integration and Laplacian normalization. Comput Biol Chem 2017; 69:41-47. [DOI: 10.1016/j.compbiolchem.2017.04.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/23/2016] [Revised: 04/08/2017] [Accepted: 04/12/2017] [Indexed: 11/20/2022]
|
232
|
Yu L, Su R, Wang B, Zhang L, Zou Y, Zhang J, Gao L. Prediction of Novel Drugs for Hepatocellular Carcinoma Based on Multi-Source Random Walk. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:966-977. [PMID: 27076463 DOI: 10.1109/tcbb.2016.2550453] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 06/05/2023]
Abstract
Computational approaches for predicting drug-disease associations by integrating gene expression and biological networks provide great insight into the complex relationships among drugs, targets, disease genes, and diseases at a system level. Hepatocellular carcinoma (HCC) is one of the most common malignant tumors, with high morbidity and mortality. We provide an integrative framework to predict novel drugs for HCC based on multi-source random walk (PD-MRW). First, based on gene expression and a protein interaction network, we construct a gene-gene weighted interaction network (GWIN). Then, based on multi-source random walk in GWIN, we build a drug-drug similarity network. Finally, based on the known drugs for HCC, we score all drugs in the drug-drug similarity network. The robustness of our predictions, their overlap with those reported in the Comparative Toxicogenomics Database (CTD) and the literature, and their enriched KEGG pathways demonstrate that our approach can effectively identify new drug indications. Specifically, regorafenib (Rank = 9 in the top-20 list) has proven effective in Phase I and II clinical trials of HCC, and the Phase III trial is ongoing. It also has 11 overlapping pathways with HCC with lower p-values. Focusing on a particular disease, we believe our approach is more accurate and possesses better scalability.
|
233
|
Liu Y, Zeng X, He Z, Zou Q. Inferring microRNA-disease associations by random walk on a heterogeneous network with multiple data sources. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:905-915. [PMID: 27076459 DOI: 10.1109/tcbb.2016.2550432] [Citation(s) in RCA: 209] [Impact Index Per Article: 29.9] [Indexed: 05/21/2023]
Abstract
Since the discovery of the regulatory function of microRNA (miRNA), increased attention has focused on identifying the relationship between miRNA and disease. It has been suggested that computational methods are an efficient way to identify potential disease-related miRNAs for further confirmation using biological experiments. In this paper, we first highlighted three limitations commonly associated with previous computational methods. To resolve these limitations, we established a disease similarity subnetwork and a miRNA similarity subnetwork by integrating multiple data sources, where the disease similarity is composed of disease semantic similarity and disease functional similarity, and the miRNA similarity is calculated using the miRNA-target gene and miRNA-lncRNA (long non-coding RNA) associations. Then, a heterogeneous network was constructed by connecting the disease similarity subnetwork and the miRNA similarity subnetwork using the known miRNA-disease associations. We extended random walk with restart to predict miRNA-disease associations in the heterogeneous network. The leave-one-out cross-validation achieved an average area under the curve (AUC) of 0.8049 across 341 diseases and 476 miRNAs. For five-fold cross-validation, our method achieved an AUC from 0.7970 to 0.9249 for 15 human diseases. Case studies further demonstrated the feasibility of our method to discover potential miRNA-disease associations. An online service for prediction is freely available at http://ifmda.aliapp.com.
|
234
|
Le DH, Pham VH. HGPEC: a Cytoscape app for prediction of novel disease-gene and disease-disease associations and evidence collection based on a random walk on heterogeneous network. BMC SYSTEMS BIOLOGY 2017; 11:61. [PMID: 28619054 PMCID: PMC5472867 DOI: 10.1186/s12918-017-0437-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 04/03/2017] [Accepted: 05/31/2017] [Indexed: 12/31/2022]
Abstract
Background: Finding gene-disease and disease-disease associations plays an important role in the biomedical area, and many prioritization methods have been proposed for this goal. Among them, approaches based on a heterogeneous network of genes and diseases are considered state-of-the-art; they achieve high prediction performance and can be used for diseases with or without a known molecular basis. Results: Here, we developed a Cytoscape app, namely HGPEC, based on a random walk with restart algorithm on a heterogeneous network of genes and diseases. This app can prioritize candidate genes and diseases by employing a heterogeneous network consisting of a network of genes/proteins and a phenotypic disease similarity network. Based on the rankings, novel disease-gene and disease-disease associations can be identified. These associations can be supported with network- and rank-based visualization as well as evidence and annotations from biomedical data. A case study on prediction of novel breast cancer-associated genes and diseases shows the abilities of HGPEC. In addition, we showed the superior performance of HGPEC compared to other tools for prioritization of candidate disease genes. Conclusions: Taken together, our app is expected to effectively predict novel disease-gene and disease-disease associations and to support network- and rank-based visualization as well as biomedical evidence for such associations.
Affiliation(s)
- Duc-Hau Le: Vinmec Research Institute of Stem Cell and Gene Technology, 458 Minh Khai, Hai Ba Trung, Hanoi, Vietnam; Thuyloi University, 175 Tay Son, Dong Da, Hanoi, Vietnam
- Van-Huy Pham: Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
|
235
|
Abstract
Biological networks are powerful resources for the discovery of genes and genetic modules that drive disease. Fundamental to network analysis is the concept that genes underlying the same phenotype tend to interact; this principle can be used to combine and to amplify signals from individual genes. Recently, numerous bioinformatic techniques have been proposed for genetic analysis using networks, based on random walks, information diffusion and electrical resistance. These approaches have been applied successfully to identify disease genes, genetic modules and drug targets. In fact, all these approaches are variations of a unifying mathematical machinery - network propagation - suggesting that it is a powerful data transformation method of broad utility in genetic research.
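As a rough illustration of the network propagation idea described in this abstract, the sketch below runs a random walk with restart on a tiny weighted graph. It is a minimal example under stated assumptions, not the formulation used in the cited review: the function name, the adjacency-dictionary input format, the restart probability of 0.3 and the four-node toy graph are all hypothetical.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Propagate a seed signal over a weighted, undirected graph.

    adj   : dict mapping node -> {neighbour: edge weight} (hypothetical input format)
    seeds : set of nodes carrying the initial signal, e.g. known disease genes
    """
    nodes = sorted(adj)
    # Total edge weight per node, used to turn weights into step probabilities.
    strength = {u: sum(adj[u].values()) for u in nodes}
    # Restart distribution: uniform over the seed nodes.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed ...
        nxt = {u: restart * p0[u] for u in nodes}
        # ... otherwise it moves to a neighbour in proportion to edge weight.
        for u in nodes:
            if strength[u] == 0:
                continue  # isolated node: nothing to pass on
            share = (1.0 - restart) * p[u] / strength[u]
            for v, w in adj[u].items():
                nxt[v] += share * w
        change = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy path graph A - B - C - D with a single seed at A (illustrative data only).
graph = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(graph, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Nodes closer to the seed receive higher scores, which is the sense in which propagation amplifies signal from neighbouring genes. A dictionary-based graph keeps the example dependency-free; a real network would normally use a sparse matrix.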
|
236
|
A novel network regularized matrix decomposition method to detect mutated cancer genes in tumour samples with inter-patient heterogeneity. Sci Rep 2017; 7:2855. [PMID: 28588243 PMCID: PMC5460199 DOI: 10.1038/s41598-017-03141-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/27/2017] [Accepted: 04/20/2017] [Indexed: 01/01/2023] Open
Abstract
Inter-patient heterogeneity is a major challenge for the detection of mutated cancer genes, which is crucial to advancing cancer diagnostics and therapeutics. To detect mutated cancer genes in heterogeneous tumour samples, a prominent strategy is to determine whether the genes are recurrently mutated in their interaction network context. However, recent studies show that some cancer genes in different perturbed pathways are mutated in different subsets of samples. Consequently, these genes may not display significant mutational recurrence and thus remain undiscovered even when network information is considered. We develop a novel method called mCGfinder to efficiently detect mutated cancer genes in tumour samples with inter-patient heterogeneity. Based on a matrix decomposition framework that incorporates gene interaction network information, mCGfinder can successfully measure the significance of mutational recurrence of genes in a subset of samples. When applying mCGfinder to TCGA somatic mutation datasets of five types of cancers, we find that the genes detected by mCGfinder are significantly enriched for known cancer genes, and yield substantially smaller p-values than other existing methods. All the results demonstrate that mCGfinder is an efficient method for detecting mutated cancer genes.
|
237
|
Liu X, Zeng P, Cui Q, Zhou Y. Comparative analysis of genes frequently regulated by drugs based on connectivity map transcriptome data. PLoS One 2017; 12:e0179037. [PMID: 28575118 PMCID: PMC5456389 DOI: 10.1371/journal.pone.0179037] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/31/2017] [Accepted: 05/23/2017] [Indexed: 11/18/2022] Open
Abstract
Gene expression is perturbed by drugs to different extents. Analyzing genes whose expression is frequently regulated by drugs would be useful for screening candidate therapeutic targets and genes implicated in side effects. Here, we obtained the differential expression number (DEN) for genes profiled in Affymetrix microarrays from the Connectivity Map project, and conducted a systemic comparative computational analysis between high-DEN genes and other genes. Results indicated that genes with a higher down-/up-regulation number (down_h/up_h) tended to be clustered in the genome, and to have a lower homologous gene number, higher SNP density and more disease-related SNPs. Down_h and up_h were significantly enriched in cancer-related pathways, while genes with a lower down-/up-regulation number (down_l/up_l) were mainly involved in the development of nervous system diseases. In addition, up_h had a lower interaction network degree, a later developmental stage of expression and higher tissue expression specificity than up_l, while down_h showed the reverse tendency in comparison with down_l. Together, our analysis suggests that genes frequently regulated by drugs are more likely to be associated with disease-related functions, but the extensive activation of conserved and widely expressed genes by drugs is disfavored.
Affiliation(s)
- Xinhua Liu: Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Pan Zeng: Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Haidian District, Beijing, China; Centre for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Haidian District, Beijing, China
- Qinghua Cui (corresponding author): Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Haidian District, Beijing, China; Centre for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Haidian District, Beijing, China
- Yuan Zhou (corresponding author): Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Haidian District, Beijing, China; Centre for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Haidian District, Beijing, China
|
238
|
Peng C, Li A. A Heterogeneous Network Based Method for Identifying GBM-Related Genes by Integrating Multi-Dimensional Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:713-720. [PMID: 28113912 DOI: 10.1109/tcbb.2016.2555314] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 06/06/2023]
Abstract
The emergence of multi-dimensional data offers opportunities for more comprehensive analysis of the molecular characteristics of human diseases and therefore for improving diagnosis, treatment, and prevention. In this study, we proposed a heterogeneous network based method that integrates multi-dimensional data (HNMD) to identify GBM-related genes. The novelty of the method is that the multi-dimensional GBM data from the TCGA dataset, which provide comprehensive information on genes, are combined with protein-protein interactions to construct a weighted heterogeneous network that reflects both the general and the disease-specific relationships between genes. In addition, a propagation algorithm with resistance is introduced to precisely score and rank GBM-related genes. The results of a comprehensive performance evaluation show that the proposed method significantly outperforms network based methods with single-dimensional data and other existing approaches. Subsequent analysis of the top-ranked genes suggests they may be functionally implicated in GBM, which further corroborates the superiority of the proposed method. The source code and the results of HNMD can be downloaded from the following URL: http://bioinformatics.ustc.edu.cn/hnmd/.
|
239
|
Lotfi Shahreza M, Ghadiri N, Mousavi SR, Varshosaz J, Green JR. Heter-LP: A heterogeneous label propagation algorithm and its application in drug repositioning. J Biomed Inform 2017; 68:167-183. [DOI: 10.1016/j.jbi.2017.03.006] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/06/2016] [Revised: 02/09/2017] [Accepted: 03/10/2017] [Indexed: 12/14/2022]
|
240
|
Zhou X, Ding L, Li Z, Wan R. Collaborator recommendation in heterogeneous bibliographic networks using random walks. INFORM RETRIEVAL J 2017. [DOI: 10.1007/s10791-017-9300-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 11/24/2022]
|
241
|
Wen M, Zhang Z, Niu S, Sha H, Yang R, Yun Y, Lu H. Deep-Learning-Based Drug–Target Interaction Prediction. J Proteome Res 2017; 16:1401-1409. [DOI: 10.1021/acs.jproteome.6b00618] [Citation(s) in RCA: 279] [Impact Index Per Article: 39.9] [Indexed: 01/01/2023]
Affiliation(s)
- Ming Wen: College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Zhimin Zhang: College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Shaoyu Niu: College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Haozhi Sha: College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Ruihan Yang: College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Yonghuan Yun: Institute of Environment and Plant Protection, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, PR China
- Hongmei Lu: College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
|
242
|
Zhang SN, Li XZ, Yang XY. Drug-likeness prediction of chemical constituents isolated from Chinese materia medica Ciwujia. JOURNAL OF ETHNOPHARMACOLOGY 2017; 198:131-138. [PMID: 28065780 DOI: 10.1016/j.jep.2017.01.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 04/07/2016] [Revised: 12/10/2016] [Accepted: 01/04/2017] [Indexed: 06/06/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE: Ciwujia (CWJ), one of the most commonly used Chinese materia medicas (CMMs), is derived from the roots, rhizomes, and stems of Acanthopanax senticosus Harms (AS). CWJ has been used for the treatment of various central nervous system (CNS) and peripheral system diseases. Drug-likeness prediction can help to analyze the absorption, distribution, metabolism, and excretion (ADME) processes of the compounds in CWJ, as well as their potential therapeutic and toxic effects, which is significant for confirming the active material bases of CWJ. MATERIALS AND METHODS: The ADME properties of the compounds were calculated with the web-based PreADMET program and ACD/I-Lab 2.0. The potential therapeutic and toxicity targets of these compounds were screened with the ChemQuery tool in DrugBank and T3DB. RESULTS: 14/39 compounds had moderate or good oral bioavailability (OB). 29/39 compounds bound weakly to plasma proteins. 18/39 compounds might cross the blood-brain barrier (BBB). Most of these compounds showed low renal excretion ability. 25/39 compounds had 99 structurally similar drugs and 158 potential therapeutic targets. Additionally, 17/39 compounds had 53 structurally similar toxins and 126 potential toxicity targets. CONCLUSION: Our study suggests that these compounds have a certain drug-likeness potential and are also likely to be the material bases of CWJ. These results may provide a reference for the safe use of CWJ and the expansion of its application scope.
Affiliation(s)
- Shuai-Nan Zhang: Department of Pharmacy, Guiyang University of Chinese Medicine, Guiyang 550025, PR China
- Xu-Zhao Li: Department of Pharmacy, Guiyang University of Chinese Medicine, Guiyang 550025, PR China
- Xu-Yan Yang: First Affiliated Hospital, Heilongjiang University of Chinese Medicine, Harbin 150040, PR China
|
243
|
SELF-BLM: Prediction of drug-target interactions via self-training SVM. PLoS One 2017; 12:e0171839. [PMID: 28192537 PMCID: PMC5305209 DOI: 10.1371/journal.pone.0171839] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 09/29/2016] [Accepted: 01/26/2017] [Indexed: 01/08/2023] Open
Abstract
Predicting drug-target interactions is important for the development of novel drugs and the repositioning of drugs. To predict such interactions, there are a number of methods based on drug and target protein similarity. Although these methods, such as the bipartite local model (BLM), show promise, they often categorize unknown interactions as negative interactions. Therefore, these methods are not ideal for finding potential drug-target interactions that have not yet been validated as positive interactions. Thus, here we propose a method that integrates machine learning techniques, such as self-training support vector machine (SVM) and BLM, to develop a self-training bipartite local model (SELF-BLM) that facilitates the identification of potential interactions. The method first categorizes unlabeled interactions and negative interactions among unknown interactions using a clustering method. Then, using the BLM method and self-training SVM, the unlabeled interactions are self-trained and final local classification models are constructed. When applied to four classes of proteins that include enzymes, G-protein coupled receptors (GPCRs), ion channels, and nuclear receptors, SELF-BLM showed the best performance for predicting not only known interactions but also potential interactions in three protein classes compared to other related studies. The implemented software and supporting data are available at https://github.com/GIST-CSBL/SELF-BLM.
|
244
|
Stanfield Z, Coşkun M, Koyutürk M. Drug Response Prediction as a Link Prediction Problem. Sci Rep 2017; 7:40321. [PMID: 28067293 PMCID: PMC5220354 DOI: 10.1038/srep40321] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 08/24/2016] [Accepted: 12/01/2016] [Indexed: 12/23/2022] Open
Abstract
Drug response prediction is a well-studied problem in which the molecular profile of a given sample is used to predict the effect of a given drug on that sample. Effective solutions to this problem hold the key for precision medicine. In cancer research, genomic data from cell lines are often utilized as features to develop machine learning models predictive of drug response. Molecular networks provide a functional context for the integration of genomic features, thereby resulting in robust and reproducible predictive models. However, inclusion of network data increases dimensionality and poses additional challenges for common machine learning tasks. To overcome these challenges, we here formulate drug response prediction as a link prediction problem. For this purpose, we represent drug response data for a large cohort of cell lines as a heterogeneous network. Using this network, we compute “network profiles” for cell lines and drugs. We then use the associations between these profiles to predict links between drugs and cell lines. Through leave-one-out cross validation and cross-classification on independent datasets, we show that this approach leads to accurate and reproducible classification of sensitive and resistant cell line-drug pairs, with 85% accuracy. We also examine the biological relevance of the network profiles.
Affiliation(s)
- Zachary Stanfield: Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH, 44106, USA
- Mustafa Coşkun: Department of Electrical Engineering and Computer Science, Case School of Engineering, Case Western Reserve University, Cleveland, OH, 44106, USA
- Mehmet Koyutürk: Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH, 44106, USA; Department of Electrical Engineering and Computer Science, Case School of Engineering, Case Western Reserve University, Cleveland, OH, 44106, USA
|
245
|
Zhang W, Chen Y, Liu F, Luo F, Tian G, Li X. Predicting potential drug-drug interactions by integrating chemical, biological, phenotypic and network data. BMC Bioinformatics 2017; 18:18. [PMID: 28056782 PMCID: PMC5217341 DOI: 10.1186/s12859-016-1415-9] [Citation(s) in RCA: 140] [Impact Index Per Article: 20.0] [Received: 07/20/2016] [Accepted: 12/09/2016] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND: Drug-drug interactions (DDIs) are one of the major concerns in drug discovery. Accurate prediction of potential DDIs can help to reduce unexpected interactions in the entire lifecycle of drugs, and is important for drug safety surveillance. RESULTS: Since many DDIs are not detected or observed in clinical trials, this work aims to predict unobserved or undetected DDIs. In this paper, we collect a variety of drug data that may influence drug-drug interactions, i.e., drug substructure data, drug target data, drug enzyme data, drug transporter data, drug pathway data, drug indication data, drug side effect data, drug off side effect data and known drug-drug interactions. We adopt three representative methods (the neighbor recommender method, the random walk method and the matrix perturbation method) to build prediction models based on different data, and thereby evaluate the usefulness of different information sources for DDI prediction. Further, we present flexible frameworks for integrating different models with suitable ensemble rules, including the weighted average ensemble rule and the classifier ensemble rule, and develop ensemble models to achieve better performance. CONCLUSIONS: The experiments demonstrate that different data sources provide diverse information, and that the DDI network based on known DDIs is one of the most important information sources for DDI prediction. The ensemble methods can produce better performance than individual methods, and outperform existing state-of-the-art methods. The datasets and source code are available at https://github.com/zw9977129/drug-drug-interaction/.
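The weighted average ensemble rule mentioned in this abstract can be sketched as follows. This is only an illustration of the general idea, not the authors' implementation: the function name, the dictionary-of-pairs score format, the example scores and the weights 0.75/0.25 are all assumptions made for the example, and how the weights would be chosen is left open.

```python
def weighted_average_ensemble(score_maps, weights):
    """Combine per-model interaction scores with a weighted average.

    score_maps : list of dicts mapping a drug pair to that model's score
                 (hypothetical format; a missing pair counts as score 0.0)
    weights    : one non-negative weight per model
    """
    if len(score_maps) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Every pair scored by at least one model gets a combined score.
    pairs = set().union(*score_maps)
    return {
        pair: sum(w * m.get(pair, 0.0) for m, w in zip(score_maps, weights)) / total
        for pair in pairs
    }


# Illustrative scores from two base models (made-up numbers).
neighbor_scores = {("d1", "d2"): 0.9, ("d1", "d3"): 0.2}
walk_scores = {("d1", "d2"): 0.7, ("d1", "d3"): 0.4}
combined = weighted_average_ensemble(
    [neighbor_scores, walk_scores], weights=[0.75, 0.25]
)
```

Dividing by the weight total means the weights do not have to sum to one, so they can be set directly from, for example, each model's validation performance.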
Affiliation(s)
- Wen Zhang: State Key Lab of Software Engineering, Wuhan University, Wuhan, 430072, China; School of Computer, Wuhan University, Wuhan, 430072, China
- Yanlin Chen: School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China
- Feng Liu: International School of Software, Wuhan University, Wuhan, 430072, China
- Fei Luo: State Key Lab of Software Engineering, Wuhan University, Wuhan, 430072, China; School of Computer, Wuhan University, Wuhan, 430072, China
- Gang Tian: State Key Lab of Software Engineering, Wuhan University, Wuhan, 430072, China; School of Computer, Wuhan University, Wuhan, 430072, China
- Xiaohong Li: State Key Lab of Software Engineering, Wuhan University, Wuhan, 430072, China; School of Computer, Wuhan University, Wuhan, 430072, China
|
246
|
Chen FS, Jiang HY, Jiang Z. Prediction of drug–pathway interaction pairs with a disease-combined LSA-PU-KNN method. MOLECULAR BIOSYSTEMS 2017; 13:2583-2591. [DOI: 10.1039/c7mb00441a] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/29/2023]
Abstract
This paper proposes a prediction of potential associations between drugs and pathways based on a disease-related LSA-PU-KNN method.
Affiliation(s)
- Fan-Shu Chen: Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai 200262, China; Department of Computer Science and Technology
- Hui-Yan Jiang: Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai 200262, China; Department of Computer Science and Technology
- Zhenran Jiang: Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai 200262, China; Department of Computer Science and Technology
|
247
|
Liu H, Song Y, Guan J, Luo L, Zhuang Z. Inferring new indications for approved drugs via random walk on drug-disease heterogenous networks. BMC Bioinformatics 2016; 17:539. [PMID: 28155639 PMCID: PMC5259862 DOI: 10.1186/s12859-016-1336-7] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Indexed: 01/03/2023] Open
Abstract
Background: Since traditional drug research and development is often time-consuming and high-risk, there is an increasing interest in establishing new medical indications for approved drugs, referred to as drug repositioning, which provides a relatively low-cost and high-efficiency approach for drug discovery. With the explosive growth of large-scale biochemical and phenotypic data, drug repositioning holds great potential for precision medicine in the post-genomic era. It is urgent to develop rational and systematic approaches to predict new indications for approved drugs on a large scale. Results: In this paper, we propose two-pass random walks with restart on a heterogeneous network, TP-NRWRH for short, to predict new indications for approved drugs. Rather than performing a random walk on a bipartite network, we integrated the drug-drug similarity network, the disease-disease similarity network and the known drug-disease association network into one heterogeneous network, on which the two-pass random walks with restart are implemented. We conducted a performance evaluation on two datasets of drug-disease associations, and the results show that our method has higher performance than six existing methods. A case study on Alzheimer's disease showed that nine of the top 10 predicted drugs have been approved or are investigational for neurodegenerative diseases. The experimental results show that our method achieves state-of-the-art performance in predicting new indications for approved drugs. Conclusions: We proposed a two-pass random walk with restart on the drug-disease heterogeneous network, referred to as TP-NRWRH, to predict new indications for approved drugs. Performance evaluation on two independent datasets showed that TP-NRWRH achieved higher performance than six existing methods in 10-fold cross-validation. The case study on Alzheimer's disease showed that nine of the top 10 predicted drugs have been approved or are investigational for neurodegenerative diseases. The results show that our method achieves state-of-the-art performance in predicting new indications for approved drugs.
Affiliation(s)
- Hui Liu: Changzhou No. 7 People's Hospital, Changzhou, Jiangsu, 213011, China; Changzhou University, Jiangsu, 213164, China
- Yinglong Song: Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, 200433, China
- Jihong Guan: Department of Computer Science and Technology, Tongji University, Shanghai, 201804, China
- Libo Luo: Changzhou No. 7 People's Hospital, Changzhou, Jiangsu, 213011, China
- Ziheng Zhuang: Changzhou No. 7 People's Hospital, Changzhou, Jiangsu, 213011, China; Changzhou University, Jiangsu, 213164, China
|
248
|
Ezzat A, Wu M, Li XL, Kwoh CK. Drug-target interaction prediction via class imbalance-aware ensemble learning. BMC Bioinformatics 2016; 17:509. [PMID: 28155697 PMCID: PMC5259867 DOI: 10.1186/s12859-016-1377-y] [Citation(s) in RCA: 74] [Impact Index Per Article: 9.3] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers with the purpose of predicting new undiscovered interactions. However, a key challenge in this data that these methods have not yet addressed, namely class imbalance, potentially degrades prediction performance. Class imbalance can be divided into two sub-problems. Firstly, the number of known interacting drug-target pairs is much smaller than that of non-interacting drug-target pairs. This imbalance between interacting and non-interacting drug-target pairs is referred to as the between-class imbalance. Between-class imbalance degrades prediction performance by biasing prediction results towards the majority class (i.e. the non-interacting pairs), leading to more prediction errors in the minority class (i.e. the interacting pairs). Secondly, the data contain multiple types of drug-target interactions, some of which have relatively fewer members (i.e. are less represented) than others. This variation in the representation of the different interaction types leads to another kind of imbalance referred to as the within-class imbalance. In within-class imbalance, prediction results are biased towards the better represented interaction types, leading to more prediction errors in the less represented interaction types.
RESULTS We propose an ensemble learning method that incorporates techniques to address the issues of between-class imbalance and within-class imbalance. Experiments show that the proposed method improves results over 4 state-of-the-art methods. In addition, we simulated cases for new drugs and targets, i.e. those for which no prior interactions are known, to see how our method would perform in predicting their interactions. Our method displayed satisfactory prediction performance and was able to predict many of the interactions successfully.
CONCLUSIONS Our proposed method has improved the prediction performance over the existing work, thus proving the importance of addressing problems pertaining to class imbalance in the data.
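The abstract does not spell out how the ensemble handles imbalance, so the following is only a generic illustration of one common way to counter between-class imbalance: each base learner sees all positive pairs plus an equally sized random undersample of the negative pairs, and the ensemble score is the fraction of learners voting "interacting". The nearest-centroid base learner, the function names, and the toy feature vectors are all assumptions made for this sketch, not the authors' method; within-class imbalance is not addressed here.

```python
import random

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_balanced_ensemble(pos, neg, n_models=15, seed=0):
    """Train n_models nearest-centroid learners (hypothetical base learner).

    Each learner uses every positive example and a random undersample of
    negatives of the same size, so no single learner is dominated by the
    majority (non-interacting) class.
    """
    rng = random.Random(seed)
    k = min(len(pos), len(neg))
    pos_centroid = centroid(pos)
    models = []
    for _ in range(n_models):
        neg_sample = rng.sample(neg, k)  # undersample the majority class
        models.append((pos_centroid, centroid(neg_sample)))
    return models

def predict_score(models, x):
    """Fraction of base learners that place x closer to the positive centroid."""
    votes = sum(1 for c_pos, c_neg in models if sq_dist(x, c_pos) < sq_dist(x, c_neg))
    return votes / len(models)

# Toy data (made up): 3 interacting pairs versus 50 non-interacting pairs.
pos = [[0.9, 1.0], [1.1, 0.9], [1.0, 1.2]]
neg = [[0.1 * (i % 5), 0.1 * ((i // 5) % 5)] for i in range(50)]
models = train_balanced_ensemble(pos, neg)
print(predict_score(models, [1.0, 1.0]))  # query near the positives
print(predict_score(models, [0.2, 0.2]))  # query near the negatives
```

Averaging over many undersampled learners lets the ensemble use most of the negative data overall while keeping each individual training set balanced.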
Affiliation(s)
- Ali Ezzat: School of Computer Science & Engineering, Nanyang Technological University, Nanyang Ave., Singapore, 639798, Singapore
- Min Wu: Institute for Infocomm Research (I2R), A*Star, Fusionopolis Way, Singapore, 138632, Singapore
- Xiao-Li Li: Institute for Infocomm Research (I2R), A*Star, Fusionopolis Way, Singapore, 138632, Singapore
- Chee-Keong Kwoh: School of Computer Science & Engineering, Nanyang Technological University, Nanyang Ave., Singapore, 639798, Singapore
|
249
|
Lim H, Gray P, Xie L, Poleksic A. Improved genome-scale multi-target virtual screening via a novel collaborative filtering approach to cold-start problem. Sci Rep 2016; 6:38860. [PMID: 27958331 PMCID: PMC5153628 DOI: 10.1038/srep38860] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 08/22/2016] [Accepted: 11/15/2016] [Indexed: 12/18/2022] Open
Abstract
The conventional one-drug-one-gene approach has had limited success in modern drug discovery. Polypharmacology, which focuses on searching for multi-targeted drugs to perturb disease-causing networks instead of designing selective ligands to target individual proteins, has emerged as a new drug discovery paradigm. Although many methods for single-target virtual screening have been developed to improve the efficiency of drug discovery, few of these algorithms are designed for polypharmacology. Here, we present a novel theoretical framework and a corresponding algorithm for genome-scale multi-target virtual screening based on the one-class collaborative filtering technique. Our method overcomes the sparseness of the protein-chemical interaction data by means of interaction matrix weighting and dual regularization from both chemicals and proteins. While the statistical foundation behind our method is general enough to encompass genome-wide drug off-target prediction, the program is specifically tailored to find protein targets for new chemicals with little to no available interaction data. We extensively evaluate our method using a number of the most widely accepted gene-specific and cross-gene family benchmarks and demonstrate that our method outperforms other state-of-the-art algorithms for predicting the interaction of new chemicals with multiple proteins. Thus, the proposed algorithm may provide a powerful tool for multi-target drug design.
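To make the idea of interaction matrix weighting in one-class collaborative filtering concrete, here is a minimal sketch of weighted matrix factorization: observed interactions (1) get full weight, unobserved cells (0) get a small weight because they may be unknown rather than truly negative. The abstract's dual regularization from chemical and protein similarities is replaced here by plain L2 regularization; the function names, hyperparameters, and the toy matrix are assumptions for illustration and do not reproduce the authors' algorithm.

```python
import random

def weighted_mf(R, rank=2, w_unobserved=0.1, reg=0.05, lr=0.05, epochs=500, seed=0):
    """Factorize a binary chemical-by-protein matrix R as U @ V^T.

    Minimizes sum_ij w_ij * (R_ij - u_i . v_j)^2 + reg * (|U|^2 + |V|^2)
    by stochastic gradient descent, with w_ij = 1 for observed interactions
    and w_ij = w_unobserved for all other cells.
    """
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(m)]
    for _ in range(epochs):
        for i in range(n):
            for j in range(m):
                w = 1.0 if R[i][j] == 1 else w_unobserved
                pred = sum(U[i][k] * V[j][k] for k in range(rank))
                err = w * (R[i][j] - pred)
                for k in range(rank):
                    u, v = U[i][k], V[j][k]
                    U[i][k] += lr * (err * v - reg * u)
                    V[j][k] += lr * (err * u - reg * v)
    return U, V

def score(U, V, i, j):
    """Predicted interaction strength for chemical i and protein j."""
    return sum(a * b for a, b in zip(U[i], V[j]))

# Toy matrix (made up): chemicals 0-1 bind proteins 0-1, chemicals 2-3 bind
# proteins 2-3. The pair (1, 1) is left unobserved to act as a held-out case.
R = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
U, V = weighted_mf(R)
print(score(U, V, 1, 1), score(U, V, 1, 3))
```

Because unobserved cells are only weakly pulled toward zero, the low-rank structure can assign the held-out pair (1, 1) a higher score than a pair from a different block such as (1, 3).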
Affiliation(s)
- Hansaim Lim: Department of Computer Science, Hunter College, The City University of New York, New York, New York 10065, United States
- Paul Gray: Department of Computer Science, University of Northern Iowa, Cedar Falls, Iowa 50614, United States
- Lei Xie: Department of Computer Science, Hunter College, The City University of New York, New York, New York 10065, United States; Ph.D. Program in Computer Science, Biochemistry and Biology, The Graduate Center, The City University of New York, New York, New York 10065, United States
- Aleksandar Poleksic: Department of Computer Science, University of Northern Iowa, Cedar Falls, Iowa 50614, United States
|
250
|
Computational Discovery of Putative Leads for Drug Repositioning through Drug-Target Interaction Prediction. PLoS Comput Biol 2016; 12:e1005219. [PMID: 27893735 PMCID: PMC5125559 DOI: 10.1371/journal.pcbi.1005219] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 07/27/2016] [Accepted: 10/21/2016] [Indexed: 12/23/2022] Open
Abstract
De novo experimental drug discovery is an expensive and time-consuming task. It requires the identification of drug-target interactions (DTIs) towards targets of biological interest, either to inhibit or enhance a specific molecular function. Dedicated computational models for protein simulation and DTI prediction are crucial for speeding up DTI identification and reducing its costs. In this paper we present a computational pipeline that enables the discovery of putative leads for drug repositioning and that can be applied to any microbial proteome, as long as the interactome of interest is at least partially known. Network metrics calculated for the interactome of the bacterial organism of interest were used to identify putative drug-targets. Then, a random forest classification model for DTI prediction was constructed using known DTI data from publicly available databases, resulting in an area under the ROC curve of 0.91 for classification of out-of-sample data. A drug-target network was created by combining 3,081 unique ligands and the expected ten best drug targets. This network was used to predict new DTIs and to calculate the probability of the positive class, allowing the scoring of the predicted instances. Molecular docking experiments were performed on the best scoring DTI pairs and the results were compared with those of the same ligands with their original targets. The results obtained suggest that the proposed pipeline can be used in the identification of new leads for drug repositioning. The proposed classification model is available at http://bioinformatics.ua.pt/software/dtipred/.
The emergence of multi-resistant bacterial strains and the existing void in the discovery and development of new classes of antibiotics are a growing concern. Indeed, some bacterial strains are now resistant to last-line antibiotics and considered untreatable. Drug repositioning has been suggested as a strategy to reduce the time and cost required for a drug to reach the market, compared to traditional drug design. Drug-target interactions (DTIs) are the basis of rational drug design; we therefore propose a computational approach to predict DTIs based solely on the primary sequence of the protein and the simplified molecular-input line-entry system (SMILES) representation of the ligand. In addition, network metrics are used to identify vital putative drug-targets in bacteria. Molecular docking experiments were performed to compare the binding affinities between a given ligand and a putative drug-target with those between the same ligand and its original targets. According to the docking results, the predicted DTIs have binding affinities better than or similar to those of the ligands with their real targets, indicating the validity of the proposed model.
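The abstract does not say which network metrics were used to pick putative drug-targets, so the sketch below assumes the simplest one, degree centrality, purely for illustration: proteins with many interaction partners in the interactome are ranked first. The function names and the tiny edge list with protein labels A to E are hypothetical.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Degree centrality for an undirected interaction network.

    edges is an iterable of (protein_a, protein_b) pairs; self-loops are
    ignored and duplicate edges are counted once. Each degree is divided
    by (n - 1), the maximum possible number of partners.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}

def top_targets(edges, k=10):
    """Return the k most central proteins, ties broken alphabetically."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]

# Hypothetical interactome fragment.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
print(degree_centrality(edges)["A"])  # 0.75
print(top_targets(edges, k=2))        # ['A', 'B']
```

In a pipeline like the one described, such a ranking would only shortlist candidate targets; the interaction predictions themselves come from the separate classification model.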
|