51
Yamanishi Y, Kotera M, Moriya Y, Sawada R, Kanehisa M, Goto S. DINIES: drug-target interaction network inference engine based on supervised analysis. Nucleic Acids Res 2014; 42:W39-45. [PMID: 24838565 PMCID: PMC4086078 DOI: 10.1093/nar/gku337] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.0] [Indexed: 01/31/2023]
Abstract
DINIES (drug–target interaction network inference engine based on supervised analysis) is a web server for predicting unknown drug–target interaction networks from various types of biological data (e.g. chemical structures, drug side effects, amino acid sequences and protein domains) in the framework of supervised network inference. The originality of DINIES lies in prediction with state-of-the-art machine learning methods, in the integration of heterogeneous biological data and in compatibility with the KEGG database. The DINIES server accepts any ‘profiles’ or precalculated similarity matrices (or ‘kernels’) of drugs and target proteins in tab-delimited file format. When a training data set is submitted to learn a predictive model, users can select either known interaction information in the KEGG DRUG database or their own interaction data. The user can also select an algorithm for supervised network inference, select various parameters in the method and specify weights for heterogeneous data integration. The server can provide integrative analyses with useful components in KEGG, such as biological pathways, functional hierarchy and human diseases. DINIES (http://www.genome.jp/tools/dinies/) is publicly available as one of the genome analysis tools in GenomeNet.
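The weighting of heterogeneous data mentioned in the abstract can be pictured as a weighted sum of similarity matrices. The sketch below is only an illustration under that assumption: the abstract does not state how DINIES combines its kernels, and the function name `combine_kernels`, the weight normalization and the toy matrices are all hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def combine_kernels(kernels: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Return the normalized weighted sum of equally sized square similarity matrices."""
    if not kernels or len(kernels) != len(weights):
        raise ValueError("need at least one kernel and exactly one weight per kernel")
    n = len(kernels[0])
    for kernel in kernels:
        if len(kernel) != n or any(len(row) != n for row in kernel):
            raise ValueError("all kernels must be n x n with the same n")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = [[0.0] * n for _ in range(n)]
    for kernel, weight in zip(kernels, weights):
        share = weight / total  # normalize so the weights sum to 1 (an assumption)
        for i in range(n):
            for j in range(n):
                combined[i][j] += share * kernel[i][j]
    return combined


# Hypothetical toy similarity matrices for three drugs.
chem_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
side_effect_sim = [[1.0, 0.2, 0.5], [0.2, 1.0, 0.4], [0.5, 0.4, 1.0]]

# Chemical structure similarity is weighted three times as heavily as side-effect similarity.
drug_kernel = combine_kernels([chem_sim, side_effect_sim], weights=[3.0, 1.0])
```

A weighted sum of positive semidefinite kernels with non-negative weights is itself positive semidefinite, which is why this form of integration is a common choice in kernel methods.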
Affiliation(s)
- Yoshihiro Yamanishi
  - Division of System Cohort, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
  - Institute for Advanced Study, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka 812-8581, Japan
- Masaaki Kotera
  - Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
- Yuki Moriya
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
- Ryusuke Sawada
  - Division of System Cohort, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
- Minoru Kanehisa
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
- Susumu Goto
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
52
Tabei Y, Yamanishi Y. Scalable prediction of compound-protein interactions using minwise hashing. BMC Syst Biol 2013; 7 Suppl 6:S3. [PMID: 24564870 PMCID: PMC4029277 DOI: 10.1186/1752-0509-7-s6-s3] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 11/28/2022]
Abstract
The identification of compound-protein interactions plays a key role in drug development, both in the discovery of new drug leads and of new therapeutic protein targets. There is therefore a strong incentive to develop new, efficient methods for predicting compound-protein interactions on a genome-wide scale. In this paper we develop a novel chemogenomic method to make scalable predictions of compound-protein interactions from heterogeneous biological data using minwise hashing. The proposed method consists of two main steps: 1) construction of new compact fingerprints for compound-protein pairs by an improved minwise hashing algorithm, and 2) application of a sparsity-induced classifier to the compact fingerprints. We test the proposed method on its ability to make large-scale predictions of compound-protein interactions from compound substructure fingerprints and protein domain fingerprints, and show that it outperforms previous chemogenomic methods in terms of prediction accuracy, computational efficiency, and interpretability of the predictive model. None of the previously developed methods is computationally feasible for the full dataset of about 200 million compound-protein pairs. The proposed method is expected to be useful for virtual screening of a huge number of compounds against many protein targets.
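To make the first step concrete, here is a minimal sketch of plain minwise hashing applied to a compound-protein pair. It is not the improved algorithm of the paper, whose details the abstract does not give. The pair is assumed to be represented by the set of (substructure, domain) combinations, and the salted BLAKE2b hash family, the function names and the example feature labels are all illustrative assumptions.

```python
import hashlib
from itertools import product
from typing import FrozenSet, Iterable, List, Sequence, Tuple

Feature = Tuple[str, str]


def pair_features(substructures: Iterable[str], domains: Iterable[str]) -> FrozenSet[Feature]:
    """Represent a compound-protein pair as the set of all (substructure, domain) pairs."""
    return frozenset(product(substructures, domains))


def _hash(seed: int, feature: Feature) -> int:
    """Seeded 64-bit hash; each seed plays the role of one random permutation."""
    data = f"{seed}|{feature[0]}|{feature[1]}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(features: FrozenSet[Feature], num_hashes: int = 64) -> List[int]:
    """Compact fingerprint: the minimum hash value of the set under each seed."""
    if not features:
        raise ValueError("feature set must not be empty")
    return [min(_hash(seed, f) for f in features) for seed in range(num_hashes)]


def estimated_jaccard(sig_a: Sequence[int], sig_b: Sequence[int]) -> float:
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Hypothetical substructure and domain labels, for illustration only.
pair_1 = pair_features(["C=O", "c1ccccc1", "N"], ["PF00069"])
pair_2 = pair_features(["C=O", "c1ccccc1"], ["PF00069"])
similarity = estimated_jaccard(minhash_signature(pair_1), minhash_signature(pair_2))
```

The signature length stays fixed no matter how many features a pair has, which is what makes such fingerprints attractive at the scale of hundreds of millions of pairs.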
53
Ding H, Takigawa I, Mamitsuka H, Zhu S. Similarity-based machine learning methods for predicting drug–target interactions: a brief review. Brief Bioinform 2013; 15:734-47. [DOI: 10.1093/bib/bbt056] [Citation(s) in RCA: 261] [Impact Index Per Article: 23.7] [Indexed: 12/16/2022]
54
Learning a peptide-protein binding affinity predictor with kernel ridge regression. BMC Bioinformatics 2013; 14:82. [PMID: 23497081 PMCID: PMC3651388 DOI: 10.1186/1471-2105-14-82] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 07/26/2012] [Accepted: 02/21/2013] [Indexed: 02/01/2023]
Abstract
BACKGROUND: The cellular function of the vast majority of proteins is performed through physical interactions with other biomolecules, which, most of the time, are other proteins. Peptides represent templates of choice for mimicking a secondary structure in order to modulate protein-protein interactions. They are thus an interesting class of therapeutics since they also display strong activity, high selectivity, low toxicity and few drug-drug interactions. Furthermore, predicting peptides that would bind to specific MHC alleles would be of tremendous benefit for improving vaccine-based therapy and possibly generating antibodies with greater affinity. Modern computational methods have the potential to accelerate and lower the cost of drug and vaccine discovery by selecting potential compounds for testing in silico prior to biological validation. RESULTS: We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, including the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function. We provide a low-complexity dynamic programming algorithm for the exact computation of the kernel and a linear-time algorithm for its approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant predictions with good accuracy on the PepX database. For the first time, a machine learning predictor is capable of predicting the binding affinity of any peptide to any protein with reasonable accuracy. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets.
CONCLUSION: On all benchmarks, our method significantly (p-value ≤ 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. Moreover, generating reliable peptide-protein binding affinities will also improve systems biology modelling of interaction pathways. Lastly, the method should be of value to a large segment of the research community, with the potential to accelerate the discovery of peptide-based drugs and facilitate vaccine development. The proposed kernel is freely available at http://graal.ift.ulaval.ca/downloads/gs-kernel/.
55
Abstract
The identification of drug-target interactions from heterogeneous biological data is critical in drug development. In this chapter, we review recently developed in silico chemogenomic approaches for inferring unknown drug-target interactions from chemical information on drugs and genomic information on target proteins. We review several kernel-based statistical methods from two different viewpoints: binary classification and dimension reduction. In the results, we demonstrate the usefulness of the methods for predicting drug-target interactions from chemical structure data and genomic sequence data. We also discuss the characteristics of each method and offer some perspectives on future research directions.
56
Nakamura M, Hachiya T, Saito Y, Sato K, Sakakibara Y. An efficient algorithm for de novo predictions of biochemical pathways between chemical compounds. BMC Bioinformatics 2012; 13 Suppl 17:S8. [PMID: 23282285 PMCID: PMC3521390 DOI: 10.1186/1471-2105-13-s17-s8] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 01/12/2023]
Abstract
Background: Prediction of biochemical (metabolic) pathways has a wide range of applications, including the optimization of drug candidates and the elucidation of toxicity mechanisms. Recently, several methods have been developed for pathway prediction to derive a goal compound from a start compound. However, these methods have high computational costs and cannot perform comprehensive prediction of novel metabolic pathways. The aim of this study is to develop a de novo prediction method for reconstructing metabolic pathways and predicting unknown biosynthetic pathways, de novo in the sense that it does not require any initial network, such as the KEGG metabolic network, to be explored. Results: We formulated pathway prediction between a start compound and a goal compound as a shortest path search problem in terms of the number of enzyme reactions applied. We propose an efficient search method based on the A* algorithm and heuristic techniques that use a Linear Programming (LP) solution to estimate the distance to the goal. First, a chemical compound is represented by a feature vector that counts the frequencies of substructure occurrences in the structural formula. Second, an enzyme reaction is represented as an operator vector by detecting the structural changes to compounds before and after the reaction. By defining compound vectors as nodes and operator vectors as edges, prediction of the reaction pathway is reduced to a shortest path search problem in the vector space. In experiments on the DDT degradation pathway, we verify that the shortest paths predicted by our method are biologically correct pathways registered in the KEGG database. The results also demonstrate that the LP heuristics can achieve a significant reduction in computation time. Furthermore, we apply our method to a secondary metabolite pathway of plant origin and successfully find a novel biochemical pathway that cannot be predicted by the existing method. For the reconstruction of a known biochemical pathway, our method is over 40 times as fast as the existing method. Conclusions: Our method enables fast and accurate de novo pathway predictions and novel pathway detection.
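The search formulation in this abstract (compound vectors as nodes, operator vectors as edges, cost equal to the number of reactions) can be sketched with a standard A* search. The code below is a simplified illustration, not the authors' implementation: in place of the LP-based heuristic it uses a cruder admissible bound (remaining L1 distance divided by the largest change a single operator can make), and the `max_steps` cap, the non-negativity check and the toy vectors are assumptions.

```python
import heapq
from math import ceil
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Tuple[int, ...]


def shortest_pathway(start: Vector, goal: Vector, operators: Sequence[Vector],
                     max_steps: int = 10) -> Optional[List[int]]:
    """A* search; returns the indices of the operators applied, or None if none is found."""
    if not operators:
        raise ValueError("at least one operator is required")
    max_change = max(sum(abs(x) for x in op) for op in operators)
    if max_change == 0:
        raise ValueError("operators must change at least one feature")

    def heuristic(v: Vector) -> int:
        # One step shrinks the L1 distance by at most max_change, so this never overestimates.
        return ceil(sum(abs(a - b) for a, b in zip(v, goal)) / max_change)

    best_cost: Dict[Vector, int] = {start: 0}
    frontier = [(heuristic(start), 0, start, [])]  # (f = g + h, g, node, path)
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if cost > best_cost[node] or cost >= max_steps:
            continue  # stale queue entry, or the step budget is exhausted
        for index, op in enumerate(operators):
            nxt = tuple(a + b for a, b in zip(node, op))
            if min(nxt) < 0:
                continue  # substructure counts cannot be negative
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, max_steps + 1):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt, path + [index]))
    return None


# Hypothetical three-feature compounds and two reaction operators.
toy_operators = [(-1, 1, 0), (0, -1, 1)]
toy_path = shortest_pathway((2, 0, 0), (0, 0, 2), toy_operators)
```

A tighter heuristic, such as the LP relaxation described in the abstract, prunes more of the search space while still guaranteeing a shortest path as long as it never overestimates the remaining number of reactions.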
Affiliation(s)
- Masaomi Nakamura
  - Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Yokohama 223-8522, Japan
57
Yu H, Chen J, Xu X, Li Y, Zhao H, Fang Y, Li X, Zhou W, Wang W, Wang Y. A systematic prediction of multiple drug-target interactions from chemical, genomic, and pharmacological data. PLoS One 2012; 7:e37608. [PMID: 22666371 PMCID: PMC3364341 DOI: 10.1371/journal.pone.0037608] [Citation(s) in RCA: 263] [Impact Index Per Article: 21.9] [Received: 02/03/2012] [Accepted: 04/23/2012] [Indexed: 02/07/2023]
Abstract
In silico prediction of drug-target interactions from heterogeneous biological data can advance our system-level search for drug molecules and therapeutic targets, efforts that have not yet reached full fruition. In this work, we report a systematic approach that efficiently integrates chemical, genomic, and pharmacological information for drug targeting and discovery on a large scale, based on two powerful methods, Random Forest (RF) and Support Vector Machine (SVM). The performance of the derived models was evaluated and verified with internal five-fold cross-validation and four external independent validations. The optimal models show impressive prediction performance for drug-target interactions, with a concordance of 82.83%, a sensitivity of 81.33%, and a specificity of 93.62%. The consistency of the performance of the RF and SVM models demonstrates the reliability and robustness of the obtained models. In addition, the validated models were employed to systematically predict known/unknown drugs and targets involving enzymes, ion channels, GPCRs, and nuclear receptors, which can be further mapped to functional ontologies such as target-disease associations and target-target interaction networks. This approach is expected to help fill the existing gap between chemical genomics and network pharmacology and thus accelerate the drug discovery process.
Affiliation(s)
- Hua Yu
  - Bioinformatics Center, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
58
Kobayashi H, Harada H, Nakamura M, Futamura Y, Ito A, Yoshida M, Iemura SI, Shin-Ya K, Doi T, Takahashi T, Natsume T, Imoto M, Sakakibara Y. Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications. BMC Chem Biol 2012; 12:2. [PMID: 22480302 PMCID: PMC3471015 DOI: 10.1186/1472-6769-12-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/15/2011] [Accepted: 04/05/2012] [Indexed: 12/14/2022]
Abstract
Background: Identification of the target proteins of bioactive compounds is critical for elucidating their mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. Results: We applied our protocol for predicting target proteins, which combines in silico screening and experimental verification, to incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins whose expression can be confirmed in our cell system. As a result, the computational predictions achieved 40% accuracy, and we newly found 3 incednine-binding proteins. Conclusions: This study shows that our proposed protocol for predicting target proteins, combining in silico screening and experimental verification, is useful and provides new insight into a strategy for identifying target proteins of small molecules.
Affiliation(s)
- Hiroki Kobayashi
  - Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Hiroko Harada
  - Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Masaomi Nakamura
  - Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Yushi Futamura
  - Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Akihiro Ito
  - Chemical Genetics Laboratory, RIKEN Advanced Science Institute, 2-1 Hirosawa, Wako-shi, Saitama, 351-0198, Japan
- Minoru Yoshida
  - Chemical Genetics Laboratory, RIKEN Advanced Science Institute, 2-1 Hirosawa, Wako-shi, Saitama, 351-0198, Japan
- Shun-Ichiro Iemura
  - National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Kazuo Shin-Ya
  - National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Takayuki Doi
  - Graduate School of Pharmaceutical Sciences, Tohoku University, 6-3 Aza-Aoba, Aramaki, Aoba, Sendai, 980-8578, Japan
- Takashi Takahashi
  - Department of Applied Chemistry, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo, 152-8552, Japan
- Tohru Natsume
  - National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Masaya Imoto
  - Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Yasubumi Sakakibara
  - Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
59
Sakakibara Y, Hachiya T, Uchida M, Nagamine N, Sugawara Y, Yokota M, Nakamura M, Popendorf K, Komori T, Sato K. COPICAT: a software system for predicting interactions between proteins and chemical compounds. Bioinformatics 2012; 28:745-6. [PMID: 22257668 DOI: 10.1093/bioinformatics/bts031] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022]
Abstract
Since tens of millions of chemical compounds have been accumulated in public chemical databases, fast and comprehensive computational methods to predict interactions between chemical compounds and proteins are needed for virtual screening of lead compounds. Previously, we proposed a novel method for predicting protein-chemical interactions using two-layer Support Vector Machine classifiers that require only readily available biochemical data, i.e. amino acid sequences of proteins and structural formulas of chemical compounds. In this article, the method has been implemented as the COPICAT web service, with an easy-to-use front-end interface. Users can simply submit a protein-chemical interaction prediction job using a pre-trained classifier, or can even train their own classification model by uploading training data. COPICAT's fast and accurate computational prediction has enhanced lead compound discovery against a database of tens of millions of chemical compounds, implying that the search space for drug discovery is extended by >1000 times compared with currently well-used high-throughput screening methodologies. AVAILABILITY: The COPICAT server is available at http://copicat.dna.bio.keio.ac.jp. All functions, including the prediction function, are freely available via anonymous login without registration. Registered users, however, can use the system more intensively.
Affiliation(s)
- Yasubumi Sakakibara
  - Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Yokohama 223-8522, Japan
60
Yamanishi Y, Kashima H. Prediction of Compound-protein Interactions with Machine Learning Methods. Mach Learn 2012. [DOI: 10.4018/978-1-60960-818-7.ch315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
In silico prediction of compound-protein interactions from heterogeneous biological data is critical in the process of drug development. In this chapter the authors review several supervised machine learning methods to predict unknown compound-protein interactions from chemical structure and genomic sequence information simultaneously. The authors review several kernel-based algorithms from two different viewpoints: binary classification and dimension reduction. In the results, they demonstrate the usefulness of the methods on the prediction of drug-target interactions and ligand-protein interactions from chemical structure data and genomic sequence data.
61
Iacucci E, Ojeda F, De Moor B, Moreau Y. Predicting receptor-ligand pairs through kernel learning. BMC Bioinformatics 2011; 12:336. [PMID: 21834994 PMCID: PMC3199765 DOI: 10.1186/1471-2105-12-336] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/21/2011] [Accepted: 08/11/2011] [Indexed: 12/17/2022]
Abstract
Background: Regulation of cellular events is often initiated via extracellular signaling. Extracellular signaling occurs when a circulating ligand interacts with one or more membrane-bound receptors. Identification of receptor-ligand pairs is thus an important and specific form of PPI prediction. Results: Given a set of disparate data sources (expression data, domain content, and phylogenetic profile), we seek to predict new receptor-ligand pairs. We create a combined kernel classifier and assess its performance with respect to the Database of Ligand-Receptor Partners (DLRP) "golden standard" as well as the method proposed by Gertz et al. Among our findings, we discover that our predictions for the tgfβ family accurately reconstruct over 76% of the supported edges (0.76 recall and 0.67 precision) of the receptor-ligand bipartite graph defined by the DLRP "golden standard". In addition, for the tgfβ family, the combined kernel classifier improves upon the method of Gertz et al. by a factor of approximately 1.5: our method has an F-measure of 0.71, whereas that of Gertz et al. has a value of 0.48. Conclusions: The prediction of receptor-ligand pairings is a difficult and complex task. We have demonstrated that using kernel learning on multiple data sources provides a stronger alternative to the existing method for solving this task.
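The figures quoted in this abstract can be checked with the usual definition of the F-measure as the harmonic mean of precision and recall. The short sketch below assumes that balanced (F1) definition, which the abstract does not spell out. The function name is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the tgfβ family.
combined_kernel_f = f_measure(precision=0.67, recall=0.76)  # about 0.71
gertz_f = 0.48                                              # F-measure reported for Gertz et al.
relative_improvement = combined_kernel_f / gertz_f          # about 1.5
```

With a precision of 0.67 and a recall of 0.76 the F-measure is 2 × 0.67 × 0.76 / 1.43 ≈ 0.71, and 0.71 / 0.48 ≈ 1.5, consistent with the factor stated in the abstract.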
Affiliation(s)
- Ernesto Iacucci
  - SCD-ESAT, Department of Electrical Engineering, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Leuven 3001, Belgium
62
Yamanishi Y, Pauwels E, Saigo H, Stoven V. Extracting sets of chemical substructures and protein domains governing drug-target interactions. J Chem Inf Model 2011; 51:1183-94. [PMID: 21506615 DOI: 10.1021/ci100476q] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Indexed: 01/13/2023]
Abstract
The identification of rules governing molecular recognition between drug chemical substructures and protein functional sites is a challenging issue at many stages of the drug development process. In this paper we develop a novel method to extract sets of drug chemical substructures and protein domains that govern drug-target interactions on a genome-wide scale. This is made possible using sparse canonical correspondence analysis (SCCA) for analyzing drug substructure profiles and protein domain profiles simultaneously. The method does not depend on the availability of protein 3D structures. From a data set of known drug-target interactions including enzymes, ion channels, G protein-coupled receptors, and nuclear receptors, we extract a set of chemical substructures shared by drugs able to bind to a set of protein domains. These two sets of extracted chemical substructures and protein domains form components that can be further exploited in a drug discovery process. This approach successfully clusters protein domains that may be evolutionarily unrelated but that bind a common set of chemical substructures. As shown in several examples, it can also be very helpful for predicting new protein-ligand interactions and addressing the problem of ligand specificity. The proposed method constitutes a contribution to the recent field of chemogenomics, which aims to connect the chemical space with the biological space.
Affiliation(s)
- Yoshihiro Yamanishi
  - Mines ParisTech, Centre for Computational Biology, 35 rue Saint-Honore, F-77305 Fontainebleau Cedex, France
  - Institut Curie, F-75248 Paris, France
  - INSERM U900, F-75248 Paris, France
63
Niijima S, Yabuuchi H, Okuno Y. Cross-Target View to Feature Selection: Identification of Molecular Interaction Features in Ligand-Target Space. J Chem Inf Model 2010; 51:15-24. [DOI: 10.1021/ci1001394] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Affiliation(s)
- Satoshi Niijima
  - Department of Systems Bioscience for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Hiroaki Yabuuchi
  - Department of Systems Bioscience for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Yasushi Okuno
  - Department of Systems Bioscience for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
64
Yamanishi Y, Kotera M, Kanehisa M, Goto S. Drug-target interaction prediction from chemical, genomic and pharmacological data in an integrated framework. Bioinformatics 2010; 26:i246-54. [PMID: 20529913 PMCID: PMC2881361 DOI: 10.1093/bioinformatics/btq176] [Citation(s) in RCA: 288] [Impact Index Per Article: 20.6] [Indexed: 02/03/2023]
Abstract
MOTIVATION: In silico prediction of drug-target interactions from heterogeneous biological data is critical in the search for drugs and therapeutic targets for known diseases such as cancers. There is therefore a strong incentive to develop new methods capable of detecting these potential drug-target interactions efficiently. RESULTS: In this article, we investigate the relationship between the chemical space, the pharmacological space and the topology of drug-target interaction networks, and show that drug-target interactions are more correlated with pharmacological effect similarity than with chemical structure similarity. We then develop a new method to predict unknown drug-target interactions from chemical, genomic and pharmacological data on a large scale. The proposed method consists of two steps: (i) prediction of pharmacological effects from the chemical structures of given compounds and (ii) inference of unknown drug-target interactions based on pharmacological effect similarity in the framework of supervised bipartite graph inference. The originality of the proposed method lies in the prediction of potential pharmacological similarity for any drug candidate compound and in the integration of chemical, genomic and pharmacological data in a unified framework. In the results, we make predictions for four classes of important drug-target interactions involving enzymes, ion channels, GPCRs and nuclear receptors. Our comprehensively predicted drug-target interaction networks enable us to suggest many potential drug-target interactions and to increase research productivity toward genomic drug discovery. SUPPLEMENTARY INFORMATION: Datasets and all prediction results are available at http://cbio.ensmp.fr/~yyamanishi/pharmaco/. AVAILABILITY: Software is available upon request.
Affiliation(s)
- Yoshihiro Yamanishi
  - Mines ParisTech, Centre for Computational Biology, 35 rue Saint-Honore, F-77305 Fontainebleau Cedex, Institut Curie, F-75248, INSERM U900, F-75248, Paris, France
65
Li L, Zhou X, Ching WK, Wang P. Predicting enzyme targets for cancer drugs by profiling human metabolic reactions in NCI-60 cell lines. BMC Bioinformatics 2010; 11:501. [PMID: 20932284 PMCID: PMC2964682 DOI: 10.1186/1471-2105-11-501] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 01/08/2010] [Accepted: 10/08/2010] [Indexed: 02/06/2023]
Abstract
BACKGROUND: Drugs can influence the whole metabolic system by targeting enzymes that catalyze metabolic reactions. The existence of interactions between drugs and metabolic reactions suggests a potential way to discover drug targets. RESULTS: In this paper, we present a computational method to predict new targets for approved anti-cancer drugs by exploring drug-reaction interactions. We construct a Drug-Reaction Network to provide a global view of drug-reaction interactions and drug-pathway interactions. The recent reconstruction of the human metabolic network and the development of flux analysis approaches make it possible to predict each metabolic reaction's cell line-specific flux state based on cell line-specific gene expression. We first profile each reaction by its flux states in NCI-60 cancer cell lines, and then propose a kernel k-nearest neighbor model to predict related metabolic reactions and enzyme targets for approved cancer drugs. We also integrate target structure data with reaction flux profiles to predict drug targets, and the area under the curve reaches 0.92. CONCLUSIONS: Cross-validation of the methods with and without the metabolic network indicates that the former is significantly better than the latter. Further experiments show the synergism of reaction flux profiles and target structure for drug target prediction. They also imply that the metabolic network contributes significantly to drug target prediction. Finally, we apply our method to predict new reactions and possible enzyme targets for cancer drugs.
Affiliation(s)
- Limin Li
  - Institute of Information and System Science, Xi'an Jiaotong University, Xi'an, 710049, China
- Xiaobo Zhou
  - Center for Biotechnology and Informatics, The Methodist Hospital Research Institute and Department of Radiology, The Methodist Hospital, Weill Cornell Medical College, Houston, TX 77030, USA
- Wai-Ki Ching
  - Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong
- Ping Wang
  - The Methodist Hospital Research Institute and Department of Pathology, The Methodist Hospital, Weill Cornell Medical College, Houston, TX 77030, USA
66
He Z, Zhang J, Shi XH, Hu LL, Kong X, Cai YD, Chou KC. Predicting drug-target interaction networks based on functional groups and biological features. PLoS One 2010; 5:e9603. [PMID: 20300175 PMCID: PMC2836373 DOI: 10.1371/journal.pone.0009603] [Citation(s) in RCA: 189] [Impact Index Per Article: 13.5] [Received: 12/13/2009] [Accepted: 02/16/2010] [Indexed: 11/19/2022]
Abstract
BACKGROUND: The study of drug-target interaction networks is an important topic for drug development. It is both time-consuming and costly to determine compound-protein interactions or potential drug-target interactions by experiments alone. As a complement, in silico prediction methods can provide very useful information in a timely manner. METHODS/PRINCIPAL FINDINGS: To realize this, drug compounds are encoded with functional groups and proteins are encoded by biological features including biochemical and physicochemical properties. Optimal feature selection is carried out by means of the mRMR (Maximum Relevance Minimum Redundancy) method. Instead of classifying the proteins as a whole family, target proteins are divided into four groups: enzymes, ion channels, G-protein-coupled receptors and nuclear receptors. Thus, four independent predictors are established using the Nearest Neighbor algorithm as their operation engine, each predicting the interactions between drugs and one of the four protein groups. As a result, the overall success rates achieved by the four predictors in jackknife cross-validation tests are 85.48%, 80.78%, 78.49%, and 85.66%, respectively. CONCLUSION/SIGNIFICANCE: Our results indicate that the network prediction system thus established is quite promising and encouraging.
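The evaluation protocol named in this abstract, a nearest-neighbor predictor scored by a jackknife (leave-one-out) test, can be sketched as follows. This is an illustration only: the squared Euclidean distance, the function names and the toy feature vectors are assumptions, and the paper's actual encoding, mRMR feature selection and distance measure are not reproduced here.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector of a drug-target pair, 1 = interacts, 0 = not)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (an assumed choice of distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor_label(query: Sequence[float], training: List[Sample]) -> int:
    """Predict the label of the closest training sample."""
    if not training:
        raise ValueError("training set must not be empty")
    _, label = min(training, key=lambda sample: squared_distance(query, sample[0]))
    return label


def jackknife_success_rate(samples: List[Sample]) -> float:
    """Leave each sample out in turn, predict it from the rest, and return the hit rate."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbor_label(features, rest) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical two-dimensional encodings of five drug-target pairs.
toy_samples: List[Sample] = [
    ((0.0, 0.0), 0), ((0.1, 0.0), 0),
    ((1.0, 1.0), 1), ((0.9, 1.0), 1), ((0.5, 0.45), 1),
]
toy_rate = jackknife_success_rate(toy_samples)
```

Because every sample is predicted by a model that never saw it, the jackknife rate uses all of the data for testing without a separate hold-out set, at the cost of one prediction round per sample.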
Affiliation(s)
- Zhisong He
- CAS-MPG Partner Institute of Computational Biology, Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai, China
- Centre for Computational Systems Biology, Fudan University, Shanghai, China
- Jian Zhang
- Department of Ophthalmology, Yangpu District Central Hospital, Shanghai, China
- Xiao-He Shi
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS) and Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai, China
- Le-Le Hu
- Institute of System Biology, Shanghai University, Shanghai, China
- Xiangyin Kong
- Institute of Health Sciences, Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS) and Shanghai Jiao Tong University School of Medicine (SJTUSM), Shanghai, China
- State Key Laboratory of Medical Genomics, Ruijin Hospital, Shanghai Jiaotong University, Shanghai, China
- * E-mail: (XK); (YDC)
- Yu-Dong Cai
- Institute of System Biology, Shanghai University, Shanghai, China
- Gordon Life Science Institute, San Diego, California, United States of America
- * E-mail: (XK); (YDC)
- Kuo-Chen Chou
- Gordon Life Science Institute, San Diego, California, United States of America
|
67
|
Abstract
BACKGROUND One of the most recent and important developments in drug discovery is a new drug development approach of building and analyzing networks that contain relationships among drugs and targets, diseases, genes and other components. These networks and their integrations provide useful information for finding new targets as well as new drugs. OBJECTIVE This review article aims to review recent developments in various types of networks and suggest the future direction of these network studies for drug discovery. METHODS Databases and networks are integrated into a more complete network to better present the relationships among drugs, targets, genes, phenotypes and diseases. After discussing the limitations and obstacles of the recent research, we suggest several strategies to build a successful and practical drug-target network. RESULTS/CONCLUSION A useful, integrated network can be built from various databases and networks by resolving several issues, such as limited coverage and inconsistency. This integrated network can be completed by the prediction of missing links, biological network comparison and drug target identification. Possible applications are multi-target drug development, drug repurposing, estimation of drug effect on target perturbations in the whole system and extraction of the suitable purpose of the drug-target sub-network.
Affiliation(s)
- Soyoung Lee
- KAIST, Department of Bio and Brain Engineering, 335 Gwahak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea; +82 42 350 4317; +82 42 350 4310
|
68
|
Kashima H, Yamanishi Y, Kato T, Sugiyama M, Tsuda K. Simultaneous inference of biological networks of multiple species from genome-wide data and evolutionary information: a semi-supervised approach. Bioinformatics 2009; 25:2962-8. [PMID: 19689962 DOI: 10.1093/bioinformatics/btp494] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022]
Abstract
MOTIVATION The existing supervised methods for biological network inference work on each of the networks individually based only on intra-species information such as gene expression data. We believe that it will be more effective to use genomic data and cross-species evolutionary information from different species simultaneously, rather than to use the genomic data alone. RESULTS We created a new semi-supervised learning method called Link Propagation for inferring biological networks of multiple species based on genome-wide data and evolutionary information. The new method was applied to simultaneous reconstruction of three metabolic networks of Caenorhabditis elegans, Helicobacter pylori and Saccharomyces cerevisiae, based on gene expression similarities and amino acid sequence similarities. The experimental results proved that the new simultaneous network inference method consistently improves the predictive performance over the individual network inferences, and it also outperforms in accuracy and speed other established methods such as the pairwise support vector machine. AVAILABILITY The software and data are available at http://cbio.ensmp.fr/~yyamanishi/LinkPropagation/.
Affiliation(s)
- Hisashi Kashima
- IBM Research, Tokyo Research Laboratory, 1623-14 Shimo-tsuruma, Yamato, Kanagawa 242-8502, Japan.
|
69
|
Bleakley K, Yamanishi Y. Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics 2009; 25:2397-403. [PMID: 19605421 PMCID: PMC2735674 DOI: 10.1093/bioinformatics/btp433] [Citation(s) in RCA: 368] [Impact Index Per Article: 24.5] [Indexed: 02/07/2023]
Abstract
Motivation: In silico prediction of drug–target interactions from heterogeneous biological data is critical in the search for drugs for known diseases. This problem is currently being attacked from many different points of view, a strong indication of its current importance. Precisely, being able to predict new drug–target interactions with both high precision and accuracy is the holy grail, a fundamental requirement for in silico methods to be useful in a biological setting. This, however, remains extremely challenging due to, amongst other things, the rarity of known drug–target interactions. Results: We propose a novel supervised inference method to predict unknown drug–target interactions, represented as a bipartite graph. We use this method, known as bipartite local models, to first predict target proteins of a given drug, then to predict drugs targeting a given protein. This gives two independent predictions for each putative drug–target interaction, which we show can be combined to give a definitive prediction for each interaction. We demonstrate the excellent performance of the proposed method in the prediction of four classes of drug–target interaction networks involving enzymes, ion channels, G protein-coupled receptors (GPCRs) and nuclear receptors in human. This enables us to suggest a number of new potential drug–target interactions. Availability: An implementation of the proposed algorithm is available upon request from the authors. Datasets and all prediction results are available at http://cbio.ensmp.fr/~yyamanishi/bipartitelocal/. Contact: kevbleakley@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
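The two-sided prediction scheme in this abstract can be sketched roughly as follows. This is an illustrative simplification, not the authors' implementation: the local classifier is replaced by a plain similarity-weighted vote (an assumption chosen to keep the example dependency-free), the two local scores are combined with `max` (also an assumption), and the matrices and function names are hypothetical.

```python
def local_score(sim_row, labels, skip):
    """Similarity-weighted vote over every item except `skip`.

    Stand-in for a trained local classifier: known interactions vote +1,
    everything else votes -1, weighted by similarity to the query item.
    """
    total = 0.0
    for j, (sim, label) in enumerate(zip(sim_row, labels)):
        if j == skip:
            continue
        total += sim * (1.0 if label else -1.0)
    return total


def blm_score(adj, drug_sim, target_sim, d, t):
    """Score the putative interaction between drug `d` and target `t`."""
    # Drug-side local model: learn which targets drug d hits, then score target t.
    from_drug = local_score(target_sim[t], adj[d], skip=t)
    # Target-side local model: learn which drugs hit target t, then score drug d.
    column = [row[t] for row in adj]
    from_target = local_score(drug_sim[d], column, skip=d)
    # Combine the two independent predictions (assumed rule: take the larger).
    return max(from_drug, from_target)


# Hypothetical toy data: rows are drugs, columns are targets, 1 = known interaction.
adj = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
drug_sim = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
target_sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

likely = blm_score(adj, drug_sim, target_sim, 1, 1)    # drug 1 resembles drug 0, which hits target 1
unlikely = blm_score(adj, drug_sim, target_sim, 2, 0)  # drug 2 resembles neither drug hitting target 0
```

On this toy data the pair (1, 1) scores higher than (2, 0), because both the drug-side and the target-side neighbors of (1, 1) carry known interactions.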
Affiliation(s)
- Kevin Bleakley
- Mines ParisTech, Centre for Computational Biology, Fontainebleau, France.
|
70
|
Nagamine N, Shirakawa T, Minato Y, Torii K, Kobayashi H, Imoto M, Sakakibara Y. Integrating statistical predictions and experimental verifications for enhancing protein-chemical interaction predictions in virtual screening. PLoS Comput Biol 2009; 5:e1000397. [PMID: 19503826 PMCID: PMC2685987 DOI: 10.1371/journal.pcbi.1000397] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 01/26/2009] [Accepted: 04/30/2009] [Indexed: 02/06/2023]
Abstract
Predictions of interactions between target proteins and potential leads are of great benefit in the drug discovery process. We present a comprehensively applicable statistical prediction method for interactions between any proteins and chemical compounds, which requires only protein sequence data and chemical structure data and utilizes the statistical learning method of support vector machines. In order to realize reasonable comprehensive predictions which can involve many false positives, we propose two approaches for reduction of false positives: (i) efficient use of multiple statistical prediction models in the framework of two-layer SVM and (ii) reasonable design of the negative data to construct statistical prediction models. In two-layer SVM, outputs produced by the first-layer SVM models, which are constructed with different negative samples and reflect different aspects of classifications, are utilized as inputs to the second-layer SVM. In order to design negative data which produce fewer false positive predictions, we iteratively construct SVM models or classification boundaries from positive and tentative negative samples and select additional negative sample candidates according to pre-determined rules. Moreover, in order to fully utilize the advantages of statistical learning methods, we propose a strategy to effectively feedback experimental results to computational predictions with consideration of biological effects of interest. We show the usefulness of our approach in predicting potential ligands binding to human androgen receptors from more than 19 million chemical compounds and verifying these predictions by in vitro binding. Moreover, we utilize this experimental validation as feedback to enhance subsequent computational predictions, and experimentally validate these predictions again. 
This efficient procedure of the iteration of the in silico prediction and in vitro or in vivo experimental verifications with the sufficient feedback enabled us to identify novel ligand candidates which were distant from known ligands in the chemical space.
Affiliation(s)
- Nobuyoshi Nagamine
- Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Takayuki Shirakawa
- Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Yusuke Minato
- Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Kentaro Torii
- Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Hiroki Kobayashi
- Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Masaya Imoto
- Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Yasubumi Sakakibara
- Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- * E-mail:
|
71
|
Martin S, Brown WM, Faulon JL. Using product kernels to predict protein interactions. Adv Biochem Eng Biotechnol 2007; 110:215-45. [PMID: 17922100 DOI: 10.1007/10_2007_084] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/21/2022]
Abstract
There is a wide variety of experimental methods for the identification of protein interactions. This variety has in turn spurred the development of numerous different computational approaches for modeling and predicting protein interactions. These methods range from detailed structure-based methods capable of operating on only a single pair of proteins at a time to approximate statistical methods capable of making predictions on multiple proteomes simultaneously. In this chapter, we provide a brief discussion of the relative merits of different experimental and computational methods available for identifying protein interactions. Then we focus on the application of our particular (computational) method using Support Vector Machine product kernels. We describe our method in detail and discuss the application of the method for predicting protein-protein interactions, beta-strand interactions, and protein-chemical interactions.
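To make the idea of a product kernel concrete, here is a minimal sketch of how a kernel on pairs can be built from a kernel on single items. The abstract does not say which base kernel the chapter uses, so the RBF base kernel, the symmetrized variant for unordered pairs, and all names below are assumptions for illustration only.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) base kernel on feature vectors (assumed base kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def product_kernel(pair1, pair2, k_left=rbf, k_right=rbf):
    """Kernel on ordered pairs, e.g. (protein, chemical): K = k(a, c) * k(b, d)."""
    (a, b), (c, d) = pair1, pair2
    return k_left(a, c) * k_right(b, d)


def symmetric_product_kernel(pair1, pair2, k=rbf):
    """Variant for unordered pairs, e.g. protein-protein, where (a, b) == (b, a)."""
    (a, b), (c, d) = pair1, pair2
    return k(a, c) * k(b, d) + k(a, d) * k(b, c)


# Hypothetical feature vectors for two pairs of interacting entities.
pair_a = ((0.0, 1.0), (1.0, 0.0))
pair_b = ((0.5, 1.0), (1.0, 0.5))

same = product_kernel(pair_a, pair_a)
cross = product_kernel(pair_a, pair_b)
```

A matrix of such pair-kernel values can be passed to any support vector machine implementation that accepts a precomputed kernel, so the classifier works directly on pairs without needing an explicit feature vector for each pair.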
Affiliation(s)
- Shawn Martin
- Computational Biology, Sandia National Laboratories, PO Box 5800, Albuquerque, NM 87185-1316, USA.
|