51
Meslamani J, Rognan D. Enhancing the Accuracy of Chemogenomic Models with a Three-Dimensional Binding Site Kernel. J Chem Inf Model 2011; 51:1593-603. [DOI: 10.1021/ci200166t] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Affiliation(s)
- Jamel Meslamani
- Structural Chemogenomics, Laboratory of Therapeutical Innovation, UMR 7200 CNRS, University of Strasbourg, F-67400 Illkirch, France
- Didier Rognan
- Structural Chemogenomics, Laboratory of Therapeutical Innovation, UMR 7200 CNRS, University of Strasbourg, F-67400 Illkirch, France
52
Koutsoukas A, Simms B, Kirchmair J, Bond PJ, Whitmore AV, Zimmer S, Young MP, Jenkins JL, Glick M, Glen RC, Bender A. From in silico target prediction to multi-target drug design: current databases, methods and applications. J Proteomics 2011; 74:2554-74. [PMID: 21621023 DOI: 10.1016/j.jprot.2011.05.011] [Citation(s) in RCA: 186] [Impact Index Per Article: 14.3] [Received: 01/21/2011] [Revised: 04/10/2011] [Accepted: 05/06/2011] [Indexed: 01/31/2023]
Abstract
Given the tremendous growth of bioactivity databases, the use of computational tools to predict protein targets of small molecules has been gaining importance in recent years. Applications span a wide range, from the 'designed polypharmacology' of compounds to mode-of-action analysis. In this review, we firstly survey databases that can be used for ligand-based target prediction and which have grown tremendously in size in the past. We furthermore outline methods for target prediction that exist, both based on the knowledge of bioactivities from the ligand side and methods that can be applied in situations when a protein structure is known. Applications of successful in silico target identification attempts are discussed in detail, which were based partly or in whole on computational target predictions in the first instance. This includes the authors' own experience using target prediction tools, in this case considering phenotypic antibacterial screens and the analysis of high-throughput screening data. Finally, we will conclude with the prospective application of databases to not only predict, retrospectively, the protein targets of a small molecule, but also how to design ligands with desired polypharmacology in a prospective manner.
Affiliation(s)
- Alexios Koutsoukas
- Unilever Centre for Molecular Sciences Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
53
Yamanishi Y, Pauwels E, Saigo H, Stoven V. Extracting sets of chemical substructures and protein domains governing drug-target interactions. J Chem Inf Model 2011; 51:1183-94. [PMID: 21506615 DOI: 10.1021/ci100476q] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Indexed: 01/13/2023]
Abstract
The identification of rules governing molecular recognition between drug chemical substructures and protein functional sites is a challenging issue at many stages of the drug development process. In this paper we develop a novel method to extract sets of drug chemical substructures and protein domains that govern drug-target interactions on a genome-wide scale. This is made possible using sparse canonical correspondence analysis (SCCA) for analyzing drug substructure profiles and protein domain profiles simultaneously. The method does not depend on the availability of protein 3D structures. From a data set of known drug-target interactions including enzymes, ion channels, G protein-coupled receptors, and nuclear receptors, we extract a set of chemical substructures shared by drugs able to bind to a set of protein domains. These two sets of extracted chemical substructures and protein domains form components that can be further exploited in a drug discovery process. This approach successfully clusters protein domains that may be evolutionarily unrelated but that bind a common set of chemical substructures. As shown in several examples, it can also be very helpful for predicting new protein-ligand interactions and addressing the problem of ligand specificity. The proposed method constitutes a contribution to the recent field of chemogenomics that aims to connect the chemical space with the biological space.
Affiliation(s)
- Yoshihiro Yamanishi
- Mines ParisTech, Centre for Computational Biology, 35 rue Saint-Honoré, F-77305 Fontainebleau Cedex, France, Institut Curie, F-75248, Paris, France, and INSERM U900, F-75248 Paris, France
54
Shim JY. Understanding functional residues of the cannabinoid CB1. Curr Top Med Chem 2011; 10:779-98. [PMID: 20370713 DOI: 10.2174/156802610791164210] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 09/21/2009] [Accepted: 10/27/2009] [Indexed: 02/07/2023]
Abstract
The brain cannabinoid (CB(1)) receptor that mediates numerous physiological processes in response to marijuana and other psychoactive compounds is a G protein-coupled receptor (GPCR) and shares common structural features with many rhodopsin class GPCRs. For the rational development of therapeutic agents targeting the CB(1) receptor, understanding of the ligand-specific CB(1) receptor interactions responsible for unique G protein signals is crucial. For more than a decade, a combination of mutagenesis and computational modeling approaches has been successfully employed to study the ligand-specific CB(1) receptor interactions. In this review, after a brief discussion of recent advances in understanding some structural and functional features of GPCRs commonly applicable to the CB(1) receptor, the CB(1) receptor functional residues reported from mutational studies are divided into three types, ligand binding (B), receptor stabilization (S) and receptor activation (A) residues, to delineate the nature of the binding pockets of anandamide, CP55940, WIN55212-2 and SR141716A and to describe the molecular events of ligand-specific CB(1) receptor activation from ligand binding to G protein signaling. Taking these CB(1) receptor functional residues, some of which are unique to the CB(1) receptor, together with the biophysical knowledge accumulated for the GPCR active state, it is possible to propose the early stages of the CB(1) receptor activation process, which not only provide some insight into the molecular mechanisms of receptor activation but are also applicable to identifying new therapeutic agents through validated structure-based approaches, such as virtual high-throughput screening (HTS) and the fragment-based approach (FBA).
Affiliation(s)
- Joong-Youn Shim
- J.L. Chambers Biomedical/Biotechnology Research Institute, North Carolina Central University, 700 George Street, Durham, NC 27707, USA.
55
Jacoby E. Computational chemogenomics. Wiley Interdiscip Rev Comput Mol Sci 2011. [DOI: 10.1002/wcms.11] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 02/05/2023]
Affiliation(s)
- Edgar Jacoby
- Novartis Institutes for BioMedical Research, Center for Proteomic Chemistry, Forum 1, Novartis Campus, Basel, Switzerland
56
Briansó F, Carrascosa MC, Oprea TI, Mestres J. Cross-pharmacology analysis of G protein-coupled receptors. Curr Top Med Chem 2011; 11:1956-63. [PMID: 21851335 PMCID: PMC3717414 DOI: 10.2174/156802611796391285] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 05/12/2011] [Accepted: 06/24/2011] [Indexed: 11/22/2022]
Abstract
The degree of applicability of chemogenomic approaches to protein families depends on the accuracy and completeness of pharmacological data and the corresponding level of pharmacological similarity observed among their protein members. The recent public domain availability of pharmacological data for thousands of small molecules on 204 G protein-coupled receptors (GPCRs) provides a firm basis for an in-depth cross-pharmacology analysis of this superfamily. The number of protein targets included in the cross-pharmacology profile of the different GPCRs changes significantly upon varying the ligand similarity and binding affinity criteria. However, with the exception of muscarinic receptors, aminergic GPCRs distinguish themselves from the rest of the members in the family by their remarkably high levels of pharmacological similarity among them. Clusters of non-GPCR targets related by cross-pharmacology with particular GPCRs are identified and the implications for unwanted side-effects, as well as for repurposing opportunities, discussed.
Affiliation(s)
- Ferran Briansó
- Chemogenomics Laboratory, Research Unit on Biomedical Informatics, Institut Municipal d'Investigació Mèdica and Universitat Pompeu Fabra, Parc de Recerca Biomèdica (PRBB), Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Maria C. Carrascosa
- Chemogenomics Laboratory, Research Unit on Biomedical Informatics, Institut Municipal d'Investigació Mèdica and Universitat Pompeu Fabra, Parc de Recerca Biomèdica (PRBB), Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Tudor I. Oprea
- Division of Biocomputing, Department of Biochemistry & Molecular Biology and UNM Center for Molecular Discovery, University of New Mexico School of Medicine, MSC11 6145, Albuquerque NM 87131, USA
- Jordi Mestres
- Chemogenomics Laboratory, Research Unit on Biomedical Informatics, Institut Municipal d'Investigació Mèdica and Universitat Pompeu Fabra, Parc de Recerca Biomèdica (PRBB), Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
57
van Westen GJP, Wegner JK, IJzerman AP, van Vlijmen HWT, Bender A. Proteochemometric modeling as a tool to design selective compounds and for extrapolating to novel targets. MedChemComm 2011. [DOI: 10.1039/c0md00165a] [Citation(s) in RCA: 123] [Impact Index Per Article: 9.5] [Indexed: 12/19/2022]
Abstract
Proteochemometric modeling is founded on the principles of QSAR but is able to benefit from additional information in model training due to the inclusion of target information.
Affiliation(s)
- Gerard J. P. van Westen
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden, The Netherlands
- Adriaan P. IJzerman
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden, The Netherlands
- Herman W. T. van Vlijmen
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden, The Netherlands; Tibotec BVBA
- A. Bender
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden, The Netherlands; Unilever Centre for Molecular Science Informatics
58
Wang X, Huan J, Smalter A, Lushington GH. Application of kernel functions for accurate similarity search in large chemical databases. BMC Bioinformatics 2010; 11 Suppl 3:S8. [PMID: 20438655 PMCID: PMC2863067 DOI: 10.1186/1471-2105-11-s3-s8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screening, among others. It is widely believed that structure-based methods provide an efficient way to do the query. Recently, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models, graph kernel functions cannot be applied to large chemical compound databases due to their high computational complexity and the difficulties in indexing similarity search for large databases. RESULTS To bridge graph kernel functions and similarity search in chemical databases, we applied a novel kernel-based similarity measurement, developed in our team, to measure the similarity of graph-represented chemicals. In our method, we utilize a hash table to support new graph kernel function definition, efficient storage and fast search. We have applied our method, named G-hash, to large chemical databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Moreover, the similarity measurement and the index structure are scalable to large chemical databases, with smaller indexing size and faster query processing time as compared to state-of-the-art indexing methods such as Daylight fingerprints, C-tree and GraphGrep. CONCLUSIONS Efficient similarity query processing for large chemical databases is challenging since we need to balance running time efficiency and similarity search accuracy. Our previous similarity search method, G-hash, provides a new way to perform similarity search in chemical databases. Experimental study validates the utility of G-hash in chemical databases.
Affiliation(s)
- Xiaohong Wang
- School of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, 66045, USA
- Jun Huan
- School of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, 66045, USA
- Aaron Smalter
- School of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, 66045, USA
- Gerald H Lushington
- Molecular Graphics and Modeling Laboratory, University of Kansas, Lawrence, Kansas, 66045, USA
59
Franco R, Canela EI, Casado V, Ferre S. Platforms for the identification of GPCR targets, and of orthosteric and allosteric modulators. Expert Opin Drug Discov 2010; 5:391-403. [DOI: 10.1517/17460441003653163] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/20/2023]
60
Rognan D. Structure-Based Approaches to Target Fishing and Ligand Profiling. Mol Inform 2010; 29:176-87. [PMID: 27462761 DOI: 10.1002/minf.200900081] [Citation(s) in RCA: 99] [Impact Index Per Article: 7.1] [Received: 12/10/2009] [Accepted: 02/03/2010] [Indexed: 11/11/2022]
Abstract
Chemogenomics is an emerging interdisciplinary field aiming at identifying all possible ligands of all possible targets. If one groups targets in columns and ligands in rows, chemogenomic approaches to drug discovery just fill the interaction matrix. Since experimental data do not suffice, several computational methods are currently actively developed to supplement time-consuming and costly experiments. They are either designed to fill rows and thus profile a ligand towards a heterogeneous set of targets (target profiling) or to fill columns and thus identify novel ligands for an existing target (standard virtual screening). At the interface of both strategies are now true chemogenomic computational methods filling well defined areas in the matrix. The present review will focus on (protein) structure-based approaches and illustrates major advances in this novel exciting field which is supposed to massively impact rational drug design in the next decade.
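The interaction matrix described in this abstract can be pictured with a small sketch. Everything below is illustrative: the ligand and target names, the toy values, and the helper functions are hypothetical and are not taken from the review. The sketch only shows the bookkeeping (ligands in rows, targets in columns), with reading a row corresponding to target profiling and reading a column corresponding to standard virtual screening.

```python
# Hypothetical toy data: rows are ligands, columns are targets.
# 1 = known binder, 0 = known non-binder, None = untested cell.
ligands = ["lig_A", "lig_B", "lig_C"]
targets = ["tgt_1", "tgt_2"]
matrix = [
    [1, None],   # lig_A
    [None, 0],   # lig_B
    [1, 1],      # lig_C
]

def ligand_profile(name):
    """Return one row: a single ligand against every target (target profiling)."""
    return dict(zip(targets, matrix[ligands.index(name)]))

def target_screen(name):
    """Return one column: every ligand against a single target (virtual screening)."""
    col = targets.index(name)
    return {lig: row[col] for lig, row in zip(ligands, matrix)}

def untested_cells():
    """List the (ligand, target) pairs that have no experimental value yet."""
    return [
        (lig, tgt)
        for lig, row in zip(ligands, matrix)
        for tgt, value in zip(targets, row)
        if value is None
    ]
```

In this picture, a computational method is anything that proposes values for the pairs returned by `untested_cells()`; how those values are predicted is outside the scope of the sketch.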
Affiliation(s)
- Didier Rognan
- Structural Chemogenomics, UMR 7200 CNRS-UdS, 74 route du Rhin, F-67400 Illkirch; phone: +33.3.68854235; fax: +33.3.68854310.
62
Smalter A, Huan J, Lushington G. Feature Selection in the Tensor Product Feature Space. Proc IEEE Int Conf Data Min 2009:1004-1009. [PMID: 24632658 DOI: 10.1109/icdm.2009.101] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/07/2022]
Abstract
Classifying objects that are sampled jointly from two or more domains has many applications. The tensor product feature space is useful for modeling interactions between feature sets in different domains but feature selection in the tensor product feature space is challenging. Conventional feature selection methods ignore the structure of the feature space and may not provide the optimal results. In this paper we propose methods for selecting features in the original feature spaces of different domains. We obtained sparsity through two approaches, one using integer quadratic programming and another using L1-norm regularization. Experimental studies on biological data sets validate our approach.
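As a rough illustration of the tensor product feature space mentioned in this abstract, the sketch below builds the joint features of a pair of objects as the flattened outer product of their individual feature vectors. The vectors and function names are hypothetical, and the code does not reproduce the authors' feature selection methods. It only shows why the joint space grows multiplicatively, and that an inner product in the joint space factorizes into the product of the per-domain inner products.

```python
def tensor_product(x, y):
    """Flattened outer product: one joint feature per (x_i, y_j) pair."""
    return [xi * yj for xi in x for yj in y]

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical feature vectors from two domains (e.g. a compound and a protein).
compound_1, protein_1 = [1.0, 0.0, 2.0], [0.5, 3.0]
compound_2, protein_2 = [2.0, 1.0, 0.0], [1.0, 1.0]

pair_1 = tensor_product(compound_1, protein_1)  # 3 * 2 = 6 joint features
pair_2 = tensor_product(compound_2, protein_2)

# The joint inner product equals the product of the per-domain inner products.
joint = dot(pair_1, pair_2)
factorized = dot(compound_1, compound_2) * dot(protein_1, protein_2)
```

Because the joint dimension is the product of the two original dimensions, selecting features in the original spaces, as the abstract proposes, involves far fewer candidates than selecting among all joint features.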
Affiliation(s)
- Aaron Smalter
- Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, United States
- Jun Huan
- Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, United States
- Gerald Lushington
- Molecular Graphics and Modeling Laboratory, University of Kansas, Lawrence, Kansas, United States
63
Ning X, Rangwala H, Karypis G. Multi-Assay-Based Structure-Activity Relationship Models: Improving Structure-Activity Relationship Models by Incorporating Activity Information from Related Targets. J Chem Inf Model 2009; 49:2444-56. [DOI: 10.1021/ci900182q] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Indexed: 11/30/2022]
Affiliation(s)
- Xia Ning
- Department of Computer Science and Computer Engineering, University of Minnesota, 4-192 EE/CS Building, 200 Union Street SE, Minneapolis, Minnesota 55455 and Department of Computer Science, George Mason University, 4400 University Drive MSN 4A5, Fairfax, Virginia 22030
- Huzefa Rangwala
- Department of Computer Science and Computer Engineering, University of Minnesota, 4-192 EE/CS Building, 200 Union Street SE, Minneapolis, Minnesota 55455 and Department of Computer Science, George Mason University, 4400 University Drive MSN 4A5, Fairfax, Virginia 22030
- George Karypis
- Department of Computer Science and Computer Engineering, University of Minnesota, 4-192 EE/CS Building, 200 Union Street SE, Minneapolis, Minnesota 55455 and Department of Computer Science, George Mason University, 4400 University Drive MSN 4A5, Fairfax, Virginia 22030
64
Weill N, Rognan D. Development and Validation of a Novel Protein-Ligand Fingerprint To Mine Chemogenomic Space: Application to G Protein-Coupled Receptors and Their Ligands. J Chem Inf Model 2009; 49:1049-62. [DOI: 10.1021/ci800447g] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Indexed: 11/28/2022]
Affiliation(s)
- Nathanael Weill
- Structural Chemogenomics Group, Laboratory of Therapeutic Innovation, UMR 7200 CNRS-UdS (Université de Strasbourg), 74 route du Rhin, B.P.24, F-67400 Illkirch, France
- Didier Rognan
- Structural Chemogenomics Group, Laboratory of Therapeutic Innovation, UMR 7200 CNRS-UdS (Université de Strasbourg), 74 route du Rhin, B.P.24, F-67400 Illkirch, France
65
Vert JP, Jacob L. Machine learning for in silico virtual screening and chemical genomics: new strategies. Comb Chem High Throughput Screen 2008; 11:677-85. [PMID: 18795887 PMCID: PMC2748698 DOI: 10.2174/138620708785739899] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 01/31/2023]
Abstract
Support vector machines and kernel methods belong to the same class of machine learning algorithms that has recently become prominent in both computational biology and chemistry, although both fields have largely ignored each other. These methods are based on a sound mathematical and computationally efficient framework that implicitly embeds the data of interest, respectively proteins and small molecules, in high-dimensional feature spaces where various classification or regression tasks can be performed with linear algorithms. In this review, we present the main ideas underlying these approaches, survey how both the “biological” and the “chemical” spaces have been separately constructed using the same mathematical framework and tricks, and suggest different avenues to unify both spaces for the purpose of in silico chemogenomics.
Affiliation(s)
- Jean-Philippe Vert
- Centre for Computational Biology, Mines ParisTech, 35 rue Saint-Honoré, France.
66
Jacob L, Vert JP. Protein-ligand interaction prediction: an improved chemogenomics approach. Bioinformatics 2008; 24:2149-56. [PMID: 18676415 PMCID: PMC2553441 DOI: 10.1093/bioinformatics/btn409] [Citation(s) in RCA: 215] [Impact Index Per Article: 13.4] [Received: 04/04/2008] [Revised: 06/17/2008] [Accepted: 07/30/2008] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Predicting interactions between small molecules and proteins is a crucial step to decipher many biological processes, and plays a critical role in drug discovery. When no detailed 3D structure of the protein target is available, ligand-based virtual screening allows the construction of predictive models by learning to discriminate known ligands from non-ligands. However, the accuracy of ligand-based models quickly degrades when the number of known ligands decreases, and in particular the approach is not applicable for orphan receptors with no known ligand. RESULTS We propose a systematic method to predict ligand-protein interactions, even for targets with no known 3D structure and few or no known ligands. Following the recent chemogenomics trend, we adopt a cross-target view and attempt to screen the chemical space against whole families of proteins simultaneously. The lack of known ligand for a given target can then be compensated by the availability of known ligands for similar targets. We test this strategy on three important classes of drug targets, namely enzymes, G-protein-coupled receptors (GPCR) and ion channels, and report dramatic improvements in prediction accuracy over classical ligand-based virtual screening, in particular for targets with few or no known ligands. AVAILABILITY All data and algorithms are available as Supplementary Material.
Affiliation(s)
- Laurent Jacob
- Mines ParisTech, Centre for Computational Biology, 35 rue Saint Honoré, F-77305 Fontainebleau, Institut Curie and INSERM, U900, F-75248, Paris, France.