1
Zhou H, Cao H, Skolnick J. FRAGSITE: A Fragment-Based Approach for Virtual Ligand Screening. J Chem Inf Model 2021; 61:2074-2089. [PMID: 33724022 DOI: 10.1021/acs.jcim.0c01160] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 01/09/2023]
Abstract
To reduce time and cost, virtual ligand screening (VLS) often precedes experimental ligand screening in modern drug discovery. Traditionally, high-resolution structure-based docking approaches rely on experimental structures, while ligand-based approaches need known binders to the target protein and only explore their nearby chemical space. In contrast, our structure-based FINDSITEcomb2.0 approach takes advantage of predicted, low-resolution structures and information from ligands that bind distantly related proteins whose binding sites are similar to the target protein. Using a boosted tree regression machine learning framework, we significantly improved FINDSITEcomb2.0 by integrating ligand fragment scores as encoded by molecular fingerprints with the global ligand similarity scores of FINDSITEcomb2.0. The new approach, FRAGSITE, exploits our observation that ligand fragments, e.g., rings, tend to interact with stereochemically conserved protein subpockets that also occur in evolutionarily unrelated proteins. FRAGSITE was benchmarked on the 102 protein DUD-E set, where any template protein whose sequence identity to the target exceeded 30% was excluded. Within the top 100 ranked molecules, FRAGSITE improves VLS precision and recall by 14.3 and 18.5%, respectively, relative to FINDSITEcomb2.0. Moreover, the mean top 1% enrichment factor increases from 25.2 to 30.2. On average, both outperform state-of-the-art deep learning-based methods such as AtomNet. On the more challenging unbiased set LIT-PCBA, FRAGSITE also shows better performance than ligand similarity-based and docking approaches such as two-dimensional ECFP4 and Surflex-Dock v.3066. On a subset of 23 targets from DEKOIS 2.0, FRAGSITE shows much better performance than the boosted tree regression-based vScreenML scoring function. Experimental testing of FRAGSITE's predictions shows that it has more hits and covers a more diverse region of chemical space than FINDSITEcomb2.0.
For the two proteins that were experimentally tested, DHFR, a well-studied protein that catalyzes the conversion of dihydrofolate to tetrahydrofolate, and the kinase ACVR1, FRAGSITE identified new small-molecule nanomolar binders. Interestingly, one new binder of DHFR is a kinase inhibitor predicted to bind in a new subpocket. For ACVR1, FRAGSITE identified new molecules that have diverse scaffolds and estimated nanomolar to micromolar affinities. Thus, FRAGSITE shows significant improvement over prior state-of-the-art ligand virtual screening approaches. A web server is freely available for academic users at http://sites.gatech.edu/cssb/FRAGSITE.
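The abstract above reports precision and recall within the top 100 ranked molecules and a top 1% enrichment factor. As a rough illustration of how such numbers are commonly computed, here is a minimal Python sketch. It uses the usual textbook definitions, which may differ in detail from the paper's exact protocol; the function names and the toy library are hypothetical and not taken from FRAGSITE.

```python
from typing import List, Sequence, Tuple


def rank_by_score(scores: Sequence[float]) -> List[int]:
    """Return compound indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def precision_recall_at_n(
    scores: Sequence[float], labels: Sequence[int], n: int
) -> Tuple[float, float]:
    """Precision and recall of actives (label 1) among the top-n ranked compounds."""
    top = rank_by_score(scores)[:n]
    hits = sum(labels[i] for i in top)
    actives = sum(labels)
    precision = hits / len(top) if top else 0.0
    recall = hits / actives if actives else 0.0
    return precision, recall


def enrichment_factor(
    scores: Sequence[float], labels: Sequence[int], fraction: float = 0.01
) -> float:
    """Active rate in the top fraction divided by the active rate in the whole library."""
    n_total = len(scores)
    n_top = max(1, round(n_total * fraction))
    top = rank_by_score(scores)[:n_top]
    hits = sum(labels[i] for i in top)
    actives = sum(labels)
    if actives == 0:
        raise ValueError("library contains no actives")
    return (hits * n_total) / (n_top * actives)


# Hypothetical toy library: 1000 compounds, 10 actives, already sorted by score.
scores = [float(s) for s in range(1000, 0, -1)]
labels = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 985

ef_1pct = enrichment_factor(scores, labels, fraction=0.01)
precision_100, recall_100 = precision_recall_at_n(scores, labels, n=100)
print(ef_1pct, precision_100, recall_100)
```

Under this definition a random ranking gives an enrichment factor of about 1, so the reported values of 25.2 and 30.2 correspond to a top 1% that is roughly 25 to 30 times richer in actives than the library as a whole.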
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332-2000, United States
- Hongnan Cao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332-2000, United States
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332-2000, United States
2
Zhou H, Cao H, Skolnick J. FINDSITEcomb2.0: A New Approach for Virtual Ligand Screening of Proteins and Virtual Target Screening of Biomolecules. J Chem Inf Model 2018; 58:2343-2354. [PMID: 30278128 PMCID: PMC6437778 DOI: 10.1021/acs.jcim.8b00309] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 12/20/2022]
Abstract
Computational approaches for predicting protein-ligand interactions can facilitate drug lead discovery and drug target determination. We have previously developed a threading/structural-based approach, FINDSITEcomb, for the virtual ligand screening of proteins that has been extensively experimentally validated. Even when low resolution predicted protein structures are employed, FINDSITEcomb has the advantage of being faster and more accurate than traditional high-resolution structure-based docking methods. It also overcomes the limitations of traditional QSAR methods that require a known set of seed ligands that bind to the given protein target. Here, we further improve FINDSITEcomb by enhancing its template ligand selection from the PDB/DrugBank/ChEMBL libraries of known protein-ligand interactions by (1) parsing the template proteins and their corresponding binding ligands in the DrugBank and ChEMBL libraries into domains so that the ligands with falsely matched domains to the targets will not be selected as template ligands; (2) applying various thresholds to filter out falsely matched template structures in the structure comparison process and thus their corresponding ligands for template ligand selection. With a sequence identity cutoff of 30% of target to templates and modeled target structures, FINDSITEcomb2.0 is shown to significantly improve upon FINDSITEcomb on the DUD-E benchmark set by increasing the 1% enrichment factor from 16.7 to 22.1, with a p-value of 4.3 × 10⁻³ by the Student t-test. With an 80% sequence identity cutoff of target to templates for the DUD-E set and modeled target structures, FINDSITEcomb2.0, having a 1% ROC enrichment factor of 52.39, also outperforms state-of-the-art methods that employ machine learning such as a deep convolutional neural network, CNN, with an enrichment of 29.65. Thus, FINDSITEcomb2.0 represents a significant improvement in the state-of-the-art.
The FINDSITEcomb2.0 web service is freely available for academic users at http://pwp.gatech.edu/cssb/FINDSITE-COMB-2 .
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, GA 30332-2000
- Hongnan Cao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, GA 30332-2000
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, GA 30332-2000
3
Assessing the similarity of ligand binding conformations with the Contact Mode Score. Comput Biol Chem 2016; 64:403-413. [PMID: 27620381 DOI: 10.1016/j.compbiolchem.2016.08.007] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 05/20/2016] [Revised: 08/17/2016] [Accepted: 08/25/2016] [Indexed: 11/22/2022]
Abstract
Structural and computational biologists often need to measure the similarity of ligand binding conformations. The commonly used root-mean-square deviation (RMSD) is not only ligand-size dependent, but also may fail to capture biologically meaningful binding features. To address these issues, we developed the Contact Mode Score (CMS), a new metric to assess the conformational similarity based on intermolecular protein-ligand contacts. The CMS is less dependent on the ligand size and has the ability to include flexible receptors. In order to effectively compare binding poses of non-identical ligands bound to different proteins, we further developed the eXtended Contact Mode Score (XCMS). We believe that CMS and XCMS provide a meaningful assessment of the similarity of ligand binding conformations. CMS and XCMS are freely available at http://brylinski.cct.lsu.edu/content/contact-mode-score and http://geaux-computational-bio.github.io/contact-mode-score/.
4
Skolnick J, Gao M, Roy A, Srinivasan B, Zhou H. Implications of the small number of distinct ligand binding pockets in proteins for drug discovery, evolution and biochemical function. Bioorg Med Chem Lett 2015; 25:1163-70. [PMID: 25690787 DOI: 10.1016/j.bmcl.2015.01.059] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 11/14/2014] [Revised: 01/23/2015] [Accepted: 01/24/2015] [Indexed: 01/05/2023]
Abstract
Coincidence of the properties of ligand binding pockets in native proteins with those in proteins generated by computer simulations without selection for function shows that pockets are a generic protein feature and the number of distinct pockets is small. Similar pockets occur in unrelated protein structures, an observation successfully employed in pocket-based virtual ligand screening. The small number of pockets suggests that off-target interactions among diverse proteins are inherent; kinases, proteases and phosphatases show this prototypical behavior. The ability to repurpose FDA approved drugs is general, and minor side effects cannot be avoided. Finally, the implications to drug discovery are explored.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
- Mu Gao
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
- Ambrish Roy
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
- Bharath Srinivasan
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
- Hongyi Zhou
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
5
Vlachakis D, Champeris Tsaniras S, Tsiliki G, Megalooikonomou V, Kossida S. 3D structural analysis of proteins using electrostatic surfaces based on image segmentation. Journal of Molecular Biochemistry 2014; 3:27-33. [PMID: 27525250 PMCID: PMC4981338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Herein, we present a novel strategy to analyse and characterize proteins using protein molecular electrostatic surfaces. Our approach starts by calculating a series of distinct molecular surfaces for each protein that are subsequently flattened out, thus reducing 3D information noise. RGB images are appropriately scaled by means of standard image processing techniques whilst retaining the weight information of each protein's molecular electrostatic surface. Then homogeneous areas in the protein surface are estimated based on unsupervised clustering of the 3D images, while performing similarity searches. This is a computationally fast approach, which efficiently highlights interesting structural areas among a group of proteins. Multiple protein electrostatic surfaces can be combined together and in conjunction with their processed images, they can provide the starting material for protein structural similarity and molecular docking experiments.
Affiliation(s)
- Dimitrios Vlachakis
- Biomedical Research Foundation of the Academy of Athens, 11527, Athens, Greece
- Bionetwork ltd. 15234, Chalandri, Athens, Greece
- Computer Engineering and Informatics Department, School of Engineering, University of Patras, 26500 Patras, Greece
- Spyridon Champeris Tsaniras
- Bionetwork ltd. 15234, Chalandri, Athens, Greece
- Department of Physiology, Medical School, University of Patras, 26500 Patras, Greece
- Georgia Tsiliki
- Biomedical Research Foundation of the Academy of Athens, 11527, Athens, Greece
- Vasileios Megalooikonomou
- Computer Engineering and Informatics Department, School of Engineering, University of Patras, 26500 Patras, Greece
- Sophia Kossida
- Biomedical Research Foundation of the Academy of Athens, 11527, Athens, Greece
6
Anishchenko I, Kundrotas PJ, Tuzikov AV, Vakser IA. Protein models: the Grand Challenge of protein docking. Proteins 2013; 82:278-87. [PMID: 23934791 DOI: 10.1002/prot.24385] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 04/14/2013] [Revised: 07/16/2013] [Accepted: 07/26/2013] [Indexed: 12/28/2022]
Abstract
Characterization of life processes at the molecular level requires structural details of protein-protein interactions (PPIs). The number of experimentally determined protein structures accounts only for a fraction of known proteins. This gap has to be bridged by modeling, typically using experimentally determined structures as templates to model related proteins. The fraction of experimentally determined PPI structures is even smaller than that for the individual proteins, due to a larger number of interactions than the number of individual proteins, and a greater difficulty of crystallizing protein-protein complexes. The approaches to structural modeling of PPI (docking) often have to rely on modeled structures of the interactors, especially in the case of large PPI networks. Structures of modeled proteins are typically less accurate than the ones determined by X-ray crystallography or nuclear magnetic resonance. Thus the utility of approaches to dock these structures should be assessed by thorough benchmarking, specifically designed for protein models. To be credible, such benchmarking has to be based on carefully curated sets of structures with levels of distortion typical for modeled proteins. This article presents such a suite of models built for the benchmark set of the X-ray structures from the Dockground resource (http://dockground.bioinformatics.ku.edu) by a combination of homology modeling and Nudged Elastic Band method. For each monomer, six models were generated with predefined C(α) root mean square deviation from the native structure (1, 2, …, 6 Å). The sets and the accompanying data provide a comprehensive resource for the development of docking methodology for modeled proteins.
Affiliation(s)
- Ivan Anishchenko
- Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047; United Institute of Informatics Problems, National Academy of Sciences, 220012, Minsk, Belarus
7
Skolnick J, Zhou H, Gao M. Are predicted protein structures of any value for binding site prediction and virtual ligand screening? Curr Opin Struct Biol 2013; 23:191-7. [PMID: 23415854 DOI: 10.1016/j.sbi.2013.01.009] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 12/19/2012] [Revised: 01/04/2013] [Accepted: 01/23/2013] [Indexed: 01/03/2023]
Abstract
The recently developed field of ligand homology modeling (LHM) that extends the ideas of protein homology modeling to the prediction of ligand binding sites and for use in virtual ligand screening has emerged as a powerful new approach. Unlike traditional docking methodologies, LHM can be applied to low-to-moderate resolution predicted as well as experimental structures with little if any diminution in performance; thereby enabling ≈ 75% of an average proteome to have potentially significant virtual screening predictions. In large scale benchmarking, LHM is able to predict off-target ligand binding. Thus, despite the widespread belief to the contrary, low-to-moderate resolution predicted structures have considerable utility for biochemical function prediction.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA
8
Low-resolution structural modeling of protein interactome. Curr Opin Struct Biol 2013; 23:198-205. [PMID: 23294579 DOI: 10.1016/j.sbi.2012.12.003] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 11/07/2012] [Accepted: 12/03/2012] [Indexed: 11/23/2022]
Abstract
Structural characterization of protein-protein interactions across the broad spectrum of scales is key to our understanding of life at the molecular level. A low-resolution approach to protein interactions is needed for modeling large interaction networks, given the significant level of uncertainty in large biomolecular systems and the high-throughput nature of the task. Since only a fraction of protein structures in the interactome are determined experimentally, protein docking approaches are increasingly focusing on modeled proteins. The current rapid advancement of template-based modeling of protein-protein complexes is following a long-standing trend in structure prediction of individual proteins. Protein-protein templates are already available for almost all interactions of structurally characterized proteins, and about one third of such templates are likely correct.
9
Zhou H, Skolnick J. FINDSITE(comb): a threading/structure-based, proteomic-scale virtual ligand screening approach. J Chem Inf Model 2012; 53:230-40. [PMID: 23240691 DOI: 10.1021/ci300510n] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 12/15/2022]
Abstract
Virtual ligand screening is an integral part of the modern drug discovery process. Traditional ligand-based, virtual screening approaches are fast but require a set of structurally diverse ligands known to bind to the target. Traditional structure-based approaches require high-resolution target protein structures and are computationally demanding. In contrast, the recently developed threading/structure-based FINDSITE-based approaches have the advantage that they are as fast as traditional ligand-based approaches and yet overcome the limitations of traditional ligand- or structure-based approaches. These new methods can use predicted low-resolution structures and infer the likelihood of a ligand binding to a target by utilizing ligand information excised from the target's remote or close homologous proteins and/or libraries of ligand binding databases. Here, we develop an improved version of FINDSITE, FINDSITE(filt), that filters out false positive ligands in threading identified templates by a better binding site detection procedure that includes information about the binding site amino acid similarity. We then combine FINDSITE(filt) with FINDSITE(X) that uses publicly available binding databases ChEMBL and DrugBank for virtual ligand screening. The combined approach, FINDSITE(comb), is compared to two traditional docking methods, AUTODOCK Vina and DOCK 6, on the DUD benchmark set. It is shown to be significantly better in terms of enrichment factor, dependence on target structure quality, and speed. FINDSITE(comb) is then tested for virtual ligand screening on a large set of 3576 generic targets from the DrugBank database as well as a set of 168 Human GPCRs. Excluding close homologues, FINDSITE(comb) gives an average enrichment factor of 52.1 for generic targets and 22.3 for GPCRs within the top 1% of the screened compound library. Around 65% of the targets have better than random enrichment factors. 
The performance is insensitive to target structure quality, as long as it has a TM-score ≥ 0.4 to native. Thus, FINDSITE(comb) makes the screening of millions of compounds across entire proteomes feasible. The FINDSITE(comb) web service is freely available for academic users at http://cssb.biology.gatech.edu/skolnick/webservice/FINDSITE-COMB/index.html.
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street, N.W., Atlanta, Georgia 30318, USA
10
Kaufmann KW, Meiler J. Using RosettaLigand for small molecule docking into comparative models. PLoS One 2012; 7:e50769. [PMID: 23239984 PMCID: PMC3519832 DOI: 10.1371/journal.pone.0050769] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 07/30/2012] [Accepted: 10/24/2012] [Indexed: 11/18/2022] Open
Abstract
Computational small molecule docking into comparative models of proteins is widely used to query protein function and in the development of small molecule therapeutics. We benchmark RosettaLigand docking into comparative models for nine proteins built during CASP8 that contain ligands. We supplement the study with 21 additional protein/ligand complexes to cover a wider space of chemotypes. During a full docking run in 21 of the 30 cases, RosettaLigand successfully found a native-like binding mode among the top ten scoring binding modes. From the benchmark cases we find that careful template selection based on ligand occupancy provides the best chance of success while overall sequence identity between template and target does not appear to improve results. We also find that binding energy normalized by atom number is often less than -0.4 in native-like binding modes.
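The closing observation of this abstract amounts to a simple size-normalized score check. The short Python sketch below shows the arithmetic only; it assumes the normalization is by ligand atom count and treats the -0.4 value as a loose heuristic cutoff. The function names are hypothetical and are not part of RosettaLigand.

```python
def normalized_binding_energy(binding_energy: float, n_atoms: int) -> float:
    """Binding energy divided by atom count (assumed here to be ligand atoms)."""
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    return binding_energy / n_atoms


def passes_native_like_heuristic(
    binding_energy: float, n_atoms: int, cutoff: float = -0.4
) -> bool:
    """True when the per-atom energy falls below the heuristic cutoff."""
    return normalized_binding_energy(binding_energy, n_atoms) < cutoff


# Hypothetical poses: (binding energy in score units, ligand atom count).
poses = [(-15.0, 25), (-6.0, 25)]
flags = [passes_native_like_heuristic(e, n) for e, n in poses]
print(flags)
```

Because the abstract says the normalized energy is "often" below -0.4, a check like this is better used to flag poses for closer inspection than as a hard filter.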
Affiliation(s)
- Kristian W. Kaufmann
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Institute of Chemical Biology, Vanderbilt University, Nashville, Tennessee, United States of America
11
eThread: a highly optimized machine learning-based approach to meta-threading and the modeling of protein tertiary structures. PLoS One 2012. [PMID: 23185577 PMCID: PMC3503980 DOI: 10.1371/journal.pone.0050200] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 01/17/2023] Open
Abstract
Template-based modeling that employs various meta-threading techniques is currently the most accurate, and consequently the most commonly used, approach for protein structure prediction. Despite the evident progress in this field, accurate structure models cannot be constructed for a significant fraction of gene products, thus the development of new algorithms is required. Here, we describe the development, optimization and large-scale benchmarking of eThread, a highly accurate meta-threading procedure for the identification of structural templates and the construction of corresponding target-to-template alignments. eThread integrates ten state-of-the-art threading/fold recognition algorithms in a local environment and extensively uses various machine learning techniques to carry out fully automated template-based protein structure modeling. Tertiary structure prediction employs two protocols based on widely used modeling algorithms: Modeller and TASSER-Lite. As a part of eThread, we also developed eContact, which is a Bayesian classifier for the prediction of inter-residue contacts and eRank, which effectively ranks generated multiple protein models and provides reliable confidence estimates as structure quality assessment. Excluding closely related templates from the modeling process, eThread generates models, which are correct at the fold level, for >80% of the targets; 40–50% of the constructed models are of a very high quality, which would be considered accurate at the family level. Furthermore, in large-scale benchmarking, we compare the performance of eThread to several alternative methods commonly used in protein structure prediction. Finally, we estimate the upper bound for this type of approach and discuss the directions towards further improvements.
12
Zhou H, Skolnick J. FINDSITE(X): a structure-based, small molecule virtual screening approach with application to all identified human GPCRs. Mol Pharm 2012; 9:1775-84. [PMID: 22574683 DOI: 10.1021/mp3000716] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 12/16/2022]
Abstract
We have developed FINDSITE(X), an extension of FINDSITE, a protein threading based algorithm for the inference of protein binding sites, biochemical function and virtual ligand screening, that removes the limitation that holo protein structures (those containing bound ligands) of a sufficiently large set of distant evolutionarily related proteins to the target be solved; rather, predicted protein structures and experimental ligand binding information are employed. To provide the predicted protein structures, a fast and accurate version of our recently developed TASSER(VMT), TASSER(VMT)-lite, for template-based protein structural modeling applicable up to 1000 residues is developed and tested, with comparable performance to the top CASP9 servers. Then, a hybrid approach that combines structure alignments with an evolutionary similarity score for identifying functional relationships between target and proteins with binding data has been developed. By way of illustration, FINDSITE(X) is applied to 998 identified human G-protein coupled receptors (GPCRs). First, TASSER(VMT)-lite provides updates of all human GPCR structures previously modeled in our lab. We then use these structures and the new function similarity detection algorithm to screen all human GPCRs against the ZINC8 nonredundant (TC < 0.7) ligand set combined with ligands from the GLIDA database (a total of 88,949 compounds). Testing (excluding GPCRs whose sequence identity > 30% to the target from the binding data library) on a 168 human GPCR set with known binding data, the average enrichment factor in the top 1% of the compound library (EF(0.01)) is 22.7, whereas EF(0.01) by FINDSITE is 7.1. For virtual screening when just the target and its native ligands are excluded, the average EF(0.01) reaches 41.4. We also analyze off-target interactions for the 168 protein test set. 
All predicted structures, virtual screening data and off-target interactions for the 998 human GPCRs are available at http://cssb.biology.gatech.edu/skolnick/webservice/gpcr/index.html .
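The abstract above screens against a nonredundant ligand set defined by a Tanimoto coefficient cutoff (TC < 0.7). The following Python sketch illustrates one simple way such a set could be built: the Tanimoto coefficient on fingerprints stored as sets of on-bit indices, followed by a greedy filter. The fingerprint representation, the greedy ordering, and the function names are assumptions made for illustration; the source does not say how ZINC8 or the authors performed this step.

```python
from typing import FrozenSet, List, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nonredundant_indices(
    fingerprints: Sequence[FrozenSet[int]], cutoff: float = 0.7
) -> List[int]:
    """Greedily keep each compound whose TC to every kept compound is below cutoff."""
    kept: List[int] = []
    for idx, fp in enumerate(fingerprints):
        if all(tanimoto(fp, fingerprints[k]) < cutoff for k in kept):
            kept.append(idx)
    return kept


# Hypothetical fingerprints given as sets of on-bit positions.
fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 4, 5}),  # TC = 4/5 = 0.8 to the first, so it is dropped
    frozenset({7, 8, 9}),        # TC = 0 to the first, so it is kept
]
kept = nonredundant_indices(fps, cutoff=0.7)
print(kept)
```

A greedy pass depends on input order, so sorting compounds by some priority before filtering would change which representative of a similar pair survives.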
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street, N.W., Atlanta, Georgia 30318, United States
13
Lee HS, Zhang Y. BSP-SLIM: a blind low-resolution ligand-protein docking approach using predicted protein structures. Proteins 2011; 80:93-110. [PMID: 21971880 DOI: 10.1002/prot.23165] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.2] [Received: 03/21/2011] [Revised: 06/30/2011] [Accepted: 08/04/2011] [Indexed: 01/19/2023]
Abstract
We developed BSP-SLIM, a new method for ligand-protein blind docking using low-resolution protein structures. For a given sequence, protein structures are first predicted by I-TASSER; putative ligand binding sites are transferred from holo-template structures which are analogous to the I-TASSER models; ligand-protein docking conformations are then constructed by shape and chemical match of ligand with the negative image of binding pockets. BSP-SLIM was tested on 71 ligand-protein complexes from the Astex diverse set where the protein structures were predicted by I-TASSER with an average RMSD of 2.92 Å on the binding residues. Using I-TASSER models, the median ligand RMSD of BSP-SLIM docking is 3.99 Å, which is 5.94 Å lower than that by AutoDock; the median binding-site error by BSP-SLIM is 1.77 Å, which is 6.23 Å lower than that by AutoDock and 3.43 Å lower than that by LIGSITE(CSC). Compared to the models using crystal protein structures, the median ligand RMSD by BSP-SLIM using I-TASSER models increases by 0.87 Å, while that by AutoDock increases by 8.41 Å; the median binding-site error by BSP-SLIM increases by 0.69 Å while that by AutoDock and LIGSITE(CSC) increases by 7.31 Å and 1.41 Å, respectively. As case studies, BSP-SLIM was used in virtual screening for six target proteins, which prioritized actives of 25% and 50% in the top 9.2% and 17% of the library on average, respectively. These results demonstrate the usefulness of the template-based coarse-grained algorithms in the low-resolution ligand-protein docking and drug-screening. An on-line BSP-SLIM server is freely available at http://zhanglab.ccmb.med.umich.edu/BSP-SLIM.
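Ligand RMSD is the main accuracy measure quoted in this abstract. As a minimal sketch, the Python function below computes RMSD between a docked pose and a reference pose, assuming both are already in the same protein coordinate frame and that atoms are listed in matching order (no superposition or symmetry correction is applied). The function name and coordinates are hypothetical and are not taken from BSP-SLIM.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ligand_rmsd(pose: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Root-mean-square deviation over atoms paired by index, in input units (Å)."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(pose, reference)
    )
    return math.sqrt(squared / len(pose))


# Hypothetical two-atom ligand shifted rigidly by 2 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
value = ligand_rmsd(docked, reference)
print(value)
```

Real comparisons usually also handle chemically equivalent atom orderings, which this sketch omits.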
Affiliation(s)
- Hui Sun Lee
- Department of Biological Chemistry, Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
14
Xie L, Xie L, Bourne PE. Structure-based systems biology for analyzing off-target binding. Curr Opin Struct Biol 2011; 21:189-99. [PMID: 21292475 PMCID: PMC3070778 DOI: 10.1016/j.sbi.2011.01.004] [Citation(s) in RCA: 110] [Impact Index Per Article: 8.5] [Received: 11/10/2010] [Revised: 01/11/2011] [Accepted: 01/13/2011] [Indexed: 12/24/2022]
Abstract
Here off-target binding implies the binding of a small molecule of therapeutic interest to a protein target other than the primary target for which it was intended. Increasingly such off-targeting appears to be the norm rather than the exception, rational drug design notwithstanding, and can lead to detrimental side-effects, or opportunities to reposition a therapeutic agent to treat a different condition. Not surprisingly, there is significant interest in determining a priori what off-targets exist on a proteome-wide scale. Beyond determining putative off-targets is the need to understand the impact of such binding on the complete biological system, with the ultimate goal of being able to predict the phenotypic outcome. While a very ambitious goal, some progress is being made.
Affiliation(s)
- Lei Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego MC9743, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Department of Computer Science, Hunter College, the City University of New York, 695 Park Avenue, New York City, NY 10065, USA
- Li Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego MC9743, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Philip E. Bourne
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego MC9743, 9500 Gilman Drive, La Jolla, CA 92093, USA
15
Brylinski M, Skolnick J. Comprehensive structural and functional characterization of the human kinome by protein structure modeling and ligand virtual screening. J Chem Inf Model 2011; 50:1839-54. [PMID: 20853887 DOI: 10.1021/ci100235n] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 12/12/2022]
Abstract
The growing interest in the identification of kinase inhibitors, promising therapeutics in the treatment of many diseases, has created a demand for the structural characterization of the entire human kinome. At the outset of the drug development process, the lead-finding stage, approaches that enrich the screening library with bioactive compounds are needed. Here, protein structure based methods can play an important role, but despite structural genomics efforts, it is unlikely that the three-dimensional structures of the entire kinome will be available soon. Therefore, at the proteome level, structure-based approaches must rely on predicted models, with a key issue being their utility in virtual ligand screening. In this study, we employ the recently developed FINDSITE/Q-Dock ligand homology modeling approach, which is well-suited for proteome-scale applications using predicted structures, to provide extensive structural and functional characterization of the human kinome. Specifically, we construct structure models for the human kinome; these are subsequently subject to virtual screening against a library of more than 2 million compounds. To rank the compounds, we employ a hierarchical approach that combines ligand- and structure-based filters. Modeling accuracy is carefully validated using available experimental data with particularly encouraging results found for the ability to identify, without prior knowledge, specific kinase inhibitors. More generally, the modeling procedure results in a large number of predicted molecular interactions between kinases and small ligands that should be of practical use in the development of novel inhibitors. The data set is freely available to the academic community via a user-friendly Web interface at http://cssb.biology.gatech.edu/kinomelhm/ as well as at the ZINC Web site ( http://zinc.docking.org/applications/2010Apr/Brylinski-2010.tar.gz ).
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
|
16
|
Brylinski M, Skolnick J. Cross-reactivity virtual profiling of the human kinome by X-react(KIN): a chemical systems biology approach. Mol Pharm 2010; 7:2324-33. [PMID: 20958088 DOI: 10.1021/mp1002976] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 12/13/2022]
Abstract
Many drug candidates fail in clinical development due to their insufficient selectivity that may cause undesired side effects. Therefore, modern drug discovery is routinely supported by computational techniques, which can identify alternate molecular targets with a significant potential for cross-reactivity. In particular, the development of highly selective kinase inhibitors is complicated by the strong conservation of the ATP-binding site across the kinase family. In this paper, we describe X-React(KIN), a new machine learning approach that extends the modeling and virtual screening of individual protein kinases to a system level in order to construct a cross-reactivity virtual profile for the human kinome. To maximize the coverage of the kinome, X-React(KIN) relies solely on the predicted target structures and employs state-of-the-art modeling techniques. Benchmark tests carried out against available selectivity data from high-throughput kinase profiling experiments demonstrate that, for almost 70% of the inhibitors, their alternate molecular targets can be effectively identified in the human kinome with a high (>0.5) sensitivity at the expense of a relatively low false positive rate (<0.5). Furthermore, in a case study, we demonstrate how X-React(KIN) can support the development of selective inhibitors by optimizing the selection of kinase targets for small-scale counter-screen experiments. The constructed cross-reactivity profiles for the human kinome are freely available to the academic community at http://cssb.biology.gatech.edu/kinomelhm/ .
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia, USA
|
17
|
Brylinski M, Lee SY, Zhou H, Skolnick J. The utility of geometrical and chemical restraint information extracted from predicted ligand-binding sites in protein structure refinement. J Struct Biol 2010; 173:558-69. [PMID: 20850544 DOI: 10.1016/j.jsb.2010.09.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/13/2010] [Revised: 09/08/2010] [Accepted: 09/10/2010] [Indexed: 01/01/2023]
Abstract
Exhaustive exploration of molecular interactions at the level of complete proteomes requires efficient and reliable computational approaches to protein function inference. Ligand docking and ranking techniques show considerable promise in their ability to quantify the interactions between proteins and small molecules. Despite the advances in the development of docking approaches and scoring functions, the genome-wide application of many ligand docking/screening algorithms is limited by the quality of the binding sites in theoretical receptor models constructed by protein structure prediction. In this study, we describe a new template-based method for the local refinement of ligand-binding regions in protein models using remotely related templates identified by threading. We designed a Support Vector Regression (SVR) model that selects correct binding site geometries in a large ensemble of multiple receptor conformations. The SVR model employs several scoring functions that impose geometrical restraints on the Cα positions, account for the specific chemical environment within a binding site and optimize the interactions with putative ligands. The SVR score is well correlated with the RMSD from the native structure; in 47% (70%) of the cases, the Pearson's correlation coefficient is >0.5 (>0.3). When applied to weakly homologous models, the average heavy atom, local RMSD from the native structure of the top-ranked (best of top five) binding site geometries is 3.1Å (2.9Å) for roughly half of the targets; this represents a 0.1 (0.3)Å average improvement over the original predicted structure. Focusing on the subset of strongly conserved residues, the average heavy atom RMSD is 2.6Å (2.3Å). Furthermore, we estimate the upper bound of template-based binding site refinement using only weakly related proteins to be ∼2.6Å RMSD. This value also corresponds to the plasticity of the ligand-binding regions in distant homologues. 
The Binding Site Refinement (BSR) approach is available to the scientific community as a web server that can be accessed at http://cssb.biology.gatech.edu/bsr/.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, Georgia Institute of Technology, Atlanta, GA 30318, USA
|