51
Lee HS, Im W. Ligand binding site detection by local structure alignment and its performance complementarity. J Chem Inf Model 2013; 53:2462-70. [PMID: 23957286 DOI: 10.1021/ci4003602] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 02/08/2023]
Abstract
Accurate determination of potential ligand binding sites (BS) is a key step for protein function characterization and structure-based drug design. Despite promising results of template-based BS prediction methods using global structure alignment (GSA), there is room to improve the performance by properly incorporating local structure alignment (LSA) because BS are local structures and often similar for proteins with dissimilar global folds. We present a template-based ligand BS prediction method using G-LoSA, our LSA tool. A large benchmark set validation shows that G-LoSA predicts drug-like ligands' positions in single-chain protein targets more precisely than TM-align, a GSA-based method, while the overall success rate of TM-align is better. G-LoSA is particularly efficient for accurate detection of local structures conserved across proteins with diverse global topologies. Recognizing the performance complementarity of G-LoSA to TM-align and a nontemplate geometry-based method, fpocket, a robust consensus scoring method, CMCS-BSP (Complementary Methods and Consensus Scoring for ligand Binding Site Prediction), is developed and shows improvement on prediction accuracy.
Affiliation(s)
- Hui Sun Lee
- Department of Molecular Biosciences and Center for Bioinformatics, The University of Kansas, 2030 Becker Drive, Lawrence, Kansas 66047, United States
52
Khar KR, Goldschmidt L, Karanicolas J. Fast docking on graphics processing units via Ray-Casting. PLoS One 2013; 8:e70661. [PMID: 23976948 PMCID: PMC3745428 DOI: 10.1371/journal.pone.0070661] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/25/2013] [Accepted: 06/20/2013] [Indexed: 11/22/2022] Open
Abstract
Docking Approach using Ray Casting (DARC) is a structure-based computational method for carrying out virtual screening by docking small molecules into protein surface pockets. In a complementary study we find that DARC can be used to identify known inhibitors from large sets of decoy compounds, and can identify new compounds that are active in biochemical assays. Here, we describe our adaptation of DARC for use on Graphics Processing Units (GPUs), leading to a speedup of approximately 27-fold in typical-use cases over the corresponding calculations carried out using a CPU alone. This dramatic speedup of DARC will enable screening larger compound libraries, screening with more conformations of each compound, and including multiple receptor conformations when screening. We anticipate that all three of these enhanced approaches, which now become tractable, will lead to improved screening results.
Affiliation(s)
- Karen R. Khar
- Center for Bioinformatics, University of Kansas, Lawrence, Kansas, United States of America
- Lukasz Goldschmidt
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, California, United States of America
- John Karanicolas
- Center for Bioinformatics, University of Kansas, Lawrence, Kansas, United States of America
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
53
Skolnick J, Zhou H, Gao M. Are predicted protein structures of any value for binding site prediction and virtual ligand screening? Curr Opin Struct Biol 2013; 23:191-7. [PMID: 23415854 DOI: 10.1016/j.sbi.2013.01.009] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 12/19/2012] [Revised: 01/04/2013] [Accepted: 01/23/2013] [Indexed: 01/03/2023]
Abstract
The recently developed field of ligand homology modeling (LHM) that extends the ideas of protein homology modeling to the prediction of ligand binding sites and for use in virtual ligand screening has emerged as a powerful new approach. Unlike traditional docking methodologies, LHM can be applied to low-to-moderate resolution predicted as well as experimental structures with little if any diminution in performance; thereby enabling ≈ 75% of an average proteome to have potentially significant virtual screening predictions. In large scale benchmarking, LHM is able to predict off-target ligand binding. Thus, despite the widespread belief to the contrary, low-to-moderate resolution predicted structures have considerable utility for biochemical function prediction.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA.
54
Low-resolution structural modeling of protein interactome. Curr Opin Struct Biol 2013; 23:198-205. [PMID: 23294579 DOI: 10.1016/j.sbi.2012.12.003] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 11/07/2012] [Accepted: 12/03/2012] [Indexed: 11/23/2022]
Abstract
Structural characterization of protein-protein interactions across the broad spectrum of scales is key to our understanding of life at the molecular level. A low-resolution approach to protein interactions is needed for modeling large interaction networks, given the significant level of uncertainty in large biomolecular systems and the high-throughput nature of the task. Since only a fraction of protein structures in the interactome are determined experimentally, protein docking approaches are increasingly focusing on modeled proteins. The current rapid advancement of template-based modeling of protein-protein complexes follows a long-standing trend in structure prediction of individual proteins. Protein-protein templates are already available for almost all interactions of structurally characterized proteins, and about one third of such templates are likely correct.
55
Zhou H, Skolnick J. FINDSITE(comb): a threading/structure-based, proteomic-scale virtual ligand screening approach. J Chem Inf Model 2012; 53:230-40. [PMID: 23240691 DOI: 10.1021/ci300510n] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 12/15/2022]
Abstract
Virtual ligand screening is an integral part of the modern drug discovery process. Traditional ligand-based, virtual screening approaches are fast but require a set of structurally diverse ligands known to bind to the target. Traditional structure-based approaches require high-resolution target protein structures and are computationally demanding. In contrast, the recently developed threading/structure-based FINDSITE-based approaches have the advantage that they are as fast as traditional ligand-based approaches and yet overcome the limitations of traditional ligand- or structure-based approaches. These new methods can use predicted low-resolution structures and infer the likelihood of a ligand binding to a target by utilizing ligand information excised from the target's remote or close homologous proteins and/or libraries of ligand binding databases. Here, we develop an improved version of FINDSITE, FINDSITE(filt), that filters out false positive ligands in threading identified templates by a better binding site detection procedure that includes information about the binding site amino acid similarity. We then combine FINDSITE(filt) with FINDSITE(X) that uses publicly available binding databases ChEMBL and DrugBank for virtual ligand screening. The combined approach, FINDSITE(comb), is compared to two traditional docking methods, AUTODOCK Vina and DOCK 6, on the DUD benchmark set. It is shown to be significantly better in terms of enrichment factor, dependence on target structure quality, and speed. FINDSITE(comb) is then tested for virtual ligand screening on a large set of 3576 generic targets from the DrugBank database as well as a set of 168 Human GPCRs. Excluding close homologues, FINDSITE(comb) gives an average enrichment factor of 52.1 for generic targets and 22.3 for GPCRs within the top 1% of the screened compound library. Around 65% of the targets have better than random enrichment factors. 
The performance is insensitive to target structure quality, as long as it has a TM-score ≥ 0.4 to native. Thus, FINDSITE(comb) makes the screening of millions of compounds across entire proteomes feasible. The FINDSITE(comb) web service is freely available for academic users at http://cssb.biology.gatech.edu/skolnick/webservice/FINDSITE-COMB/index.html.
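The enrichment factor quoted in this abstract can be illustrated with a short sketch. It assumes the commonly used definition (the fraction of actives among the top-ranked compounds divided by the fraction of actives in the whole library), which the abstract itself does not spell out. The function name `enrichment_factor` and its parameters are hypothetical and are not taken from FINDSITE(comb).

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor for the top `fraction` of a ranked compound library.

    Assumed definition (not stated in the abstract):
        EF = (actives in top slice / size of top slice)
             / (total actives / library size)
    Higher scores are assumed to mean "more likely to bind".
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    n_total = len(scores)
    n_actives = sum(1 for a in is_active if a)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")

    # Size of the top slice; at least one compound is always selected.
    n_top = max(1, round(fraction * n_total))

    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(1 for _, active in ranked[:n_top] if active)

    return (hits_top / n_top) / (n_actives / n_total)


# Toy library: 1000 compounds, 10 actives, 5 of them ranked in the top 10.
scores = [1000 - i for i in range(1000)]
labels = [i < 5 or 500 <= i < 505 for i in range(1000)]
ef = enrichment_factor(scores, labels, fraction=0.01)
```

Under this definition a random ranking gives an expected value of 1, and the ceiling at the top 1% is 100, reached when every compound in the top slice is active.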
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street, N.W., Atlanta, Georgia 30318, USA
56
Yang J, Roy A, Zhang Y. BioLiP: a semi-manually curated database for biologically relevant ligand-protein interactions. Nucleic Acids Res 2012; 41:D1096-103. [PMID: 23087378 PMCID: PMC3531193 DOI: 10.1093/nar/gks966] [Citation(s) in RCA: 454] [Impact Index Per Article: 37.8] [Indexed: 11/23/2022] Open
Abstract
BioLiP (http://zhanglab.ccmb.med.umich.edu/BioLiP/) is a semi-manually curated database for biologically relevant ligand–protein interactions. Establishing interactions between protein and biologically relevant ligands is an important step toward understanding the protein functions. Most ligand-binding sites prediction methods use the protein structures from the Protein Data Bank (PDB) as templates. However, not all ligands present in the PDB are biologically relevant, as small molecules are often used as additives for solving the protein structures. To facilitate template-based ligand–protein docking, virtual ligand screening and protein function annotations, we develop a hierarchical procedure for assessing the biological relevance of ligands present in the PDB structures, which involves a four-step biological feature filtering followed by careful manual verifications. This procedure is used for BioLiP construction. Each entry in BioLiP contains annotations on: ligand-binding residues, ligand-binding affinity, catalytic sites, Enzyme Commission numbers, Gene Ontology terms and cross-links to the other databases. In addition, to facilitate the use of BioLiP for function annotation of uncharacterized proteins, a new consensus-based algorithm COACH is developed to predict ligand-binding sites from protein sequence or using 3D structure. The BioLiP database is updated weekly and the current release contains 204 223 entries.
Affiliation(s)
- Jianyi Yang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA
57
Lee HS, Im W. Identification of ligand templates using local structure alignment for structure-based drug design. J Chem Inf Model 2012; 52:2784-95. [PMID: 22978550 DOI: 10.1021/ci300178e] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/30/2022]
Abstract
With a rapid increase in the number of high-resolution protein-ligand structures, the known protein-ligand structures can be used to gain insight into ligand-binding modes in a target protein. On the basis of the fact that the structurally similar binding sites share information about their ligands, we have developed a local structure alignment tool, G-LoSA (graph-based local structure alignment). The known protein-ligand binding-site structure library is searched by G-LoSA to detect binding-site structures with similar geometry and physicochemical properties to a query binding-site structure regardless of sequence continuity and protein fold. Then, the ligands in the identified complexes are used as templates (i.e., template ligands) to predict/design a ligand for the target protein. The performance of G-LoSA is validated against 76 benchmark targets from the Astex diverse set. Using the currently available protein-ligand structure library, G-LoSA is able to identify a single template ligand (from a nonhomologous protein complex) that is highly similar to the target ligand in more than half of the benchmark targets. In addition, our benchmark analyses show that an assembly of structural fragments from multiple template ligands with partial similarity to the target ligand can be used to design novel ligand structures specific to the target protein. This study clearly indicates that a template-based ligand modeling has potential for de novo ligand design and can be a complementary approach to the receptor structure based methods.
Affiliation(s)
- Hui Sun Lee
- Department of Molecular Biosciences and Center for Bioinformatics, The University of Kansas, 2030 Becker Drive, Lawrence, Kansas 66047, USA.
58
Lee HS, Jo S, Lim HS, Im W. Application of binding free energy calculations to prediction of binding modes and affinities of MDM2 and MDMX inhibitors. J Chem Inf Model 2012; 52:1821-32. [PMID: 22731511 DOI: 10.1021/ci3000997] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Indexed: 12/21/2022]
Abstract
Molecular docking is widely used to obtain binding modes and binding affinities of a molecule to a given target protein. Despite considerable efforts, however, prediction of both properties by docking remains challenging mainly due to protein's structural flexibility and inaccuracy of scoring functions. Here, an integrated approach has been developed to improve the accuracy of binding mode and affinity prediction and tested for small molecule MDM2 and MDMX antagonists. In this approach, initial candidate models selected from docking are subjected to equilibration MD simulations to further filter the models. Free energy perturbation molecular dynamics (FEP/MD) simulations are then applied to the filtered ligand models to enhance the ability in predicting the near-native ligand conformation. The calculated binding free energies for MDM2 complexes are overestimated compared to experimental measurements mainly due to the difficulties in sampling highly flexible apo-MDM2. Nonetheless, the FEP/MD binding free energy calculations are more promising for discriminating binders from nonbinders than docking scores. In particular, the comparison between the MDM2 and MDMX results suggests that apo-MDMX has lower flexibility than apo-MDM2. In addition, the FEP/MD calculations provide detailed information on the different energetic contributions to ligand binding, leading to a better understanding of the sensitivity and specificity of protein-ligand interactions.
Affiliation(s)
- Hui Sun Lee
- Department of Molecular Biosciences and Center for Bioinformatics, The University of Kansas, 2030 Becker Drive, Lawrence, Kansas 66045, United States
59
Zhou H, Skolnick J. FINDSITE(X): a structure-based, small molecule virtual screening approach with application to all identified human GPCRs. Mol Pharm 2012; 9:1775-84. [PMID: 22574683 DOI: 10.1021/mp3000716] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 12/16/2022]
Abstract
We have developed FINDSITE(X), an extension of FINDSITE, a protein threading based algorithm for the inference of protein binding sites, biochemical function and virtual ligand screening, that removes the limitation that holo protein structures (those containing bound ligands) of a sufficiently large set of distant evolutionarily related proteins to the target be solved; rather, predicted protein structures and experimental ligand binding information are employed. To provide the predicted protein structures, a fast and accurate version of our recently developed TASSER(VMT), TASSER(VMT)-lite, for template-based protein structural modeling applicable up to 1000 residues is developed and tested, with comparable performance to the top CASP9 servers. Then, a hybrid approach that combines structure alignments with an evolutionary similarity score for identifying functional relationships between target and proteins with binding data has been developed. By way of illustration, FINDSITE(X) is applied to 998 identified human G-protein coupled receptors (GPCRs). First, TASSER(VMT)-lite provides updates of all human GPCR structures previously modeled in our lab. We then use these structures and the new function similarity detection algorithm to screen all human GPCRs against the ZINC8 nonredundant (TC < 0.7) ligand set combined with ligands from the GLIDA database (a total of 88,949 compounds). Testing (excluding GPCRs whose sequence identity > 30% to the target from the binding data library) on a 168 human GPCR set with known binding data, the average enrichment factor in the top 1% of the compound library (EF(0.01)) is 22.7, whereas EF(0.01) by FINDSITE is 7.1. For virtual screening when just the target and its native ligands are excluded, the average EF(0.01) reaches 41.4. We also analyze off-target interactions for the 168 protein test set. 
All predicted structures, virtual screening data and off-target interactions for the 998 human GPCRs are available at http://cssb.biology.gatech.edu/skolnick/webservice/gpcr/index.html.
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street, N.W., Atlanta, Georgia 30318, United States