1
Kowaguchi A, Endo K, Brumby PE, Nomura K, Yasuoka K. Optimal Replica-Exchange Molecular Simulations in Combination with Evolution Strategies. J Chem Inf Model 2022; 62:6544-6552. [PMID: 35785994 DOI: 10.1021/acs.jcim.2c00608] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/07/2023]
Abstract
We have incorporated Evolution Strategies into the Replica-Exchange Monte Carlo simulation method to predict the phase behavior of several example fluids. The replica-exchange method allows one system to exchange temperatures with its neighbors to search for the most stable structure relatively efficiently in a single simulation. However, if the temperature intervals of the replicas are not positioned carefully, exchanges between neighboring replicas can fail to occur locally. Our results for a simple Lennard-Jones fluid and the liquid-crystal Yukawa model demonstrate the utility of the approach when compared to conventional methods. When Evolution Strategies were applied to the Replica-Exchange Monte Carlo simulation, the problem of a significant localized decrease in exchange probability near the phase transition was avoided. By obtaining the optimal temperature intervals, the system efficiently traverses a broader parameter space with a small number of replicas. This is equivalent to accelerating molecular simulations with limited computational resources and can be useful when attempting to predict the phase behavior of complex systems.
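For illustration, the exchange step this abstract refers to can be sketched as follows. This is a minimal sketch of the standard Metropolis swap criterion for temperature replica exchange, not the authors' code; the function name `swap_acceptance`, the reduced units (Boltzmann constant set to 1), and the sample temperatures and energies are assumptions made here. It shows why poorly spaced temperatures suppress exchange: the acceptance probability falls off exponentially with the product of the inverse-temperature gap and the energy gap.

```python
import math


def swap_acceptance(beta_i: float, beta_j: float,
                    energy_i: float, energy_j: float) -> float:
    """Metropolis probability of swapping configurations between two replicas.

    beta_* are inverse temperatures (1/T in reduced units, k_B = 1) and
    energy_* are the current potential energies of the two replicas.
    """
    # Log of the ratio of joint Boltzmann weights after and before the swap.
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


if __name__ == "__main__":
    # Hypothetical numbers: the colder replica (T = 1.0) sits at lower energy.
    cold_T, cold_E = 1.0, -120.0
    for hot_T, hot_E in [(1.05, -118.0), (1.5, -90.0)]:
        p = swap_acceptance(1.0 / cold_T, 1.0 / hot_T, cold_E, hot_E)
        print(f"T = {cold_T} <-> T = {hot_T}: acceptance = {p:.4f}")
```

Under this reading, tuning the temperature intervals amounts to choosing neighboring temperatures so that this probability does not collapse for any pair, which is the situation the abstract describes near a phase transition.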
Affiliation(s)
- Akie Kowaguchi
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
- Katsuhiro Endo
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
- Paul E Brumby
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
- Kentaro Nomura
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
- Kenji Yasuoka
  - Department of Mechanical Engineering, Keio University, Yokohama 223-8522, Japan
2
Skolnick J, Zhou H. Implications of the Essential Role of Small Molecule Ligand Binding Pockets in Protein-Protein Interactions. J Phys Chem B 2022; 126:6853-6867. [PMID: 36044742 PMCID: PMC9484464 DOI: 10.1021/acs.jpcb.2c04525] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/29/2022] [Revised: 08/18/2022] [Indexed: 11/28/2022]
Abstract
Protein-protein interactions (PPIs) and protein-metabolite interactions play a key role in many biochemical processes, yet they are often viewed as being independent. However, the fact that small molecule drugs have been successful in inhibiting PPIs suggests a deeper relationship between protein pockets that bind small molecules and PPIs. We demonstrate that 2/3 of PPI interfaces, including antibody-epitope interfaces, contain at least one significant small molecule ligand binding pocket. In a representative library of 50 distinct protein-protein interactions involving hundreds of mutations, >75% of hot spot residues overlap with small molecule ligand binding pockets. Hence, ligand binding pockets play an essential role in PPIs. In representative cases, evolutionarily unrelated monomers that are involved in different multimeric interactions yet share the same pocket are predicted to bind the same metabolites/drugs; these results are confirmed by examples in the PDB. Thus, the binding of a metabolite can shift the equilibrium between monomers and multimers. This implicit coupling of PPI equilibria, termed "metabolic entanglement", was successfully employed to suggest novel functional relationships among protein multimers that do not directly interact. Thus, the current work provides an approach to unify metabolomics and protein interactomics.
Affiliation(s)
- Jeffrey Skolnick
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332, United States
- Hongyi Zhou
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332, United States
3
Çalışkaner ZO. Computational discovery of novel inhibitory candidates targeting versatile transcriptional repressor MBD2. J Mol Model 2022; 28:296. [PMID: 36066769 DOI: 10.1007/s00894-022-05297-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Accepted: 08/29/2022] [Indexed: 11/30/2022]
Abstract
Genome methylation is a key epigenetic mechanism in various biological events such as development, cellular differentiation, cancer progression, aging, and iPSC reprogramming. Crosstalk between DNA methylation and gene expression is mediated by MBD2, known as the reader of DNA methylation and suggested as a drug target. Despite its significance, only a limited number of small-molecule inhibitors have been identified so far. Therefore, we screened a comprehensive compound library to identify MBD2 inhibitor candidates. Promising molecules were subjected to computational docking analysis targeting the methylated DNA-binding domain of human MBD2. We detected reasonable binding energies and docking residues, presumably located in druggable pockets. Docking results were also validated via MD simulation and per-residue energy decomposition calculations. Drug-likeness of these small molecules was assessed through ADMET prediction to anticipate off-target side effects for future studies. All computational approaches highlighted two compounds, CID3100583 and 8,8-ethylenebistheophylline. These compounds emerged as novel candidates that may disrupt the interaction between the MBD of MBD2 and DNA. Consequently, they are considered prospective inhibitors with potential use in a wide range of applications, from cancer treatment to somatic cell reprogramming protocols.
Affiliation(s)
- Zihni Onur Çalışkaner
  - Faculty of Engineering and Natural Sciences, Molecular Biology and Genetics Department, Biruni University, 34010, Istanbul, Turkey
4
Wang X, Hu J, Song L, Rong E, Yang C, Chen X, Pu J, Sun H, Gao C, Burt DW, Liu J, Li N, Huang Y. Functional divergence of oligoadenylate synthetase 1 (OAS1) proteins in Tetrapods. SCIENCE CHINA. LIFE SCIENCES 2022; 65:1395-1412. [PMID: 34826092 DOI: 10.1007/s11427-021-2002-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/09/2021] [Accepted: 08/25/2021] [Indexed: 06/13/2023]
Abstract
OASs play critical roles in immune response against virus infection by polymerizing ATP into 2-5As, which initiate the classical OAS/RNase L pathway and induce degradation of viral RNA. OAS members are functionally diverged in four known innate immune pathways (OAS/RNase L, OASL/IRF7, OASL/RIG-I, and OASL/cGAS), but how they functionally diverged is unclear. Here, we focus on evolutionary patterns and explore the link between evolutionary processes and functional divergence of Tetrapod OAS1. We show that Palaeognathae and Primate OAS1 genes are conserved in genomic and protein structures but differ in function. The former (i.e., ostrich) efficiently synthesized long 2-5A and activated RNase L, while the latter (i.e., human) synthesized short 2-5A and did not activate RNase L. We predicted and verified that two in-frame indels and one positively selected site in the active site pocket contributed to the functional divergence of Palaeognathae and Primate OAS1. Moreover, we discovered and validated that an in-frame indel in the C-terminus of Palaeognathae OAS1 affected the binding affinity of dsRNA and enzymatic activity, and contributed to the functional divergence of Palaeognathae OAS1 proteins. Our findings unravel the molecular mechanism for functional divergence and give insights into the emergence of novel functions in Tetrapod OAS1.
Affiliation(s)
- Xiaoxue Wang
  - State Key Laboratory for Agrobiotechnology, College of Biology Sciences, China Agricultural University, Beijing, 100193, China
- Jiaxiang Hu
  - State Key Laboratory for Agrobiotechnology, College of Biology Sciences, China Agricultural University, Beijing, 100193, China
- Linfei Song
  - State Key Laboratory for Agrobiotechnology, College of Biology Sciences, China Agricultural University, Beijing, 100193, China
- Enguang Rong
  - State Key Laboratory for Agrobiotechnology, College of Biology Sciences, China Agricultural University, Beijing, 100193, China
- Chenghuai Yang
  - China Institute of Veterinary Drug Control, Beijing, 100081, China
- Xiaoyun Chen
  - China Institute of Veterinary Drug Control, Beijing, 100081, China
- Juan Pu
  - Key Laboratory of Animal Epidemiology and Zoonosis, Ministry of Agriculture, College of Veterinary Medicine, China Agricultural University, Beijing, 100083, China
- Honglei Sun
  - Key Laboratory of Animal Epidemiology and Zoonosis, Ministry of Agriculture, College of Veterinary Medicine, China Agricultural University, Beijing, 100083, China
- Chuze Gao
  - State Key Laboratory for Agrobiotechnology, College of Biology Sciences, China Agricultural University, Beijing, 100193, China
- David W Burt
  - University of Queensland, St. Lucia, Brisbane, QLD, 4072, Australia
- Jinhua Liu
  - Key Laboratory of Animal Epidemiology and Zoonosis, Ministry of Agriculture, College of Veterinary Medicine, China Agricultural University, Beijing, 100083, China
- Ning Li
  - State Key Laboratory for Agrobiotechnology, College of Biology Sciences, China Agricultural University, Beijing, 100193, China
- Yinhua Huang
  - State Key Laboratory for Agrobiotechnology, College of Biology Sciences, China Agricultural University, Beijing, 100193, China
5
Assessing sequence-based protein-protein interaction predictors for use in therapeutic peptide engineering. Sci Rep 2022; 12:9610. [PMID: 35688894 PMCID: PMC9187631 DOI: 10.1038/s41598-022-13227-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Accepted: 04/25/2022] [Indexed: 12/01/2022] Open
Abstract
Engineering peptides to achieve a desired therapeutic effect through the inhibition of a specific target activity or protein interaction is a non-trivial task. Few of the existing in silico peptide design algorithms generate target-specific peptides. Instead, many methods produce peptides that achieve a desired effect through an unknown mechanism. In contrast with resource-intensive high-throughput experiments, in silico screening is a cost-effective alternative that can prune the space of candidates when engineering target-specific peptides. Using a set of FDA-approved peptides we curated specifically for this task, we assess the applicability of several sequence-based protein–protein interaction predictors as a screening tool within the context of peptide therapeutic engineering. We show that similarity-based protein–protein interaction predictors are more suitable for this purpose than the state-of-the-art deep learning methods publicly available at the time of writing. We also show that this approach is mostly useful when designing new peptides against targets for which naturally-occurring interactors are already known, and that deploying it for de novo peptide engineering tasks may require gathering additional target-specific training data. Taken together, this work offers evidence that supports the use of similarity-based protein–protein interaction predictors for peptide therapeutic engineering, especially peptide analogs.
6
Brown BP, Vu O, Geanes AR, Kothiwale S, Butkiewicz M, Lowe EW, Mueller R, Pape R, Mendenhall J, Meiler J. Introduction to the BioChemical Library (BCL): An Application-Based Open-Source Toolkit for Integrated Cheminformatics and Machine Learning in Computer-Aided Drug Discovery. Front Pharmacol 2022; 13:833099. [PMID: 35264967 PMCID: PMC8899505 DOI: 10.3389/fphar.2022.833099] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/10/2021] [Accepted: 01/24/2022] [Indexed: 01/31/2023] Open
Abstract
The BioChemical Library (BCL) cheminformatics toolkit is an application-based academic open-source software package designed to integrate traditional small molecule cheminformatics tools with machine learning-based quantitative structure-activity/property relationship (QSAR/QSPR) modeling. In this pedagogical article we provide a detailed introduction to core BCL cheminformatics functionality, showing how traditional tasks (e.g., computing chemical properties, estimating druglikeness) can be readily combined with machine learning. In addition, we have included multiple examples covering areas of advanced use, such as reaction-based library design. We anticipate that this manuscript will be a valuable resource for researchers in computer-aided drug discovery looking to integrate modular cheminformatics and machine learning tools into their pipelines.
Affiliation(s)
- Benjamin P. Brown
  - Chemical and Physical Biology Program, Medical Scientist Training Program, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Oanh Vu
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Alexander R. Geanes
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Sandeepkumar Kothiwale
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Mariusz Butkiewicz
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Edward W. Lowe
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Ralf Mueller
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Richard Pape
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Jeffrey Mendenhall
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Jens Meiler
  - Department of Chemistry, Departments of Pharmacology and Biomedical Informatics, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
  - Institute for Drug Discovery, Leipzig University Medical School, Leipzig, Germany
7
Zhou H, Cao H, Skolnick J. FRAGSITE: A Fragment-Based Approach for Virtual Ligand Screening. J Chem Inf Model 2021; 61:2074-2089. [PMID: 33724022 DOI: 10.1021/acs.jcim.0c01160] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 01/09/2023]
Abstract
To reduce time and cost, virtual ligand screening (VLS) often precedes experimental ligand screening in modern drug discovery. Traditionally, high-resolution structure-based docking approaches rely on experimental structures, while ligand-based approaches need known binders to the target protein and only explore their nearby chemical space. In contrast, our structure-based FINDSITEcomb2.0 approach takes advantage of predicted, low-resolution structures and information from ligands that bind distantly related proteins whose binding sites are similar to the target protein. Using a boosted tree regression machine learning framework, we significantly improved FINDSITEcomb2.0 by integrating ligand fragment scores as encoded by molecular fingerprints with the global ligand similarity scores of FINDSITEcomb2.0. The new approach, FRAGSITE, exploits our observation that ligand fragments, e.g., rings, tend to interact with stereochemically conserved protein subpockets that also occur in evolutionarily unrelated proteins. FRAGSITE was benchmarked on the 102 protein DUD-E set, where any template protein whose sequence identity >30% to the target was excluded. Within the top 100 ranked molecules, FRAGSITE improves VLS precision and recall by 14.3 and 18.5%, respectively, relative to FINDSITEcomb2.0. Moreover, the mean top 1% enrichment factor increases from 25.2 to 30.2. On average, both outperform state-of-the-art deep learning-based methods such as AtomNet. On the more challenging unbiased set LIT-PCBA, FRAGSITE also shows better performance than ligand similarity-based and docking approaches such as two-dimensional ECFP4 and Surflex-Dock v.3066. On a subset of 23 targets from DEKOIS 2.0, FRAGSITE shows much better performance than the boosted tree regression-based vScreenML scoring function. Experimental testing of FRAGSITE's predictions shows that it has more hits and covers a more diverse region of chemical space than FINDSITEcomb2.0. For the two proteins that were experimentally tested, DHFR, a well-studied protein that catalyzes the conversion of dihydrofolate to tetrahydrofolate, and the kinase ACVR1, FRAGSITE identified new small-molecule nanomolar binders. Interestingly, one new binder of DHFR is a kinase inhibitor predicted to bind in a new subpocket. For ACVR1, FRAGSITE identified new molecules that have diverse scaffolds and estimated nanomolar to micromolar affinities. Thus, FRAGSITE shows significant improvement over prior state-of-the-art ligand virtual screening approaches. A web server is freely available for academic users at http://sites.gatech.edu/cssb/FRAGSITE.
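The "top 1% enrichment factor" quoted in this abstract can be illustrated with a short sketch. The code below uses a commonly used definition (hit rate in the top-ranked fraction divided by the hit rate in the whole library); whether the paper uses exactly this variant is an assumption, and the function name `enrichment_factor` and the toy data are hypothetical.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a top fraction of a ranked screening library.

    scores: predicted scores, higher meaning more likely to be active.
    labels: 1 for a known active, 0 for a decoy (same order as scores).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("at least one active is required")
    # Rank the library from best to worst score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(fraction * len(ranked))))
    hits_top = sum(label for _, label in ranked[:n_top])
    # Ratio of the hit rate in the top slice to the overall hit rate.
    return (hits_top / n_top) / (total_actives / len(ranked))


if __name__ == "__main__":
    # Toy library of 10 molecules with 2 actives; evaluate the top 20%.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    print(enrichment_factor(toy_scores, toy_labels, fraction=0.2))
```

With this definition, a value of 1 corresponds to random ranking, so an increase in the mean value indicates that more actives are concentrated at the top of the ranked list.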
Affiliation(s)
- Hongyi Zhou
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332-2000, United States
- Hongnan Cao
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332-2000, United States
- Jeffrey Skolnick
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332-2000, United States
8
Paul L, Mudogo CN, Mtei KM, Machunda RL, Ntie-Kang F. A computer-based approach for developing linamarase inhibitory agents. PHYSICAL SCIENCES REVIEWS 2020. [DOI: 10.1515/psr-2019-0098] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Abstract
Cassava is a strategic crop, especially for developing countries. However, the presence of cyanogenic compounds in cassava products limits proper nutrient utilization. The poor availability of solved and elucidated structures in the Protein Data Bank limits the full understanding of the enzyme, how to inhibit it, and its applications in different fields. There is a need to solve the three-dimensional (3-D) structure of linamarase from cassava. The structural elucidation will allow the development of a competitive inhibitor and various industrial applications of the enzyme. The goal of this review is to summarize and present the available 3-D modeling structure of the linamarase enzyme using different computational strategies. This approach could help in determining the structure of linamarase and later guide the structure elucidation in silico and experimentally.
Affiliation(s)
- Lucas Paul
  - The Department of Materials and Energy Science & Engineering, The Nelson Mandela African Institution of Science and Technology, P.O. Box 447, Arusha, Tanzania
  - Department of Chemistry, Dar es Salaam University College of Education, P.O. Box 2329, 255 Dar es Salaam, Tanzania
- Celestin N. Mudogo
  - Institute of Biochemistry and Molecular Biology, University of Hamburg, Hamburg, Germany
  - Department of Basic Sciences, School of Medicine, University of Kinshasa, Kinshasa, Democratic Republic of the Congo
- Kelvin M. Mtei
  - The Department of Water and Environmental Science and Engineering, The Nelson Mandela African Institution of Science and Technology, P.O. Box 447, Arusha, Tanzania
- Revocatus L. Machunda
  - The Department of Water and Environmental Science and Engineering, The Nelson Mandela African Institution of Science and Technology, P.O. Box 447, Arusha, Tanzania
- Fidele Ntie-Kang
  - Department of Pharmaceutical Chemistry, Martin-Luther University Halle-Wittenberg, Wolfgang-Langenbeck Str. 4, Halle (Saale) 06120, Germany
  - Department of Informatics and Chemistry, University of Chemistry and Technology Prague, Technická 5, Prague 6, Dejvice 166 28, Czech Republic
  - Department of Chemistry, University of Buea, P.O. Box 63, Buea, Cameroon
9
Zhou H, Cao H, Skolnick J. FINDSITE comb2.0: A New Approach for Virtual Ligand Screening of Proteins and Virtual Target Screening of Biomolecules. J Chem Inf Model 2018; 58:2343-2354. [PMID: 30278128 PMCID: PMC6437778 DOI: 10.1021/acs.jcim.8b00309] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 12/20/2022]
Abstract
Computational approaches for predicting protein-ligand interactions can facilitate drug lead discovery and drug target determination. We have previously developed a threading/structural-based approach, FINDSITEcomb, for the virtual ligand screening of proteins that has been extensively experimentally validated. Even when low resolution predicted protein structures are employed, FINDSITEcomb has the advantage of being faster and more accurate than traditional high-resolution structure-based docking methods. It also overcomes the limitations of traditional QSAR methods that require a known set of seed ligands that bind to the given protein target. Here, we further improve FINDSITEcomb by enhancing its template ligand selection from the PDB/DrugBank/ChEMBL libraries of known protein-ligand interactions by (1) parsing the template proteins and their corresponding binding ligands in the DrugBank and ChEMBL libraries into domains so that the ligands with falsely matched domains to the targets will not be selected as template ligands; (2) applying various thresholds to filter out falsely matched template structures in the structure comparison process and thus their corresponding ligands for template ligand selection. With a sequence identity cutoff of 30% of target to templates and modeled target structures, FINDSITEcomb2.0 is shown to significantly improve upon FINDSITEcomb on the DUD-E benchmark set by increasing the 1% enrichment factor from 16.7 to 22.1, with a p-value of 4.3 × 10⁻³ by the Student t-test. With an 80% sequence identity cutoff of target to templates for the DUD-E set and modeled target structures, FINDSITEcomb2.0, having a 1% ROC enrichment factor of 52.39, also outperforms state-of-the-art methods that employ machine learning such as a deep convolutional neural network, CNN, with an enrichment of 29.65. Thus, FINDSITEcomb2.0 represents a significant improvement in the state-of-the-art. The FINDSITEcomb2.0 web service is freely available for academic users at http://pwp.gatech.edu/cssb/FINDSITE-COMB-2.
Affiliation(s)
- Hongyi Zhou
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, GA 30332-2000
- Hongnan Cao
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, GA 30332-2000
- Jeffrey Skolnick
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, GA 30332-2000
10
Inhibition of protein interactions: co-crystalized protein-protein interfaces are nearly as good as holo proteins in rigid-body ligand docking. J Comput Aided Mol Des 2018; 32:769-779. [PMID: 30003468 DOI: 10.1007/s10822-018-0124-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/03/2017] [Accepted: 05/22/2018] [Indexed: 12/15/2022]
Abstract
Modulating protein interaction pathways may lead to the cure of many diseases. Known protein-protein inhibitors bind to large pockets on the protein-protein interface. Such large pockets are also detected in protein-protein complexes without known inhibitors, making such complexes potentially druggable. The inhibitor-binding site is primarily defined by the side chains that form the largest pocket in the protein-bound conformation. Low-resolution ligand docking shows that the success rate for the protein-bound conformation is close to the one for the ligand-bound conformation, and significantly higher than for the apo conformation. The conformational change on the protein interface upon binding to the other protein results in a pocket employed by the ligand when it binds to that interface. This proof-of-concept study suggests that rather than using computational pocket-opening procedures, one can opt for an experimentally determined structure of the target co-crystallized protein-protein complex as a starting point for drug design.
11
Assessing the similarity of ligand binding conformations with the Contact Mode Score. Comput Biol Chem 2016; 64:403-413. [PMID: 27620381 DOI: 10.1016/j.compbiolchem.2016.08.007] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 05/20/2016] [Revised: 08/17/2016] [Accepted: 08/25/2016] [Indexed: 11/22/2022]
Abstract
Structural and computational biologists often need to measure the similarity of ligand binding conformations. The commonly used root-mean-square deviation (RMSD) is not only ligand-size dependent, but also may fail to capture biologically meaningful binding features. To address these issues, we developed the Contact Mode Score (CMS), a new metric to assess the conformational similarity based on intermolecular protein-ligand contacts. The CMS is less dependent on the ligand size and has the ability to include flexible receptors. In order to effectively compare binding poses of non-identical ligands bound to different proteins, we further developed the eXtended Contact Mode Score (XCMS). We believe that CMS and XCMS provide a meaningful assessment of the similarity of ligand binding conformations. CMS and XCMS are freely available at http://brylinski.cct.lsu.edu/content/contact-mode-score and http://geaux-computational-bio.github.io/contact-mode-score/.
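To make the contrast between RMSD and a contact-based comparison concrete, here is a small sketch. The RMSD function is the standard definition for two poses with matched atoms and no superposition. The contact comparison is a Jaccard overlap of contact sets, used here only as an illustrative stand-in; it is not the published CMS formula, and the 4.5 Å cutoff, the function names, and the toy coordinates are all assumptions.

```python
import math


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with matched atom order."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared = sum((a - b) ** 2
                  for atom_a, atom_b in zip(pose_a, pose_b)
                  for a, b in zip(atom_a, atom_b))
    return math.sqrt(squared / len(pose_a))


def contacts(ligand, protein, cutoff=4.5):
    """Set of (ligand atom index, protein atom index) pairs within the cutoff."""
    return {(i, j)
            for i, lig_atom in enumerate(ligand)
            for j, prot_atom in enumerate(protein)
            if math.dist(lig_atom, prot_atom) <= cutoff}


def contact_overlap(ligand_a, ligand_b, protein, cutoff=4.5):
    """Jaccard overlap of two contact sets (illustrative, not the CMS itself)."""
    set_a = contacts(ligand_a, protein, cutoff)
    set_b = contacts(ligand_b, protein, cutoff)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


if __name__ == "__main__":
    # Hypothetical one-atom ligand near either end of a two-atom "protein".
    protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    pose_1, pose_2 = [(1.0, 0.0, 0.0)], [(9.0, 0.0, 0.0)]
    print(rmsd(pose_1, pose_2), contact_overlap(pose_1, pose_2, protein))
```

The RMSD depends only on ligand coordinates, while the contact comparison depends on which receptor atoms each pose touches, which is the kind of information the abstract says RMSD can miss.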
12
Fang Y, Ding Y, Feinstein WP, Koppelman DM, Moreno J, Jarrell M, Ramanujam J, Brylinski M. GeauxDock: Accelerating Structure-Based Virtual Screening with Heterogeneous Computing. PLoS One 2016; 11:e0158898. [PMID: 27420300 PMCID: PMC4946785 DOI: 10.1371/journal.pone.0158898] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 02/01/2016] [Accepted: 06/23/2016] [Indexed: 12/19/2022] Open
Abstract
Computational modeling of drug binding to proteins is an integral component of direct drug design. Particularly, structure-based virtual screening is often used to perform large-scale modeling of putative associations between small organic molecules and their pharmacologically relevant protein targets. Because of a large number of drug candidates to be evaluated, an accurate and fast docking engine is a critical element of virtual screening. Consequently, highly optimized docking codes are of paramount importance for the effectiveness of virtual screening methods. In this communication, we describe the implementation, tuning and performance characteristics of GeauxDock, a recently developed molecular docking program. GeauxDock is built upon the Monte Carlo algorithm and features a novel scoring function combining physics-based energy terms with statistical and knowledge-based potentials. Developed specifically for heterogeneous computing platforms, the current version of GeauxDock can be deployed on modern, multi-core Central Processing Units (CPUs) as well as massively parallel accelerators, Intel Xeon Phi and NVIDIA Graphics Processing Unit (GPU). First, we carried out a thorough performance tuning of the high-level framework and the docking kernel to produce a fast serial code, which was then ported to shared-memory multi-core CPUs yielding a near-ideal scaling. Further, using Xeon Phi gives 1.9× performance improvement over a dual 10-core Xeon CPU, whereas the best GPU accelerator, GeForce GTX 980, achieves a speedup as high as 3.5×. On that account, GeauxDock can take advantage of modern heterogeneous architectures to considerably accelerate structure-based virtual screening applications. GeauxDock is open-sourced and publicly available at www.brylinski.org/geauxdock and https://figshare.com/articles/geauxdock_tar_gz/3205249.
Affiliation(s)
- Ye Fang
  - School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Yun Ding
  - Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Wei P. Feinstein
  - High-Performance Computing, Louisiana State University, Baton Rouge, Louisiana, United States of America
- David M. Koppelman
  - School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Juana Moreno
  - Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Mark Jarrell
  - Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
- J. Ramanujam
  - School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Michal Brylinski
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
13
Ding Y, Fang Y, Feinstein WP, Ramanujam J, Koppelman DM, Moreno J, Brylinski M, Jarrell M. GeauxDock: A novel approach for mixed-resolution ligand docking using a descriptor-based force field. J Comput Chem 2015; 36:2013-26. [PMID: 26250822 DOI: 10.1002/jcc.24031] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/10/2015] [Revised: 06/07/2015] [Accepted: 07/03/2015] [Indexed: 12/26/2022]
Abstract
Molecular docking is an important component of computer-aided drug discovery. In this communication, we describe GeauxDock, a new docking approach that builds on the ideas of ligand homology modeling. GeauxDock features a descriptor-based scoring function integrating evolutionary constraints with physics-based energy terms, a mixed-resolution molecular representation of protein-ligand complexes, and an efficient Monte Carlo sampling protocol. To drive docking simulations toward experimental conformations, the scoring function was carefully optimized to produce a correlation between the total pseudoenergy and the native-likeness of binding poses. Indeed, benchmarking calculations demonstrate that GeauxDock has a strong capacity to identify near-native conformations across docking trajectories with the area under receiver operating characteristics of 0.85. By excluding closely related templates, we show that GeauxDock maintains its accuracy at lower levels of homology through the increased contribution from physics-based energy terms compensating for weak evolutionary constraints. GeauxDock is available at http://www.institute.loni.org/lasigma/package/dock/.
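The area under the receiver operating characteristic curve quoted in this abstract can be illustrated with a short sketch. It is not GeauxDock's code: the function name `roc_auc`, the toy energies and the near-native labels are all hypothetical, and the sketch assumes that a lower pseudoenergy should indicate a more native-like pose, so the negated energy is used as the ranking score.

```python
def roc_auc(scores, labels):
    """Rank-based ROC AUC: the probability that a randomly chosen positive
    (near-native pose) scores higher than a randomly chosen negative.
    Ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical pseudoenergies for four poses (assumption: lower is better),
# and hypothetical labels marking which poses are near-native.
energies = [-9.0, -8.0, -7.0, -6.0]
near_native = [True, False, True, False]
auc = roc_auc([-e for e in energies], near_native)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of near-native from non-native poses.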
Affiliation(s)
- Yun Ding
- Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Ye Fang
- School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, 70803
- Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Wei P Feinstein
- Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Jagannathan Ramanujam
- School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, 70803
- Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - David M Koppelman
- School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Juana Moreno
- Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, 70803
- Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Michal Brylinski
- Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
- Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Mark Jarrell
- Department of Physics and Astronomy, Louisiana State University, Baton Rouge, Louisiana, 70803
- Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, 70803
|
14
|
Brylinski M. Nonlinear Scoring Functions for Similarity-Based Ligand Docking and Binding Affinity Prediction. J Chem Inf Model 2013; 53:3097-112. [DOI: 10.1021/ci400510e] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Indexed: 01/17/2023]
Affiliation(s)
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803, United States
- Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana 70803, United States
|
15
|
Bello M, Martínez-Archundia M, Correa-Basurto J. Automated docking for novel drug discovery. Expert Opin Drug Discov 2013; 8:821-34. [DOI: 10.1517/17460441.2013.794780] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Indexed: 01/20/2023]
|
16
|
Skolnick J, Zhou H, Gao M. Are predicted protein structures of any value for binding site prediction and virtual ligand screening? Curr Opin Struct Biol 2013; 23:191-7. [PMID: 23415854 DOI: 10.1016/j.sbi.2013.01.009] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 12/19/2012] [Revised: 01/04/2013] [Accepted: 01/23/2013] [Indexed: 01/03/2023]
Abstract
The recently developed field of ligand homology modeling (LHM) that extends the ideas of protein homology modeling to the prediction of ligand binding sites and for use in virtual ligand screening has emerged as a powerful new approach. Unlike traditional docking methodologies, LHM can be applied to low-to-moderate resolution predicted as well as experimental structures with little if any diminution in performance; thereby enabling ≈ 75% of an average proteome to have potentially significant virtual screening predictions. In large scale benchmarking, LHM is able to predict off-target ligand binding. Thus, despite the widespread belief to the contrary, low-to-moderate resolution predicted structures have considerable utility for biochemical function prediction.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA.
|
17
|
Zhou H, Skolnick J. FINDSITE(comb): a threading/structure-based, proteomic-scale virtual ligand screening approach. J Chem Inf Model 2012; 53:230-40. [PMID: 23240691 DOI: 10.1021/ci300510n] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 12/15/2022]
Abstract
Virtual ligand screening is an integral part of the modern drug discovery process. Traditional ligand-based, virtual screening approaches are fast but require a set of structurally diverse ligands known to bind to the target. Traditional structure-based approaches require high-resolution target protein structures and are computationally demanding. In contrast, the recently developed threading/structure-based FINDSITE-based approaches have the advantage that they are as fast as traditional ligand-based approaches and yet overcome the limitations of traditional ligand- or structure-based approaches. These new methods can use predicted low-resolution structures and infer the likelihood of a ligand binding to a target by utilizing ligand information excised from the target's remote or close homologous proteins and/or libraries of ligand binding databases. Here, we develop an improved version of FINDSITE, FINDSITE(filt), that filters out false positive ligands in threading identified templates by a better binding site detection procedure that includes information about the binding site amino acid similarity. We then combine FINDSITE(filt) with FINDSITE(X) that uses publicly available binding databases ChEMBL and DrugBank for virtual ligand screening. The combined approach, FINDSITE(comb), is compared to two traditional docking methods, AUTODOCK Vina and DOCK 6, on the DUD benchmark set. It is shown to be significantly better in terms of enrichment factor, dependence on target structure quality, and speed. FINDSITE(comb) is then tested for virtual ligand screening on a large set of 3576 generic targets from the DrugBank database as well as a set of 168 Human GPCRs. Excluding close homologues, FINDSITE(comb) gives an average enrichment factor of 52.1 for generic targets and 22.3 for GPCRs within the top 1% of the screened compound library. Around 65% of the targets have better than random enrichment factors. 
The performance is insensitive to target structure quality, as long as it has a TM-score ≥ 0.4 to native. Thus, FINDSITE(comb) makes the screening of millions of compounds across entire proteomes feasible. The FINDSITE(comb) web service is freely available for academic users at http://cssb.biology.gatech.edu/skolnick/webservice/FINDSITE-COMB/index.html.
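The enrichment factor reported in this abstract follows the standard virtual-screening definition: the fraction of actives found in the top-ranked slice of the library, divided by the fraction of actives in the whole library. The sketch below illustrates that definition only. It is not FINDSITE(comb) code, the function name and the toy library are hypothetical, and it assumes that a higher score means a more likely binder.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """EF = (actives in top slice / size of top slice)
            / (actives in library / size of library)."""
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(1 for a in is_active if a)
    if n_actives == 0:
        raise ValueError("the library contains no actives")
    # Size of the top-ranked slice, at least one compound.
    n_top = max(1, int(round(n_total * fraction)))
    # Assumption: higher score = better rank.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits = sum(1 for _, active in ranked[:n_top] if active)
    return (hits / n_top) / (n_actives / n_total)


# Hypothetical library of 1000 compounds with 10 actives,
# 5 of which are ranked within the top 1% (the top 10 compounds).
scores = [1000 - i for i in range(1000)]
active_idx = {0, 1, 2, 3, 4, 500, 501, 502, 503, 504}
is_active = [i in active_idx for i in range(1000)]
ef_top1 = enrichment_factor(scores, is_active, fraction=0.01)
```

With this definition a random ranking gives an EF near 1, and the maximum attainable EF at the top 1% is 100.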
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street, N.W., Atlanta, Georgia 30318, USA
|
18
|
Kaufmann KW, Meiler J. Using RosettaLigand for small molecule docking into comparative models. PLoS One 2012; 7:e50769. [PMID: 23239984 PMCID: PMC3519832 DOI: 10.1371/journal.pone.0050769] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 07/30/2012] [Accepted: 10/24/2012] [Indexed: 11/18/2022] [Open Access]
Abstract
Computational small molecule docking into comparative models of proteins is widely used to query protein function and in the development of small molecule therapeutics. We benchmark RosettaLigand docking into comparative models for nine proteins built during CASP8 that contain ligands. We supplement the study with 21 additional protein/ligand complexes to cover a wider space of chemotypes. In 21 of the 30 cases, a full docking run with RosettaLigand successfully found a native-like binding mode among the top ten scoring binding modes. From the benchmark cases we find that careful template selection based on ligand occupancy provides the best chance of success, while overall sequence identity between template and target does not appear to improve results. We also find that binding energy normalized by atom number is often less than -0.4 in native-like binding modes.
Affiliation(s)
- Kristian W. Kaufmann
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Institute of Chemical Biology, Vanderbilt University, Nashville, Tennessee, United States of America
|
19
|
Carbajo D, Tramontano A. A resource for benchmarking the usefulness of protein structure models. BMC Bioinformatics 2012; 13:188. [PMID: 22856649 PMCID: PMC3473236 DOI: 10.1186/1471-2105-13-188] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/19/2012] [Accepted: 07/16/2012] [Indexed: 01/13/2023] [Open Access]
Abstract
Background: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However, it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications. Results: This paper describes a database and related software tools that allow testing of a given structure-based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess the specific threshold of accuracy required to perform the task effectively. Conclusions: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. Implementation, availability and requirements: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by non-academics: No.
Affiliation(s)
- Daniel Carbajo
- Department of Physics, Sapienza University of Rome, P.le A. Moro, 5, 00185 Rome, Italy
|
20
|
Zhou H, Skolnick J. FINDSITE(X): a structure-based, small molecule virtual screening approach with application to all identified human GPCRs. Mol Pharm 2012; 9:1775-84. [PMID: 22574683 DOI: 10.1021/mp3000716] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 12/16/2022]
Abstract
We have developed FINDSITE(X), an extension of FINDSITE, a protein threading based algorithm for the inference of protein binding sites, biochemical function and virtual ligand screening, that removes the limitation that holo protein structures (those containing bound ligands) of a sufficiently large set of distant evolutionarily related proteins to the target be solved; rather, predicted protein structures and experimental ligand binding information are employed. To provide the predicted protein structures, a fast and accurate version of our recently developed TASSER(VMT), TASSER(VMT)-lite, for template-based protein structural modeling applicable up to 1000 residues is developed and tested, with comparable performance to the top CASP9 servers. Then, a hybrid approach that combines structure alignments with an evolutionary similarity score for identifying functional relationships between target and proteins with binding data has been developed. By way of illustration, FINDSITE(X) is applied to 998 identified human G-protein coupled receptors (GPCRs). First, TASSER(VMT)-lite provides updates of all human GPCR structures previously modeled in our lab. We then use these structures and the new function similarity detection algorithm to screen all human GPCRs against the ZINC8 nonredundant (TC < 0.7) ligand set combined with ligands from the GLIDA database (a total of 88,949 compounds). Testing (excluding GPCRs whose sequence identity > 30% to the target from the binding data library) on a 168 human GPCR set with known binding data, the average enrichment factor in the top 1% of the compound library (EF(0.01)) is 22.7, whereas EF(0.01) by FINDSITE is 7.1. For virtual screening when just the target and its native ligands are excluded, the average EF(0.01) reaches 41.4. We also analyze off-target interactions for the 168 protein test set. 
All predicted structures, virtual screening data and off-target interactions for the 998 human GPCRs are available at http://cssb.biology.gatech.edu/skolnick/webservice/gpcr/index.html .
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street, N.W., Atlanta, Georgia 30318, United States
|
21
|
Kufareva I, Rueda M, Katritch V, Stevens RC, Abagyan R. Status of GPCR modeling and docking as reflected by community-wide GPCR Dock 2010 assessment. Structure 2011; 19:1108-26. [PMID: 21827947 DOI: 10.1016/j.str.2011.05.012] [Citation(s) in RCA: 228] [Impact Index Per Article: 17.5] [Received: 03/02/2011] [Revised: 05/24/2011] [Accepted: 05/28/2011] [Indexed: 12/19/2022]
Abstract
The community-wide GPCR Dock assessment is conducted to evaluate the status of molecular modeling and ligand docking for human G protein-coupled receptors. The present round of the assessment was based on the recent structures of dopamine D3 and CXCR4 chemokine receptors bound to small molecule antagonists and CXCR4 with a synthetic cyclopeptide. Thirty-five groups submitted their receptor-ligand complex structure predictions prior to the release of the crystallographic coordinates. With closely related homology modeling templates, as for dopamine D3 receptor, and with incorporation of biochemical and QSAR data, modern computational techniques predicted complex details with accuracy approaching experimental. In contrast, CXCR4 complexes that had less-characterized interactions and only distant homology to the known GPCR structures still remained very challenging. The assessment results provide guidance for modeling and crystallographic communities in method development and target selection for further expansion of the structural coverage of the GPCR universe.
Affiliation(s)
- Irina Kufareva
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92039, USA
|
22
|
Menon R, Roy A, Mukherjee S, Belkin S, Zhang Y, Omenn GS. Functional implications of structural predictions for alternative splice proteins expressed in Her2/neu-induced breast cancers. J Proteome Res 2011; 10:5503-11. [PMID: 22003824 DOI: 10.1021/pr200772w] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 12/22/2022]
Abstract
Alternative splicing allows a single gene to generate multiple mRNA transcripts, which can be translated into functionally diverse proteins. However, experimentally determined structures of protein splice isoforms are rare, and homology modeling methods are poor at predicting atomic-level structural differences because of high sequence identity. Here we exploit the state-of-the-art structure prediction method I-TASSER to analyze the structural and functional consequences of alternative splicing of proteins differentially expressed in a breast cancer model. We first successfully benchmarked the I-TASSER pipeline for structure modeling of all seven pairs of protein splice isoforms, which are known to have experimentally solved structures. We then modeled three cancer-related variant pairs reported to have opposite functions. In each pair, we observed structural differences in regions where the presence or absence of a motif can directly influence the distinctive functions of the variants. Finally, we applied the method to five splice variants overexpressed in mouse Her2/neu mammary tumor: anxa6, calu, cdc42, ptbp1, and tax1bp3. Despite >75% sequence identity between the variants, structural differences were observed in biologically important regions of these protein pairs. These results demonstrate the feasibility of integrating proteomic analysis with structure-based conformational predictions of differentially expressed alternative splice variants in cancers and other conditions.
Affiliation(s)
- Rajasree Menon
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States.
|
23
|
Brylinski M, Skolnick J. Comprehensive structural and functional characterization of the human kinome by protein structure modeling and ligand virtual screening. J Chem Inf Model 2011; 50:1839-54. [PMID: 20853887 DOI: 10.1021/ci100235n] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 12/12/2022]
Abstract
The growing interest in the identification of kinase inhibitors, promising therapeutics in the treatment of many diseases, has created a demand for the structural characterization of the entire human kinome. At the outset of the drug development process, the lead-finding stage, approaches that enrich the screening library with bioactive compounds are needed. Here, protein structure based methods can play an important role, but despite structural genomics efforts, it is unlikely that the three-dimensional structures of the entire kinome will be available soon. Therefore, at the proteome level, structure-based approaches must rely on predicted models, with a key issue being their utility in virtual ligand screening. In this study, we employ the recently developed FINDSITE/Q-Dock ligand homology modeling approach, which is well-suited for proteome-scale applications using predicted structures, to provide extensive structural and functional characterization of the human kinome. Specifically, we construct structure models for the human kinome; these are subsequently subject to virtual screening against a library of more than 2 million compounds. To rank the compounds, we employ a hierarchical approach that combines ligand- and structure-based filters. Modeling accuracy is carefully validated using available experimental data with particularly encouraging results found for the ability to identify, without prior knowledge, specific kinase inhibitors. More generally, the modeling procedure results in a large number of predicted molecular interactions between kinases and small ligands that should be of practical use in the development of novel inhibitors. The data set is freely available to the academic community via a user-friendly Web interface at http://cssb.biology.gatech.edu/kinomelhm/ as well as at the ZINC Web site ( http://zinc.docking.org/applications/2010Apr/Brylinski-2010.tar.gz ).
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
|
24
|
Brylinski M, Skolnick J. Cross-reactivity virtual profiling of the human kinome by X-react(KIN): a chemical systems biology approach. Mol Pharm 2010; 7:2324-33. [PMID: 20958088 DOI: 10.1021/mp1002976] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 12/13/2022]
Abstract
Many drug candidates fail in clinical development due to their insufficient selectivity that may cause undesired side effects. Therefore, modern drug discovery is routinely supported by computational techniques, which can identify alternate molecular targets with a significant potential for cross-reactivity. In particular, the development of highly selective kinase inhibitors is complicated by the strong conservation of the ATP-binding site across the kinase family. In this paper, we describe X-React(KIN), a new machine learning approach that extends the modeling and virtual screening of individual protein kinases to a system level in order to construct a cross-reactivity virtual profile for the human kinome. To maximize the coverage of the kinome, X-React(KIN) relies solely on the predicted target structures and employs state-of-the-art modeling techniques. Benchmark tests carried out against available selectivity data from high-throughput kinase profiling experiments demonstrate that, for almost 70% of the inhibitors, their alternate molecular targets can be effectively identified in the human kinome with a high (>0.5) sensitivity at the expense of a relatively low false positive rate (<0.5). Furthermore, in a case study, we demonstrate how X-React(KIN) can support the development of selective inhibitors by optimizing the selection of kinase targets for small-scale counter-screen experiments. The constructed cross-reactivity profiles for the human kinome are freely available to the academic community at http://cssb.biology.gatech.edu/kinomelhm/ .
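The sensitivity and false positive rate thresholds mentioned in this abstract are standard confusion-matrix quantities. The sketch below shows how they could be computed for one inhibitor's predicted versus experimentally observed kinase targets. It is an illustration under stated assumptions, not X-React(KIN) code: the function name and the six-kinase toy data are hypothetical.

```python
def sensitivity_and_fpr(predicted, actual):
    """Return (sensitivity, false positive rate) for paired boolean calls.

    sensitivity = TP / (TP + FN); false positive rate = FP / (FP + TN).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one true target and one non-target")
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical profile of one inhibitor across six kinases:
# which kinases were predicted as targets, and which were observed as targets.
predicted_targets = [True, True, False, False, True, False]
observed_targets = [True, False, True, False, True, False]
sens, fpr = sensitivity_and_fpr(predicted_targets, observed_targets)
```

In this toy case two of three observed targets are recovered and one of three non-targets is wrongly flagged, so the profile would meet both thresholds quoted in the abstract (sensitivity above 0.5, false positive rate below 0.5).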
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia, USA
|
25
|
Brylinski M, Lee SY, Zhou H, Skolnick J. The utility of geometrical and chemical restraint information extracted from predicted ligand-binding sites in protein structure refinement. J Struct Biol 2010; 173:558-69. [PMID: 20850544 DOI: 10.1016/j.jsb.2010.09.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/13/2010] [Revised: 09/08/2010] [Accepted: 09/10/2010] [Indexed: 01/01/2023]
Abstract
Exhaustive exploration of molecular interactions at the level of complete proteomes requires efficient and reliable computational approaches to protein function inference. Ligand docking and ranking techniques show considerable promise in their ability to quantify the interactions between proteins and small molecules. Despite the advances in the development of docking approaches and scoring functions, the genome-wide application of many ligand docking/screening algorithms is limited by the quality of the binding sites in theoretical receptor models constructed by protein structure prediction. In this study, we describe a new template-based method for the local refinement of ligand-binding regions in protein models using remotely related templates identified by threading. We designed a Support Vector Regression (SVR) model that selects correct binding site geometries in a large ensemble of multiple receptor conformations. The SVR model employs several scoring functions that impose geometrical restraints on the Cα positions, account for the specific chemical environment within a binding site and optimize the interactions with putative ligands. The SVR score is well correlated with the RMSD from the native structure; in 47% (70%) of the cases, the Pearson's correlation coefficient is >0.5 (>0.3). When applied to weakly homologous models, the average heavy atom, local RMSD from the native structure of the top-ranked (best of top five) binding site geometries is 3.1Å (2.9Å) for roughly half of the targets; this represents a 0.1 (0.3)Å average improvement over the original predicted structure. Focusing on the subset of strongly conserved residues, the average heavy atom RMSD is 2.6Å (2.3Å). Furthermore, we estimate the upper bound of template-based binding site refinement using only weakly related proteins to be ∼2.6Å RMSD. This value also corresponds to the plasticity of the ligand-binding regions in distant homologues. 
The Binding Site Refinement (BSR) approach is available to the scientific community as a web server that can be accessed at http://cssb.biology.gatech.edu/bsr/.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, Georgia Institute of Technology, Atlanta, GA 30318, USA
|
26
|
Huang SY, Zou X. Advances and challenges in protein-ligand docking. Int J Mol Sci 2010; 11:3016-34. [PMID: 21152288 PMCID: PMC2996748 DOI: 10.3390/ijms11083016] [Citation(s) in RCA: 298] [Impact Index Per Article: 21.3] [Received: 07/30/2010] [Revised: 08/09/2010] [Accepted: 08/10/2010] [Indexed: 02/04/2023] [Open Access]
Abstract
Molecular docking is a widely-used computational tool for the study of molecular recognition, which aims to predict the binding mode and binding affinity of a complex formed by two or more constituent molecules with known structures. An important type of molecular docking is protein-ligand docking because of its therapeutic applications in modern structure-based drug design. Here, we review the recent advances of protein flexibility, ligand sampling, and scoring functions—the three important aspects in protein-ligand docking. Challenges and possible future directions are discussed in the Conclusion.
Affiliation(s)
- Sheng-You Huang
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, MO 65211, USA
- Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211, USA
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
- Informatics Institute, University of Missouri, Columbia, MO 65211, USA
| | - Xiaoqin Zou
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, MO 65211, USA
- Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211, USA
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
- Informatics Institute, University of Missouri, Columbia, MO 65211, USA
- *Author to whom correspondence should be addressed; Tel.: +1-573-882-6045; Fax: +1-573-884-4232
|
27
|
Vorobjev YN. Blind docking method combining search of low-resolution binding sites with ligand pose refinement by molecular dynamics-based global optimization. J Comput Chem 2010; 31:1080-92. [PMID: 19821514 DOI: 10.1002/jcc.21394] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 12/28/2022]
Abstract
This study describes the development of a new blind hierarchical docking method, bhDock, its implementation, and accuracy assessment. The bhDock method uses a two-step algorithm. First, a comprehensive set of low-resolution binding sites is determined by analyzing the entire protein surface and ranked by a simple score function. Second, the ligand position is determined via a molecular dynamics-based method of global optimization starting from a small set of high-ranked low-resolution binding sites. The refinement of the ligand binding pose starts from uniformly distributed multiple initial ligand orientations and uses simulated annealing molecular dynamics coupled with guided force-field deformation of protein-ligand interactions to find the global minimum. Assessment of the bhDock method on a set of 37 protein-ligand complexes gave a prediction success rate of 78%, which is better than the rate reported for the most cited docking methods, such as AutoDock, DOCK, GOLD, and FlexX, on the same set of complexes.
Affiliation(s)
- Yury N Vorobjev
- Institute of Chemical Biology and Fundamental Medicine of the Siberian Branch of the Russian Academy of Science, Novosibirsk, Russia.
|
28
|
Pierri CL, Parisi G, Porcelli V. Computational approaches for protein function prediction: a combined strategy from multiple sequence alignment to molecular docking-based virtual screening. Biochimica et Biophysica Acta - Proteins and Proteomics 2010; 1804:1695-712. [PMID: 20433957 DOI: 10.1016/j.bbapap.2010.04.008] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Received: 01/19/2010] [Revised: 03/04/2010] [Accepted: 04/14/2010] [Indexed: 12/12/2022]
Abstract
The functional characterization of proteins represents a daily challenge for biochemical, medical and computational sciences. Although finally proved on the bench, the function of a protein can be successfully predicted by computational approaches that drive the further experimental assays. Current methods for comparative modeling allow the construction of accurate 3D models for proteins of unknown structure, provided that a crystal structure of a homologous protein is available. Binding regions can be proposed by using binding site predictors, data inferred from homologous crystal structures, and data provided from a careful interpretation of the multiple sequence alignment of the investigated protein and its homologs. Once the location of a binding site has been proposed, chemical ligands that have a high likelihood of binding can be identified by using ligand docking and structure-based virtual screening of chemical libraries. Most docking algorithms allow building a list sorted by energy of the lowest energy docking configuration for each ligand of the library. In this review the state-of-the-art of computational approaches in 3D protein comparative modeling and in the study of protein-ligand interactions is provided. Furthermore a possible combined/concerted multistep strategy for protein function prediction, based on multiple sequence alignment, comparative modeling, binding region prediction, and structure-based virtual screening of chemical libraries, is described by using suitable examples. As practical examples, Abl-kinase molecular modeling studies, HPV-E6 protein multiple sequence alignment analysis, and some other model docking-based characterization reports are briefly described to highlight the importance of computational approaches in protein function prediction.
Affiliation(s)
- Ciro Leonardo Pierri
- Department of Pharmaco-Biology, Laboratory of Biochemistry and Molecular Biology, University of Bari, Via E. Orabona, 4 - 70125 Bari, Italy.
29
Abstract
The success of ligand docking calculations typically depends on the quality of the receptor structure. Given improvements in protein structure prediction approaches, approximate protein models now can be routinely obtained for the majority of gene products in a given proteome. Structure-based virtual screening of large combinatorial libraries of lead candidates against theoretically modeled receptor structures requires fast and reliable docking techniques capable of dealing with structural inaccuracies in protein models. Here, we present Q-Dock(LHM), a method for low-resolution refinement of binding poses provided by FINDSITE(LHM), a ligand homology modeling approach. We compare its performance to that of classical ligand docking approaches in ligand docking against a representative set of experimental (both holo and apo) as well as theoretically modeled receptor structures. Docking benchmarks reveal that unlike all-atom docking, Q-Dock(LHM) exhibits the desired tolerance to the receptor's structure deformation. Our results suggest that the use of an evolution-based approach to ligand homology modeling followed by fast low-resolution refinement is capable of achieving satisfactory performance in ligand-binding pose prediction with promising applicability to proteome-scale applications.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318
30
Kundrotas PJ, Vakser IA. Accuracy of protein-protein binding sites in high-throughput template-based modeling. PLoS Comput Biol 2010; 6:e1000727. [PMID: 20369011 PMCID: PMC2848539 DOI: 10.1371/journal.pcbi.1000727] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 05/26/2009] [Accepted: 03/01/2010] [Indexed: 11/18/2022]
Abstract
The accuracy of protein structures, particularly their binding sites, is essential for the success of modeling protein complexes. Computationally inexpensive methodology is required for genome-wide modeling of such structures. For systematic evaluation of potential accuracy in high-throughput modeling of binding sites, a statistical analysis of target-template sequence alignments was performed for a representative set of protein complexes. For most of the complexes, alignments containing all residues of the interface were found. The full interface alignments were obtained even in the case of poor alignments where a relatively small part of the target sequence (as low as 40%) aligned to the template sequence, with a low overall alignment identity (<30%). Although such poor overall alignments might be considered inadequate for modeling of whole proteins, the alignment of the interfaces was strong enough for docking. In the set of homology models built on these alignments, one third of those ranked 1 by a simple sequence identity criterion had RMSD<5 Å, the accuracy suitable for low-resolution template free docking. Such models corresponded to multi-domain target proteins, whereas for single-domain proteins the best models had 5 Å<RMSD<10 Å, the accuracy suitable for less sensitive structure-alignment methods. Overall, ∼50% of complexes with the interfaces modeled by high-throughput techniques had accuracy suitable for meaningful docking experiments. This percentage will grow with the increasing availability of co-crystallized protein-protein complexes.
Protein-protein interactions play a central role in life processes at the molecular level. The structural information on these interactions is essential for our understanding of these processes and our ability to design drugs to cure diseases. Limitations of experimental techniques to determine the structure of protein-protein complexes leave the vast majority of these complexes to be determined by computational modeling. The modeling is also important for revealing the mechanisms of the complex formation. The 3D modeling of protein complexes (protein docking) relies on the structure of the individual proteins for the prediction of their assembly. Thus the structural accuracy of the individual proteins, which often are models themselves, is critical for the docking. For the docking purposes, the accuracy of the binding sites is obviously essential, whereas the accuracy of the non-binding regions is less critical. In our study, we systematically analyze the accuracy of the binding sites in protein models produced by high-throughput techniques suitable for large-scale (e.g., genome-wide) studies. The results indicate that this accuracy is adequate for the low- to medium-resolution docking of a significant part of known protein-protein complexes.
Affiliation(s)
- Petras J. Kundrotas
- Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
- Ilya A. Vakser
- Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
31
Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc 2010; 5:725-38. [PMID: 20360767 PMCID: PMC2849174 DOI: 10.1038/nprot.2010.5] [Citation(s) in RCA: 4827] [Impact Index Per Article: 344.8] [Indexed: 11/09/2022]
Abstract
The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing of online server systems for the state-of-the-art protein structure and function predictions. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
Affiliation(s)
- Ambrish Roy
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
- Alper Kucukural
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
- Yang Zhang
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
32
Zobnina V, Roterman I. Application of the fuzzy-oil-drop model to membrane protein simulation. Proteins 2009; 77:378-94. [PMID: 19455711 DOI: 10.1002/prot.22443] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
Abstract
The analysis of the structural properties and biological activity of membrane proteins requires long molecular dynamics simulations. The large number of atoms in the protein molecule, the membrane (phospholipids), and the water environment makes these simulations large in scale. An implementation of a simplified model representing the natural environment of membrane proteins is presented and compared with vacuum simulation and with simulation in which water molecules and membrane phospholipids are represented explicitly. The comparative structural analysis and computational times for these three models make the simplified model promising.
Affiliation(s)
- Veronica Zobnina
- Department of Bioinformatics and Telemedicine, Collegium Medicum-Jagiellonian University, Krakow, Poland
33
Brylinski M, Skolnick J. FINDSITE: a threading-based approach to ligand homology modeling. PLoS Comput Biol 2009; 5:e1000405. [PMID: 19503616 PMCID: PMC2685473 DOI: 10.1371/journal.pcbi.1000405] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Received: 01/12/2009] [Accepted: 05/05/2009] [Indexed: 11/19/2022]
Abstract
Ligand virtual screening is a widely used tool to assist in new pharmaceutical discovery. In practice, virtual screening approaches have a number of limitations, and the development of new methodologies is required. Previously, we showed that remotely related proteins identified by threading often share a common binding site occupied by chemically similar ligands. Here, we demonstrate that across an evolutionarily related, but distant family of proteins, the ligands that bind to the common binding site contain a set of strongly conserved anchor functional groups as well as a variable region that accounts for their binding specificity. Furthermore, the sequence and structure conservation of residues contacting the anchor functional groups is significantly higher than those contacting ligand variable regions. Exploiting these insights, we developed FINDSITE(LHM) that employs structural information extracted from weakly related proteins to perform rapid ligand docking by homology modeling. In large scale benchmarking, using the predicted anchor-binding mode and the crystal structure of the receptor, FINDSITE(LHM) outperforms classical docking approaches with an average ligand RMSD from native of approximately 2.5 Å. For weakly homologous receptor protein models, using FINDSITE(LHM), the fraction of recovered binding residues and specific contacts is 0.66 (0.55) and 0.49 (0.38) for highly confident (all) targets, respectively. Finally, in virtual screening for HIV-1 protease inhibitors, using similarity to the ligand anchor region yields significantly improved enrichment factors. Thus, the rather accurate, computationally inexpensive FINDSITE(LHM) algorithm should be a useful approach to assist in the discovery of novel biopharmaceuticals.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
34
Zhang Y. Protein structure prediction: when is it useful? Curr Opin Struct Biol 2009; 19:145-55. [PMID: 19327982 PMCID: PMC2673339 DOI: 10.1016/j.sbi.2009.02.005] [Citation(s) in RCA: 193] [Impact Index Per Article: 12.9] [Received: 11/14/2008] [Revised: 02/18/2009] [Accepted: 02/19/2009] [Indexed: 10/21/2022]
Abstract
Computationally predicted three-dimensional structures of protein molecules have demonstrated their usefulness in many areas of biomedicine, ranging from approximate family assignments to precise drug screening. For nearly 40 years, however, the accuracy of the predicted models has been dictated by the availability of close structural templates. Progress has recently been achieved in refining low-resolution models closer to the native ones; this has been made possible by combining knowledge-based information from multiple sources of structural templates as well as by improving the energy funnel of physics-based force fields. Unfortunately, there has been no essential progress in the development of techniques for detecting remotely homologous templates and for predicting novel protein structures.
Affiliation(s)
- Yang Zhang
- Center for Bioinformatics and Department of Molecular Biosciences, University of Kansas, 2030 Becker Drive, Lawrence, KS 66047, USA.
35
Skolnick J, Brylinski M. FINDSITE: a combined evolution/structure-based approach to protein function prediction. Brief Bioinform 2009; 10:378-91. [PMID: 19324930 DOI: 10.1093/bib/bbp017] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.8] [Indexed: 11/13/2022]
Abstract
A key challenge of the post-genomic era is the identification of the function(s) of all the molecules in a given organism. Here, we review the status of sequence and structure-based approaches to protein function inference and ligand screening that can provide functional insights for a significant fraction of the approximately 50% of ORFs of unassigned function in an average proteome. We then describe FINDSITE, a recently developed algorithm for ligand binding site prediction, ligand screening and molecular function prediction, which is based on binding site conservation across evolutionarily distant proteins identified by threading. Importantly, FINDSITE gives comparable results whether high-resolution experimental structures or predicted protein models are used.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA.