1
Adiyaman R, McGuffin LJ. Using Local Protein Model Quality Estimates to Guide a Molecular Dynamics-Based Refinement Strategy. Methods Mol Biol 2023; 2627:119-140. [PMID: 36959445 DOI: 10.1007/978-1-0716-2974-1_7]
Abstract
The refinement of predicted 3D models aims to bring them closer to the native structure by fixing errors including unusual bonds and torsion angles and irregular hydrogen bonding patterns. Refinement approaches based on molecular dynamics (MD) simulations using different types of restraints have performed well since CASP10. ReFOLD, developed by the McGuffin group, was one of the many MD-based refinement approaches tested in CASP12. When the performance of ReFOLD in CASP12 was evaluated, it was observed that the method lacked a reliable guidance mechanism for consistently improving the quality of predicted 3D models, particularly for template-based modelling (TBM) targets. Therefore, here we propose to utilize the local quality assessment score produced by ModFOLD6 to guide the MD-based refinement approach and further increase the accuracy of the predicted 3D models. The relative performance of the new local quality assessment guided MD-based refinement protocol and the original MD-based protocol ReFOLD is compared using many different official scoring methods. By using the per-residue accuracy (or local quality) score to guide the refinement process, we are able to prevent the refined models from undesired structural deviations, thereby leading to more consistent improvements. This chapter will include a detailed analysis of the performance of the local quality assessment guided MD-based protocol versus that deployed in the original ReFOLD method.
Affiliation(s)
- Recep Adiyaman
  - School of Biological Sciences, University of Reading, Reading, UK
- Liam J McGuffin
  - School of Biological Sciences, University of Reading, Reading, UK
2
Simpkin AJ, Rodríguez FS, Mesdaghi S, Kryshtafovych A, Rigden DJ. Evaluation of model refinement in CASP14. Proteins 2021; 89:1852-1869. [PMID: 34288138 PMCID: PMC8616799 DOI: 10.1002/prot.26185] [Received: 04/30/2021] [Revised: 06/19/2021] [Accepted: 07/11/2021]
Abstract
We report here an assessment of the model refinement category of the 14th round of Critical Assessment of Structure Prediction (CASP14). As before, predictors submitted up to five ranked refinements, along with associated residue-level error estimates, for targets that had a wide range of starting quality. The ability of groups to accurately rank their submissions and to predict coordinate error varied widely. Overall, only four groups out-performed a "naïve predictor" corresponding to the resubmission of the starting model. Among the top groups, there are interesting differences of approach and in the spread of improvements seen: some methods are more conservative, others more adventurous. Some targets were "double-barreled" for which predictors were offered a high-quality AlphaFold 2 (AF2)-derived prediction alongside another of lower quality. The AF2-derived models were largely unimprovable, many of their apparent errors being found to reside at domain and, especially, crystal lattice contacts. Refinement is shown to have a mixed impact overall on structure-based function annotation methods to predict nucleic acid binding, spot catalytic sites, and dock protein structures.
Affiliation(s)
- Adam J. Simpkin
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
- Filomeno Sánchez Rodríguez
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
  - Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, Oxfordshire OX11 0DE, England
- Shahram Mesdaghi
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
- Daniel J. Rigden
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
3
Shuvo MH, Gulfam M, Bhattacharya D. DeepRefiner: high-accuracy protein structure refinement by deep network calibration. Nucleic Acids Res 2021; 49:W147-W152. [PMID: 33999209 PMCID: PMC8262753 DOI: 10.1093/nar/gkab361] [Received: 03/08/2021] [Revised: 04/18/2021] [Accepted: 04/23/2021]
Abstract
The DeepRefiner webserver, freely available at http://watson.cse.eng.auburn.edu/DeepRefiner/, is an interactive and fully configurable online system for high-accuracy protein structure refinement. Fuelled by deep learning, DeepRefiner offers the ability to leverage cutting-edge deep neural network architectures which can be calibrated for on-demand selection of adventurous or conservative refinement modes targeted at degree or consistency of refinement. The method has been extensively tested in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments under the group name 'Bhattacharya-Server' and was officially ranked as the No. 2 refinement server in CASP13 (second only to 'Seok-server' and outperforming all other refinement servers) and No. 2 refinement server in CASP14 (second only to 'FEIG-S' and outperforming all other refinement servers including 'Seok-server'). The DeepRefiner web interface offers a number of convenient features, including (i) fully customizable refinement job submission and validation; (ii) automated job status update, tracking, and notifications; (iii) interactive and interpretable web-based results retrieval with quantitative and visual analysis; and (iv) extensive help information on job submission and results interpretation via web-based tutorial and help tooltips.
Affiliation(s)
- Md Hossain Shuvo
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
- Muhammad Gulfam
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
- Debswapna Bhattacharya
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
  - Department of Biological Sciences, Auburn University, Auburn, AL 36849, USA
4
Heo L, Park S, Seok C. GalaxyWater-wKGB: Prediction of Water Positions on Protein Structure Using wKGB Statistical Potential. J Chem Inf Model 2021; 61:2283-2293. [PMID: 33938216 DOI: 10.1021/acs.jcim.0c01434]
Abstract
Proteins fold and function in water, and protein-water interactions play important roles in protein structure and function. In computational studies on protein structure and interaction, the effect of water is considered either implicitly or explicitly. Implicit water models are frequently used in protein structure prediction and docking because they are computationally much more efficient than explicit water models, which are often employed in molecular dynamics (MD) simulations. However, implicit water models that treat water as a continuous solvent medium cannot account for specific atomistic protein-water interactions that are critical for structure formation and interactions with other molecules. Various methods for predicting water molecules that form specific atomistic interactions with proteins have been developed. Methods involving MD simulations or the integral equation theory tend to produce more accurate results at a higher computational cost than simple geometry- or energy-based methods. Here, we present a novel method for predicting water positions on a protein surface called GalaxyWater-wKGB, which is based on a statistical potential, a water knowledge-based potential based on the generalized Born model (wKGB). This method is accurate and rapid because it does not require conformational sampling or iterative computation owing to the effective statistical treatment employed to derive the potential. The statistical potential describes specific protein atom-water interactions more accurately than conventional potentials by considering the dependence on the degree of solvent accessibility of protein atoms as well as on protein atom-water distances and orientations. The introduction of solvent accessibility allows effective consideration of competing nonspecific protein-water and intraprotein interactions. 
When tested on high-resolution protein crystal structures, this method could recover similar or larger fractions of crystallographic water 180 times faster than the sophisticated integral equation theory, 3D-RISM. A web service of this water prediction method is freely available at http://galaxy.seoklab.org/wkgb.
Affiliation(s)
- Lim Heo
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Sangwoo Park
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
5
Kapla J, Rodríguez-Espigares I, Ballante F, Selent J, Carlsson J. Can molecular dynamics simulations improve the structural accuracy and virtual screening performance of GPCR models? PLoS Comput Biol 2021; 17:e1008936. [PMID: 33983933 PMCID: PMC8186765 DOI: 10.1371/journal.pcbi.1008936] [Received: 11/03/2020] [Revised: 06/08/2021] [Accepted: 04/02/2021]
Abstract
The determination of G protein-coupled receptor (GPCR) structures at atomic resolution has improved understanding of cellular signaling and will accelerate the development of new drug candidates. However, experimental structures still remain unavailable for a majority of the GPCR family. GPCR structures and their interactions with ligands can also be modelled computationally, but such predictions have limited accuracy. In this work, we explored if molecular dynamics (MD) simulations could be used to refine the accuracy of in silico models of receptor-ligand complexes that were submitted to a community-wide assessment of GPCR structure prediction (GPCR Dock). Two simulation protocols were used to refine 30 models of the D3 dopamine receptor (D3R) in complex with an antagonist. Close to 60 μs of simulation time was generated and the resulting MD refined models were compared to a D3R crystal structure. In the MD simulations, the receptor models generally drifted further away from the crystal structure conformation. However, MD refinement was able to improve the accuracy of the ligand binding mode. The best refinement protocol improved agreement with the experimentally observed ligand binding mode for a majority of the models. Receptor structures with improved virtual screening performance, which was assessed by molecular docking of ligands and decoys, could also be identified among the MD refined models. Application of weak restraints to the transmembrane helixes in the MD simulations further improved predictions of the ligand binding mode and second extracellular loop. These results provide guidelines for application of MD refinement in prediction of GPCR-ligand complexes and directions for further method development.
Affiliation(s)
- Jon Kapla
  - Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Ismael Rodríguez-Espigares
  - Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences of Pompeu Fabra University (UPF), Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain
- Flavio Ballante
  - Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Jana Selent
  - Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences of Pompeu Fabra University (UPF), Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain
- Jens Carlsson
  - Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
6
de Araújo RSA, Mendonça FJ, Scotti MT, Scotti L. Protein modeling. Physical Sciences Reviews 2021. [DOI: 10.1515/psr-2018-0161]
Abstract
Proteins are essential and versatile polymers consisting of sequenced amino acids that often possess an organized three-dimensional arrangement (a result of their monomeric composition), which determines their biological role in cellular function. Proteins are involved in enzymatic catalysis; they participate in genetic information decoding and transmission processes, in cell recognition, in signaling and transport of substances, in regulation of intra- and extracellular conditions, and in other functions.
Affiliation(s)
- Rodrigo S. A. de Araújo
  - Biological Science Department, Laboratory of Synthesis and Drug Delivery, State University of Paraiba, 58070-450, João Pessoa, PB, Brazil
- Francisco J. B. Mendonça
  - Biological Science Department, Laboratory of Synthesis and Drug Delivery, State University of Paraiba, 58070-450, João Pessoa, PB, Brazil
- Marcus T. Scotti
  - Health Center, Federal University of Paraíba, 50670-910, João Pessoa, PB, Brazil
- Luciana Scotti
  - Health Center, Federal University of Paraíba, 50670-910, João Pessoa, PB, Brazil
7
Heo L, Arbour CF, Janson G, Feig M. Improved Sampling Strategies for Protein Model Refinement Based on Molecular Dynamics Simulation. J Chem Theory Comput 2021; 17:1931-1943. [PMID: 33562962 DOI: 10.1021/acs.jctc.0c01238]
Abstract
Protein structures provide valuable information for understanding biological processes. Protein structures can be determined by experimental methods such as X-ray crystallography, nuclear magnetic resonance spectroscopy, or cryogenic electron microscopy. As an alternative, in silico methods can be used to predict protein structures. These methods utilize protein structure databases for structure prediction via template-based modeling or for training machine-learning models to generate predictions. Structure prediction for proteins distant from proteins with known structures often results in lower accuracy with respect to the true physiological structures. Physics-based protein model refinement methods can be applied to improve model accuracy in the predicted models. Refinement methods rely on conformational sampling around the predicted structures, and if structures closer to the native states are sampled, improvements in the model quality become possible. Molecular dynamics simulations have been especially successful for improving model qualities but although consistent refinement can be achieved, the improvements in model qualities are still moderate. To extend the refinement performance of a simulation-based protocol, we explored new schemes that focus on optimized use of biasing functions and the application of increased simulation temperatures. In addition, we tested the use of alternative initial models so that the simulations can explore the conformational space more broadly. Based on the insights of this analysis, we are proposing a new refinement protocol that significantly outperformed previous state-of-the-art molecular dynamics simulation-based protocols in the benchmark tests described here.
Affiliation(s)
- Lim Heo
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Collin F Arbour
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Giacomo Janson
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Michael Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
8
Borbulevych OY, Martin RI, Westerhoff LM. The critical role of QM/MM X-ray refinement and accurate tautomer/protomer determination in structure-based drug design. J Comput Aided Mol Des 2020; 35:433-451. [PMID: 33108589 PMCID: PMC8018927 DOI: 10.1007/s10822-020-00354-6] [Received: 06/17/2020] [Accepted: 10/12/2020]
Abstract
Conventional protein:ligand crystallographic refinement uses stereochemistry restraints coupled with a rudimentary energy functional to ensure the correct geometry of the model of the macromolecule—along with any bound ligand(s)—within the context of the experimental, X-ray density. These methods generally lack explicit terms for electrostatics, polarization, dispersion, hydrogen bonds, and other key interactions, and instead they use pre-determined parameters (e.g. bond lengths, angles, and torsions) to drive structural refinement. In order to address this deficiency and obtain a more complete and ultimately more accurate structure, we have developed an automated approach for macromolecular refinement based on a two layer, QM/MM (ONIOM) scheme as implemented within our DivCon Discovery Suite and "plugged in" to two mainstream crystallographic packages: PHENIX and BUSTER. This implementation is able to use one or more region layer(s), which is(are) characterized using linear-scaling, semi-empirical quantum mechanics, followed by a system layer which includes the balance of the model and which is described using a molecular mechanics functional. In this work, we applied our Phenix/DivCon refinement method—coupled with our XModeScore method for experimental tautomer/protomer state determination—to the characterization of structure sets relevant to structure-based drug design (SBDD). We then use these newly refined structures to show the impact of QM/MM X-ray refined structure on our understanding of function by exploring the influence of these improved structures on protein:ligand binding affinity prediction (and we likewise show how we use post-refinement scoring outliers to inform subsequent X-ray crystallographic efforts). Through this endeavor, we demonstrate a computational chemistry ↔ structural biology (X-ray crystallography) "feedback loop" which has utility in industrial and academic pharmaceutical research as well as other allied fields.
Affiliation(s)
- Oleg Y Borbulevych
  - QuantumBio Inc, 2790 West College Ave, Suite 900, State College, PA, 16801, USA
- Roger I Martin
  - QuantumBio Inc, 2790 West College Ave, Suite 900, State College, PA, 16801, USA
- Lance M Westerhoff
  - QuantumBio Inc, 2790 West College Ave, Suite 900, State College, PA, 16801, USA
9
Ghafouri F, Cohan RA, Noorbakhsh F, Samimi H, Haghpanah V. An in-silico approach to develop of a multi-epitope vaccine candidate against SARS-CoV-2 envelope (E) protein. Research Square 2020. [PMID: 32702713 PMCID: PMC7336711 DOI: 10.21203/rs.3.rs-30374/v1]
Abstract
Since the first appearance of the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) in China in December 2019, the world has witnessed the emergence of the SARS-CoV-2 outbreak. Due to the high transmissibility rate of the virus, there is an urgent need to design and develop vaccines against SARS-CoV-2 to prevent further cases. In this study, a computational approach is proposed for vaccine design against the envelope (E) protein of SARS-CoV-2, which contains a conserved sequence feature. First, we sought to identify potential B-cell and T-cell epitopes for vaccine design against SARS-CoV-2. Second, we attempted to develop a multi-epitope vaccine. Immune targeting of such epitopes could theoretically provide defense against SARS-CoV-2. Finally, we evaluated the affinity of the vaccine to major histocompatibility complex (MHC) molecules to stimulate the immune system response to this vaccine. We also identified a collection of B-cell and T-cell epitopes derived from E proteins that correspond identically to SARS-CoV-2 E proteins. The in-silico design of our potential vaccine against the E protein of SARS-CoV-2 demonstrated a high affinity to MHC molecules, and it can be a candidate for providing protection against this pandemic.
10
Lee GR, Won J, Heo L, Seok C. GalaxyRefine2: simultaneous refinement of inaccurate local regions and overall protein structure. Nucleic Acids Res 2020; 47:W451-W455. [PMID: 31001635 PMCID: PMC6602442 DOI: 10.1093/nar/gkz288] [Received: 01/31/2019] [Revised: 04/01/2019] [Accepted: 04/11/2019]
Abstract
The 3D structure of a protein can be predicted from its amino acid sequence with high accuracy for a large fraction of cases because of the availability of large quantities of experimental data and the advance of computational algorithms. Recently, deep learning methods exploiting the coevolution information obtained by comparing related protein sequences have been successfully used to generate highly accurate model structures even in the absence of template structure information. However, structures predicted based on either template structures or related sequences require further improvement in regions for which information is missing. Refining a predicted protein structure with insufficient information on certain regions is critical because these regions may be connected to functional specificity that is not conserved among related proteins. The GalaxyRefine2 web server, freely available via http://galaxy.seoklab.org/refine2, is an upgraded version of the GalaxyRefine protein structure refinement server and reflects recent developments successfully tested through CASP blind prediction experiments. This method adopts an iterative optimization approach involving various structure move sets to refine both local and global structures. The estimation of local error and hybridization of available homolog structures are also employed for effective conformation search.
Affiliation(s)
- Gyu Rie Lee
  - Department of Chemistry, Seoul National University, Seoul 08826, Korea
- Jonghun Won
  - Department of Chemistry, Seoul National University, Seoul 08826, Korea
- Lim Heo
  - Department of Chemistry, Seoul National University, Seoul 08826, Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul 08826, Korea
11
Alapati R, Shuvo MH, Bhattacharya D. SPECS: Integration of side-chain orientation and global distance-based measures for improved evaluation of protein structural models. PLoS One 2020; 15:e0228245. [PMID: 32053611 PMCID: PMC7018003 DOI: 10.1371/journal.pone.0228245] [Received: 07/22/2019] [Accepted: 01/11/2020]
Abstract
Significant advancements in the field of protein structure prediction have created a need for objective and robust evaluation of protein structural models by comparing predicted models against the experimentally determined native structures to quantitate their structural similarities. Existing protein model versus native similarity metrics consider the distances between either alpha carbon (Cα) or side-chain atoms for computing the similarity. However, side-chain orientation of a protein plays a critical role in defining its conformation at the atomic level. Despite its importance, inclusion of side-chain orientation in structural similarity evaluation has not yet been addressed. Here, we present SPECS, a side-chain-orientation-included protein model-native similarity metric for improved evaluation of protein structural models. SPECS combines side-chain orientation and global distance-based measures in an integrated framework using the united-residue model of polypeptide conformation for computing model-native similarity. Experimental results demonstrate that SPECS is a reliable measure for evaluating structural similarity at the global level including and beyond the accuracy of Cα positioning. Moreover, SPECS delivers superior performance in capturing local quality aspects compared to popular global Cα positioning-based metrics, ranging from models at near-experimental accuracies to models with correct overall folds, making it a robust measure suitable for both high- and moderate-resolution models. Finally, SPECS is sensitive to minute variations in side-chain χ angles even for models with a perfect Cα trace, revealing the power of including side-chain orientation. Collectively, SPECS is a versatile evaluation metric that covers a wide spectrum of protein modeling scenarios and simultaneously captures complementary aspects of structural similarities at multiple levels of granularity. SPECS is freely available at http://watson.cse.eng.auburn.edu/SPECS/.
Affiliation(s)
- Rahul Alapati
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
- Md. Hossain Shuvo
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
- Debswapna Bhattacharya
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
  - Department of Biological Sciences, Auburn University, Auburn, Alabama, United States of America
12
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XIII. Proteins 2019; 87:1011-1020. [PMID: 31589781 DOI: 10.1002/prot.25823] [Received: 08/23/2019] [Revised: 09/25/2019] [Accepted: 09/27/2019]
Abstract
CASP (critical assessment of structure prediction) assesses the state of the art in modeling protein structure from amino acid sequence. The most recent experiment (CASP13 held in 2018) saw dramatic progress in structure modeling without use of structural templates (historically "ab initio" modeling). Progress was driven by the successful application of deep learning techniques to predict inter-residue distances. In turn, these results drove dramatic improvements in three-dimensional structure accuracy: with the proviso that there are an adequate number of sequences known for the protein family, the new methods essentially solve the long-standing problem of predicting the fold topology of monomeric proteins. Further, the number of sequences required in the alignment has fallen substantially. There is also substantial improvement in the accuracy of template-based models. Other areas (model refinement, accuracy estimation, and the structure of protein assemblies) have again yielded interesting results. CASP13 placed increased emphasis on the use of sparse data together with modeling, and chemical crosslinking, SAXS, and NMR all yielded more mature results. This paper summarizes the key outcomes of CASP13. The special issue of PROTEINS contains papers describing the CASP13 assessments in each modeling category and contributions from the participants.
Affiliation(s)
- Torsten Schwede
  - Biozentrum & SIB Swiss Institute of Bioinformatics, University of Basel, Basel, Switzerland
- Maya Topf
  - Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, UK
- John Moult
  - Institute for Bioscience and Biotechnology Research, Rockville, Maryland
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
13
Read RJ, Sammito MD, Kryshtafovych A, Croll TI. Evaluation of model refinement in CASP13. Proteins 2019; 87:1249-1262. [PMID: 31365160 PMCID: PMC6851427 DOI: 10.1002/prot.25794] [Received: 05/09/2019] [Revised: 07/03/2019] [Accepted: 07/27/2019]
Abstract
Performance in the model refinement category of the 13th round of Critical Assessment of Structure Prediction (CASP13) is assessed, showing that some groups consistently improve most starting models whereas the majority of participants continue to degrade the starting model on average. Using the ranking formula developed for CASP12, it is shown that only 7 of 32 groups perform better than a “naïve predictor” who just submits the starting model. Common features in their approaches include a dependence on physics‐based force fields to judge alternative conformations and the use of molecular dynamics to relax models to local minima, usually with some restraints to prevent excessively large movements. In addition to the traditional CASP metrics that focus largely on the quality of the overall fold, alternative metrics are evaluated, including comparisons of the main‐chain and side‐chain torsion angles, and the utility of the models for solving crystal structures by the molecular replacement method. It is proposed that the introduction of these metrics, as well as consideration of the accuracy of coordinate error estimates, would improve the discrimination between good and very good models.
Affiliation(s)
- Randy J Read
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Massimo D Sammito
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Tristan I Croll
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
14
Methods for the Refinement of Protein Structure 3D Models. Int J Mol Sci 2019; 20:ijms20092301. [PMID: 31075942 PMCID: PMC6539982 DOI: 10.3390/ijms20092301] [Received: 04/02/2019] [Revised: 04/24/2019] [Accepted: 05/07/2019]
Abstract
The refinement of predicted 3D protein models is crucial in bringing them closer towards experimental accuracy for further computational studies. Refinement approaches can be divided into two main stages: the sampling and scoring stages. Sampling strategies, such as the popular Molecular Dynamics (MD)-based protocols, aim to generate improved 3D models. However, generating 3D models that are closer to the native structure than the initial model remains challenging, as structural deviations from the native basin can be encountered due to force-field inaccuracies. Therefore, different restraint strategies have been applied in order to avoid deviations away from the native structure. For example, the accurate prediction of local errors and/or contacts in the initial models can be used to guide restraints. MD-based protocols, using physics-based force fields and smart restraints, have made significant progress towards a more consistent refinement of 3D models. The scoring stage, which includes energy functions and Model Quality Assessment Programs (MQAPs), is used to discriminate near-native conformations from non-native conformations. Nevertheless, there are often very small differences among generated 3D models in refinement pipelines, which makes model discrimination and selection problematic. For this reason, the identification of the most native-like conformations remains a major challenge.
|
15
|
Experimental accuracy in protein structure refinement via molecular dynamics simulations. Proc Natl Acad Sci U S A 2018; 115:13276-13281. [PMID: 30530696 DOI: 10.1073/pnas.1811364115] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Indexed: 12/22/2022]
Abstract
Refinement is the last step in protein structure prediction pipelines to convert approximate homology models to experimental accuracy. Protocols based on molecular dynamics (MD) simulations have shown promise, but current methods are limited to moderate levels of consistent refinement. To explore the energy landscape between homology models and native structures and analyze the challenges of MD-based refinement, eight test cases were studied via extensive simulations followed by Markov state modeling. In all cases, native states were found very close to the experimental structures and at the lowest free energies, but refinement was hindered by a rough energy landscape. Transitions from the homology model to the native states require the crossing of significant kinetic barriers on at least microsecond time scales. A significant energetic driving force toward the native state was lacking until its immediate vicinity, and there was significant sampling of off-pathway states competing for productive refinement. The role of recent force field improvements is discussed and transition paths are analyzed in detail to inform which key transitions have to be overcome to achieve successful refinement.
|
16
|
Borbulevych O, Martin RI, Westerhoff LM. High-throughput quantum-mechanics/molecular-mechanics (ONIOM) macromolecular crystallographic refinement with PHENIX/DivCon: the impact of mixed Hamiltonian methods on ligand and protein structure. Acta Crystallogr D Struct Biol 2018; 74:1063-1077. [PMID: 30387765 PMCID: PMC6213575 DOI: 10.1107/s2059798318012913] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/19/2018] [Accepted: 09/12/2018] [Indexed: 12/28/2022]
Abstract
Conventional macromolecular crystallographic refinement relies on often dubious stereochemical restraints, the preparation of which often requires human validation for unusual species, and on rudimentary energy functionals that are devoid of nonbonding effects owing to electrostatics, polarization, charge transfer or even hydrogen bonding. While this approach has served the crystallographic community for decades, as structure-based drug design/discovery (SBDD) has grown in prominence it has become clear that these conventional methods are less rigorous than they need to be in order to produce properly predictive protein-ligand models, and that the human intervention that is required to successfully treat ligands and other unusual chemistries found in SBDD often precludes high-throughput, automated refinement. Recently, plugins to the Python-based Hierarchical ENvironment for Integrated Xtallography (PHENIX) crystallographic platform have been developed to augment conventional methods with the in situ use of quantum mechanics (QM) applied to ligand(s) along with the surrounding active site(s) at each step of refinement [Borbulevych et al. (2014), Acta Cryst D70, 1233-1247]. This method (Region-QM) significantly increases the accuracy of the X-ray refinement process, and this approach is now used, coupled with experimental density, to accurately determine protonation states, binding modes, ring-flip states, water positions and so on. In the present work, this approach is expanded to include a more rigorous treatment of the entire structure, including the ligand(s), the associated active site(s) and the entire protein, using a fully automated, mixed quantum-mechanics/molecular-mechanics (QM/MM) Hamiltonian recently implemented in the DivCon package. This approach was validated through the automatic treatment of a population of 80 protein-ligand structures chosen from the Astex Diverse Set. 
Across the entire population, this method results in an average 3.5-fold reduction in ligand strain and a 4.5-fold improvement in MolProbity clashscore, as well as improvements in Ramachandran and rotamer outlier analyses. Overall, these results demonstrate that the use of a structure-wide QM/MM Hamiltonian exhibits improvements in the local structural chemistry of the ligand similar to Region-QM refinement but with significant improvements in the overall structure beyond the active site.
Affiliation(s)
- Oleg Borbulevych
- QuantumBio Inc., 2790 West College Avenue, State College, PA 16801, USA
- Roger I. Martin
- QuantumBio Inc., 2790 West College Avenue, State College, PA 16801, USA
|
17
|
Moult J, Fidelis K, Kryshtafovych A, Schwede T, Tramontano A. Critical assessment of methods of protein structure prediction (CASP)-Round XII. Proteins 2018; 86 Suppl 1:7-15. [PMID: 29082672 PMCID: PMC5897042 DOI: 10.1002/prot.25415] [Citation(s) in RCA: 245] [Impact Index Per Article: 40.8] [Received: 08/21/2017] [Revised: 10/25/2017] [Accepted: 10/27/2017] [Indexed: 12/24/2022]
Abstract
This article reports the outcome of the 12th round of Critical Assessment of Structure Prediction (CASP12), held in 2016. CASP is a community experiment to determine the state of the art in modeling protein structure from amino acid sequence. Participants are provided sequence information and in turn provide protein structure models and related information. Analysis of the submitted structures by independent assessors provides a comprehensive picture of the capabilities of current methods, and allows progress to be identified. This was again an exciting round of CASP, with significant advances in 4 areas: (i) The use of new methods for predicting three-dimensional contacts led to a two-fold improvement in contact accuracy. (ii) As a consequence, model accuracy for proteins where no template was available improved dramatically. (iii) Models based on a structural template showed overall improvement in accuracy. (iv) Methods for estimating the accuracy of a model continued to improve. CASP continued to develop new areas: (i) Assessing methods for building quaternary structure models, including an expansion of the collaboration between CASP and CAPRI. (ii) Modeling with the aid of experimental data was extended to include SAXS data, as well as again using chemical cross-linking information. (iii) A team of assessors evaluated the suitability of models for a range of applications, including mutation interpretation, analysis of ligand binding properties, and identification of interfaces. This article describes the experiment and summarizes the results. The rest of this special issue of PROTEINS contains papers describing CASP12 results and assessments in more detail.
Affiliation(s)
- John Moult
- Institute for Bioscience and Biotechnology Research and Department of Cell Biology and Molecular Genetics, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
- Krzysztof Fidelis
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Anna Tramontano
- Department of Physics and Istituto Pasteur - Fondazione Cenci Bolognetti, Sapienza University of Rome, P.le Aldo Moro, 5, 00185 Rome, Italy
|
18
|
Hovan L, Oleinikovas V, Yalinca H, Kryshtafovych A, Saladino G, Gervasio FL. Assessment of the model refinement category in CASP12. Proteins 2017; 86 Suppl 1:152-167. [PMID: 29071750 DOI: 10.1002/prot.25409] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 06/19/2017] [Revised: 10/03/2017] [Accepted: 10/24/2017] [Indexed: 01/07/2023]
Abstract
We here report on the assessment of the model refinement predictions submitted to the 12th Experiment on the Critical Assessment of Protein Structure Prediction (CASP12). This is the fifth refinement experiment since CASP8 (2008) and, as with the previous experiments, the predictors were invited to refine selected server models received in the regular (nonrefinement) stage of the CASP experiment. We assessed the submitted models using a combination of standard CASP measures. The coefficients for the linear combination of Z-scores (the CASP12 score) have been obtained by a machine learning algorithm trained on the results of visual inspection. We identified eight groups that improve both the backbone conformation and the side chain positioning for the majority of targets. Albeit the top methods adopted distinctively different approaches, their overall performance was almost indistinguishable, with each of them excelling in different scores or target subsets. What is more, there were a few novel approaches that, while doing worse than average in most cases, provided the best refinements for a few targets, showing significant latitude for further innovation in the field.
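The "linear combination of Z-scores" described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the metric names (`GDT_HA`, `lDDT`), the raw values, and the equal weights are placeholders, not the coefficients obtained by the assessors' machine learning procedure. The sketch treats one target at a time and assumes that higher raw values are better for every metric.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Convert raw scores for one metric into Z-scores across models."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All models scored identically, so no model stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def combined_score(metrics, weights):
    """Weighted sum of per-metric Z-scores, one total per model.

    metrics: dict mapping metric name -> list of raw scores (one per model)
    weights: dict mapping metric name -> coefficient (hypothetical values)
    """
    n_models = len(next(iter(metrics.values())))
    totals = [0.0] * n_models
    for name, values in metrics.items():
        for i, z in enumerate(z_scores(values)):
            totals[i] += weights[name] * z
    return totals


# Hypothetical raw scores for three refined models of a single target.
example_metrics = {
    "GDT_HA": [50.0, 60.0, 70.0],
    "lDDT": [0.5, 0.6, 0.7],
}
# Placeholder coefficients; the published ones are not given in the abstract.
example_weights = {"GDT_HA": 0.5, "lDDT": 0.5}

example_scores = combined_score(example_metrics, example_weights)
for model_index, score in enumerate(example_scores):
    print(f"model {model_index}: combined Z-score = {score:+.3f}")
```

Working in Z-scores puts metrics with different units on a common scale before they are weighted, which is why a combination of this kind can mix measures such as backbone and side-chain scores.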
Affiliation(s)
- Ladislav Hovan
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
- Havva Yalinca
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
- Giorgio Saladino
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
- Francesco Luigi Gervasio
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom; Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, United Kingdom
|
19
|
Cheng Q, Joung I, Lee J. A Simple and Efficient Protein Structure Refinement Method. J Chem Theory Comput 2017; 13:5146-5162. [PMID: 28800396 DOI: 10.1021/acs.jctc.7b00470] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Abstract
Improving the quality of a given protein structure can serve as the ultimate solution for accurate protein structure prediction, and seeking such a method is currently a challenge in computational structural biology. To promote and encourage such much-needed efforts, CASP (Critical Assessment of Structure Prediction) has provided an ideal computational experimental platform, where it was reported only recently (since CASP10) that systematic protein structure refinement is possible by carrying out extensive (approximately millisecond) MD simulations with proper restraints generated from the given structure. Using an explicit solvent model and positional and distance restraints much reduced relative to those previously used, we propose a refinement protocol that combines a series of short (5 ns) MD simulations with energy minimization procedures. Testing and benchmarking on 54 CASP8-10 refinement targets and 34 CASP11 refinement targets shows quite promising results. Using only a small fraction of MD simulation steps (nanosecond versus millisecond), systematic protein structure refinement was demonstrated in this work, indicating that refinement of a given model can be achieved using a few hours of desktop computing.
Affiliation(s)
- Qianyi Cheng
- Center for In Silico Protein Science and School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Korea
- InSuk Joung
- Center for In Silico Protein Science and School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Korea
- Jooyoung Lee
- Center for In Silico Protein Science and School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Korea
|
20
|
Feig M. Computational protein structure refinement: Almost there, yet still so far to go. Wiley Interdiscip Rev Comput Mol Sci 2017; 7:e1307. [PMID: 30613211 PMCID: PMC6319934 DOI: 10.1002/wcms.1307] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Indexed: 12/21/2022]
Abstract
Protein structures are essential in modern biology yet experimental methods are far from being able to catch up with the rapid increase in available genomic data. Computational protein structure prediction methods aim to fill the gap while the role of protein structure refinement is to take approximate initial template-based models and bring them closer to the true native structure. Current methods for computational structure refinement rely on molecular dynamics simulations, related sampling methods, or iterative structure optimization protocols. The best methods are able to achieve moderate degrees of refinement but consistent refinement that can reach near-experimental accuracy remains elusive. Key issues revolve around the accuracy of the energy function, the inability to reliably rank multiple models, and the use of restraints that keep sampling close to the native state but also limit the degree of possible refinement. A different aspect is the question of what exactly the target of high-resolution refinement should be as experimental structures are affected by experimental conditions and different biological questions require varying levels of accuracy. While improvement of the global protein structure is a difficult problem, high-resolution refinement methods that improve local structural quality such as favorable stereochemistry and the avoidance of atomic clashes are much more successful.
Affiliation(s)
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, 603 Wilson Rd., Room 218 BCH, East Lansing, MI, USA; 517-432-7439
|
21
|
Khoury GA, Smadbeck J, Kieslich CA, Koskosidis AJ, Guzman YA, Tamamis P, Floudas CA. Princeton_TIGRESS 2.0: High refinement consistency and net gains through support vector machines and molecular dynamics in double-blind predictions during the CASP11 experiment. Proteins 2017; 85:1078-1098. [PMID: 28241391 DOI: 10.1002/prot.25274] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/27/2016] [Revised: 02/01/2017] [Accepted: 02/14/2017] [Indexed: 12/28/2022]
Abstract
Protein structure refinement is the challenging problem of operating on any protein structure prediction to improve its accuracy with respect to the native structure in a blind fashion. Although many approaches have been developed and tested during the last four CASP experiments, a majority of the methods continue to degrade models rather than improve them. Princeton_TIGRESS (Khoury et al., Proteins 2014;82:794-814) was developed previously and utilizes separate sampling and selection stages involving Monte Carlo and molecular dynamics simulations and classification using an SVM predictor. The initial implementation was shown to consistently refine protein structures 76% of the time in our own internal benchmarking on CASP 7-10 targets. In this work, we improved the sampling and selection stages and tested the method in blind predictions during CASP11. We added a decomposition of physics-based and hybrid energy functions, as well as a coordinate-free representation of the protein structure through distance-binning Cα-Cα distances to capture fine-grained movements. We performed parameter estimation to optimize the adjustable SVM parameters to maximize precision while balancing sensitivity and specificity across all cross-validated data sets, finding enrichment in our ability to select models from the populations of similar decoys generated for targets in CASPs 7-10. The MD stage was enhanced such that larger structures could be further refined. Among refinement methods that are currently implemented as web-servers, Princeton_TIGRESS 2.0 demonstrated the most consistent and most substantial net refinement in blind predictions during CASP11. The enhanced refinement protocol Princeton_TIGRESS 2.0 is freely available as a web server at http://atlas.engr.tamu.edu/refinement/. Proteins 2017; 85:1078-1098. © 2017 Wiley Periodicals, Inc.
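The "coordinate-free representation ... through distance-binning Cα-Cα distances" mentioned in this abstract can be sketched as follows. This is an illustrative assumption, not the Princeton_TIGRESS implementation: the function name, the 2 Å bin width, the 20 Å cutoff, and the toy coordinates are all hypothetical choices.

```python
import math


def ca_distance_bins(ca_coords, bin_width=2.0, max_dist=20.0):
    """Map every pair of C-alpha atoms to a distance-bin index.

    ca_coords: list of (x, y, z) tuples in angstroms, one per residue
    bin_width, max_dist: hypothetical binning parameters; distances at or
    beyond max_dist fall into a single overflow bin.
    """
    overflow_bin = int(max_dist / bin_width)
    bins = {}
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            d = math.dist(ca_coords[i], ca_coords[j])
            bins[(i, j)] = min(int(d // bin_width), overflow_bin)
    return bins


# Toy example: three C-alpha atoms spaced 3.8 angstroms apart on a line.
toy_model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_bins = ca_distance_bins(toy_model)

# The same atoms after a rigid translation give identical bins, which is
# what makes the representation independent of the coordinate frame.
shifted_model = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in toy_model]
shifted_bins = ca_distance_bins(shifted_model)

print(toy_bins)
print(toy_bins == shifted_bins)
```

Because only interatomic distances are used, two models can be compared feature by feature without first superimposing them, and a narrow bin width lets small movements change the feature vector.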
Affiliation(s)
- George A Khoury
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey
- Chris A Kieslich
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas; Texas A&M Energy Institute, Texas A&M University, College Station, Texas
- Alexandra J Koskosidis
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas; Texas A&M Energy Institute, Texas A&M University, College Station, Texas
- Yannis A Guzman
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey; Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas; Texas A&M Energy Institute, Texas A&M University, College Station, Texas
- Phanourios Tamamis
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas; Texas A&M Energy Institute, Texas A&M University, College Station, Texas
- Christodoulos A Floudas
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas; Texas A&M Energy Institute, Texas A&M University, College Station, Texas
|
22
|
Modi V, Dunbrack RL. Assessment of refinement of template-based models in CASP11. Proteins 2016; 84 Suppl 1:260-81. [PMID: 27081793 DOI: 10.1002/prot.25048] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 12/24/2015] [Revised: 03/13/2016] [Accepted: 04/11/2016] [Indexed: 12/26/2022]
Abstract
CASP11 (the 11th Meeting on the Critical Assessment of Protein Structure Prediction) ran a blind experiment in the refinement of protein structure predictions, the fourth such experiment since CASP8. As with the previous experiments, the predictors were provided with one starting structure from the server models of each of a selected set of template-based modeling targets and asked to refine the coordinates of the starting structure toward native. We assessed the refined structures with the Z-scores of the standard CASP measures, which compare the model-target similarities of the models from all the predictors. Furthermore, we assessed the refined structures with "relative measures," which compare the improvement in accuracy of each model with respect to the starting structure. The latter provides an assessment of the extent to which each predictor group is able to improve the starting structures toward native. We utilized heat maps to display improvements in the Calpha-Calpha distance matrix for each model. The heat maps labeled with each element of secondary structure helped us to identify regions of refinement toward native in each model. Most positively scoring models show modest improvements in multiple regions of the structure, while in some models we were able to identify significant repositioning of N/C-terminal segments and internal elements of secondary structure. The best groups were able to improve more than 70% of the targets from the starting models, and by an average of 3-5% in the standard CASP measures. Proteins 2016; 84(Suppl 1):260-281. © 2016 Wiley Periodicals, Inc.
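The idea of "relative measures", which compare each refined model with its starting structure, can be sketched in a few lines. The definition used here (percentage change of a higher-is-better score relative to the starting model) and all numbers are assumptions for illustration; the abstract does not specify the assessors' exact formula.

```python
def relative_improvements(start_scores, refined_scores):
    """Percentage change of each refined model relative to its starting model.

    Both lists hold one higher-is-better score per target (for example a
    GDT-style score); the percentage definition is an assumption.
    """
    return [
        100.0 * (refined - start) / start
        for start, refined in zip(start_scores, refined_scores)
    ]


def summarize(start_scores, refined_scores):
    """Return (fraction of targets improved, mean percentage change)."""
    deltas = relative_improvements(start_scores, refined_scores)
    improved = sum(1 for d in deltas if d > 0)
    return improved / len(deltas), sum(deltas) / len(deltas)


# Hypothetical scores for four targets from one predictor group.
start = [50.0, 60.0, 80.0, 40.0]
refined = [52.0, 63.0, 78.0, 42.0]

fraction_improved, mean_change = summarize(start, refined)
print(f"targets improved: {fraction_improved:.0%}")
print(f"mean relative change: {mean_change:+.3f}%")
```

Unlike Z-scores, which rank a model against other groups' submissions, a measure of this kind answers whether a group moved the starting structure toward or away from native.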
Affiliation(s)
- Vivek Modi
- Fox Chase Cancer Center, Philadelphia, Pennsylvania, 19111
|
23
|
Moult J, Fidelis K, Kryshtafovych A, Schwede T, Tramontano A. Critical assessment of methods of protein structure prediction: Progress and new directions in round XI. Proteins 2016; 84 Suppl 1:4-14. [PMID: 27171127 DOI: 10.1002/prot.25064] [Citation(s) in RCA: 148] [Impact Index Per Article: 18.5] [Received: 02/29/2016] [Revised: 04/29/2016] [Accepted: 05/08/2016] [Indexed: 12/15/2022]
Abstract
Modeling of protein structure from amino acid sequence now plays a major role in structural biology. Here we report new developments and progress from the CASP11 community experiment, assessing the state of the art in structure modeling. Notable points include the following: (1) New methods for predicting three dimensional contacts resulted in a few spectacular template free models in this CASP, whereas models based on sequence homology to proteins with experimental structure continue to be the most accurate. (2) Refinement of initial protein models, primarily using molecular dynamics related approaches, has now advanced to the point where the best methods can consistently (though slightly) improve nearly all models. (3) The use of relatively sparse NMR constraints dramatically improves the accuracy of models, and another type of sparse data, chemical crosslinking, introduced in this CASP, also shows promise for producing better models. (4) A new emphasis on modeling protein complexes, in collaboration with CAPRI, has produced interesting results, but also shows the need for more focus on this area. (5) Methods for estimating the accuracy of models have advanced to the point where they are of considerable practical use. (6) A first assessment demonstrates that models can sometimes successfully address biological questions that motivate experimental structure determination. (7) There is continuing progress in accuracy of modeling regions of structure not directly available by comparative modeling, while there is marginal or no progress in some other areas. Proteins 2016; 84(Suppl 1):4-14. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- John Moult
- Institute for Bioscience and Biotechnology Research and Department of Cell Biology and Molecular Genetics, University of Maryland, Rockville, Maryland, 20850
- Krzysztof Fidelis
- Genome Center, University of California, Davis, Davis, California, 95616
- Torsten Schwede
- Biozentrum & SIB Swiss Institute of Bioinformatics, University of Basel, Basel, Switzerland
- Anna Tramontano
- Department of Physics and Istituto Pasteur - Fondazione Cenci Bolognetti, Sapienza University of Rome, Rome, Italy
|
24
|
Bhattacharya D, Nowotny J, Cao R, Cheng J. 3Drefine: an interactive web server for efficient protein structure refinement. Nucleic Acids Res 2016; 44:W406-9. [PMID: 27131371 PMCID: PMC4987902 DOI: 10.1093/nar/gkw336] [Citation(s) in RCA: 281] [Impact Index Per Article: 35.1] [Received: 01/30/2016] [Accepted: 04/15/2016] [Indexed: 11/14/2022]
Abstract
3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen bonding network combined with atomic-level energy minimization on the optimized model using a composite physics- and knowledge-based force field for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through text or file input submission, email notification, and a provided example submission, and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet that is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/.
Affiliation(s)
- Jackson Nowotny
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
- Renzhi Cao
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
- Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA; Informatics Institute, University of Missouri, Columbia, MO 65211, USA; C. Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA
|
25
|
Kumar A, Campitelli P, Thorpe MF, Ozkan SB. Partial unfolding and refolding for structure refinement: A unified approach of geometric simulations and molecular dynamics. Proteins 2015; 83:2279-92. [PMID: 26476100 DOI: 10.1002/prot.24947] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/26/2015] [Revised: 09/11/2015] [Accepted: 09/29/2015] [Indexed: 12/26/2022]
Abstract
The most successful protein structure prediction methods to date have been template-based modeling (TBM) or homology modeling, which predicts protein structure based on experimental structures. These high accuracy predictions sometimes retain structural errors due to incorrect templates or a lack of accurate templates in the case of low sequence similarity, making these structures inadequate in drug-design studies or molecular dynamics simulations. We have developed a new physics-based approach to the protein refinement problem by mimicking the mechanism of chaperones that rehabilitate misfolded proteins. The template structure is unfolded by selectively (targeted) pulling on different portions of the protein using the geometry-based technique FRODA, and then refolded using hierarchically restrained replica exchange molecular dynamics simulations (hr-REMD). FRODA unfolding is used to create a diverse set of topologies for surveying near native-like structures from a template and to provide a set of persistent contacts to be employed during re-folding. We have tested our approach on 13 previous CASP targets and observed that this method of folding an ensemble of partially unfolded structures, through the hierarchical addition of contact restraints (that is, first local and then nonlocal interactions), leads to a refolding of the structure along with refinement in most cases (12/13). Although this approach yields refined models through advancement in sampling, the task of blind selection of the best refined models still needs to be solved. Overall, the method can be useful for improved sampling for low resolution models where certain portions of the structure are incorrectly modeled.
Affiliation(s)
- Avishek Kumar
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, Arizona
- Paul Campitelli
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, Arizona
- M F Thorpe
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, Arizona; Rudolf Peierls Center for Theoretical Physics, University of Oxford, Oxford, OX1 3NP, United Kingdom
- S Banu Ozkan
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, Arizona
|
26
|
Della Corte D, Wildberg A, Schröder GF. Protein structure refinement with adaptively restrained homologous replicas. Proteins 2015; 84 Suppl 1:302-13. [PMID: 26441154 DOI: 10.1002/prot.24939] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/28/2015] [Revised: 09/02/2015] [Accepted: 09/29/2015] [Indexed: 12/27/2022]
Abstract
A novel protein refinement protocol is presented which utilizes molecular dynamics (MD) simulations of an ensemble of adaptively restrained homologous replicas. This approach adds evolutionary information to the force field and reduces random conformational fluctuations by coupling of several replicas. It is shown that this protocol refines the majority of models from the CASP11 refinement category and that larger conformational changes of the starting structure are possible than with current state of the art methods. The performance of this protocol in the CASP11 experiment is discussed. We found that the quality of the refined model is correlated with the structural variance of the coupled replicas, which therefore provides a good estimator of model quality. Furthermore, some remarkable refinement results are discussed in detail. Proteins 2016; 84(Suppl 1):302-313. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Dennis Della Corte
- Institute of Complex Systems (ICS-6), Forschungszentrum Jülich, Jülich, 52425, Germany
- André Wildberg
- Institute of Complex Systems (ICS-6), Forschungszentrum Jülich, Jülich, 52425, Germany
- Gunnar F Schröder
- Institute of Complex Systems (ICS-6), Forschungszentrum Jülich, Jülich, 52425, Germany; Physics Department, University of Düsseldorf, Düsseldorf, 40225, Germany
|
27
|
Park H, DiMaio F, Baker D. CASP11 refinement experiments with ROSETTA. Proteins 2015; 84 Suppl 1:314-22. [PMID: 26205421 PMCID: PMC4724349 DOI: 10.1002/prot.24862] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 05/15/2015] [Revised: 07/19/2015] [Accepted: 07/21/2015] [Indexed: 12/28/2022]
Abstract
We report new Rosetta-based approaches to tackling the major issues that confound protein structure refinement, and the testing of these approaches in the CASP11 experiment. Automated refinement protocols were developed that integrate a range of sampling methods using parallel computation and multiobjective optimization. In CASP11, we used a more aggressive large-scale structure rebuilding approach for poor starting models, and a less aggressive local rebuilding plus core refinement approach for starting models likely to be closer to the native structure. The more incorrectly modeled a structure was predicted to be, the more it was allowed to vary during refinement. The CASP11 experiment revealed strengths and weaknesses of the approaches: the high-resolution strategy incorporating local rebuilding with core refinement consistently improved starting structures, while the low-resolution strategy incorporating the reconstruction of large parts of the structures improved starting models in some cases but often considerably worsened them, largely because of model selection issues. Overall, the results suggest the high-resolution refinement protocol is a promising method orthogonal to other approaches, while the low-resolution refinement method clearly requires further development. Proteins 2016; 84(Suppl 1):314-322. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, Washington, 98195; Institute for Protein Design, University of Washington, Seattle, Washington, 98195
- Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, Washington, 98195; Institute for Protein Design, University of Washington, Seattle, Washington, 98195
- David Baker
- Department of Biochemistry, University of Washington, Seattle, Washington, 98195; Institute for Protein Design, University of Washington, Seattle, Washington, 98195; Howard Hughes Medical Institute, University of Washington, Seattle, Washington, 98195
|
28
|
Xun S, Jiang F, Wu YD. Significant Refinement of Protein Structure Models Using a Residue-Specific Force Field. J Chem Theory Comput 2015; 11:1949-56. [PMID: 26574396 DOI: 10.1021/acs.jctc.5b00029] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 01/12/2023]
Abstract
An important application of all-atom explicit-solvent molecular dynamics (MD) simulations is the refinement of protein structures from low-resolution experiments or template-based modeling. A critical requirement is that the native structure is stable with the force field. We have applied a recently developed residue-specific force field, RSFF1, to a set of 30 refinement targets from recent CASP experiments. Starting from their experimental structures, 1.0 μs unrestrained simulations at 298 K retain most of the native structures quite well except for a few flexible terminals and long internal loops. Starting from each homology model, a 150 ns MD simulation at 380 K generates the best RMSD improvement of 0.85 Å on average. The structural improvements roughly correlate with the RMSD of the initial homology models, indicating possible consistent structure refinement. Finally, targets TR614 and TR624 have been subjected to long-time replica-exchange MD simulations. Significant structural improvements are generated, with RMSD of 1.91 and 1.36 Å with respect to their crystal structures. Thus, it is possible to achieve realistic refinement of protein structure models to near-experimental accuracy, using accurate force field with sufficient conformational sampling.
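The RMSD values quoted in this abstract are measured after superimposing a model onto the reference structure. The following is a minimal sketch of that calculation using the Kabsch algorithm; the function name `kabsch_rmsd` and the assumption that both inputs are already matched, equally sized (N, 3) coordinate arrays (for example Cα atoms in Å) are illustrative choices, not details taken from the paper.

```python
import numpy as np


def kabsch_rmsd(model, reference):
    """RMSD after optimal rigid-body superposition (Kabsch algorithm).

    Both inputs are assumed to be (N, 3) arrays listing the same atoms
    in the same order; the result has the units of the input coordinates.
    """
    P = np.asarray(model, dtype=float)
    Q = np.asarray(reference, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("expected two (N, 3) arrays with matching atoms")

    # Remove translation by centering both coordinate sets.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate the model onto the reference and measure the deviation.
    diff = P @ R.T - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

An "RMSD improvement" in this sense would be `kabsch_rmsd(initial, native) - kabsch_rmsd(refined, native)`, where the three arrays are hypothetical coordinate sets for the starting model, the refined model, and the experimental structure.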
Affiliation(s)
- Sangni Xun
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Fan Jiang
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Yun-Dong Wu
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China; College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
|
29
|
Xue Y, Skrynnikov NR. Ensemble MD simulations restrained via crystallographic data: accurate structure leads to accurate dynamics. Protein Sci 2015; 23:488-507. [PMID: 24452989 DOI: 10.1002/pro.2433] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 07/08/2013] [Revised: 01/06/2014] [Accepted: 01/18/2014] [Indexed: 11/07/2022]
Abstract
Currently, the best existing molecular dynamics (MD) force fields cannot accurately reproduce the global free-energy minimum which realizes the experimental protein structure. As a result, long MD trajectories tend to drift away from the starting coordinates (e.g., crystallographic structures). To address this problem, we have devised a new simulation strategy aimed at protein crystals. An MD simulation of protein crystal is essentially an ensemble simulation involving multiple protein molecules in a crystal unit cell (or a block of unit cells). To ensure that average protein coordinates remain correct during the simulation, we introduced crystallography-based restraints into the MD protocol. Because these restraints are aimed at the ensemble-average structure, they have only minimal impact on conformational dynamics of the individual protein molecules. So long as the average structure remains reasonable, the proteins move in a native-like fashion as dictated by the original force field. To validate this approach, we have used the data from solid-state NMR spectroscopy, which is the orthogonal experimental technique uniquely sensitive to protein local dynamics. The new method has been tested on the well-established model protein, ubiquitin. The ensemble-restrained MD simulations produced lower crystallographic R factors than conventional simulations; they also led to more accurate predictions for crystallographic temperature factors, solid-state chemical shifts, and backbone order parameters. The predictions for ¹⁵N R₁ relaxation rates are at least as accurate as those obtained from conventional simulations. Taken together, these results suggest that the presented trajectories may be among the most realistic protein MD simulations ever reported. In this context, the ensemble restraints based on high-resolution crystallographic data can be viewed as protein-specific empirical corrections to the standard force fields.
Affiliation(s)
- Yi Xue
- Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana, 47907-2084, USA
|
30
|
Khoury GA, Liwo A, Khatib F, Zhou H, Chopra G, Bacardit J, Bortot LO, Faccioli RA, Deng X, He Y, Krupa P, Li J, Mozolewska MA, Sieradzan AK, Smadbeck J, Wirecki T, Cooper S, Flatten J, Xu K, Baker D, Cheng J, Delbem ACB, Floudas CA, Keasar C, Levitt M, Popović Z, Scheraga HA, Skolnick J, Crivelli SN, Players F. WeFold: a coopetition for protein structure prediction. Proteins 2014; 82:1850-68. [PMID: 24677212 PMCID: PMC4249725 DOI: 10.1002/prot.24538] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 08/23/2013] [Revised: 01/25/2014] [Accepted: 02/08/2014] [Indexed: 12/19/2022]
Abstract
The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social-media based worldwide collaborative effort, named WeFold, was undertaken by 13 labs. During the collaboration, the laboratories were simultaneously competing with each other. Here, we present the first attempt at "coopetition" in scientific research applied to the protein structure prediction and refinement problems. The coopetition was possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures are publicly accessible at http://www.wefold.org.
Affiliation(s)
- George A. Khoury
- Department of Chemical and Biological Engineering, Princeton University, USA
- Adam Liwo
- Faculty of Chemistry, University of Gdansk, Poland
- Firas Khatib
- Department of Biochemistry, University of Washington, USA
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, USA
- Gaurav Chopra
- Department of Structural Biology, School of Medicine, Stanford University, USA
- Diabetes Center, School of Medicine, University of California San Francisco (UCSF), USA
- Jaume Bacardit
- School of Computing Science, Newcastle University, United Kingdom
- Leandro O. Bortot
- Laboratory of Biological Physics, Faculty of Pharmaceutical Sciences at Ribeirão Preto, University of São Paulo, Brazil
- Rodrigo A. Faccioli
- Institute of Mathematical and Computer Sciences, University of São Paulo, Brazil
- Xin Deng
- Department of Computer Science, University of Missouri, USA
- Yi He
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Pawel Krupa
- Faculty of Chemistry, University of Gdansk, Poland
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Jilong Li
- Department of Computer Science, University of Missouri, USA
- Magdalena A. Mozolewska
- Faculty of Chemistry, University of Gdansk, Poland
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, USA
- Tomasz Wirecki
- Faculty of Chemistry, University of Gdansk, Poland
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Seth Cooper
- Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Jeff Flatten
- Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Kefan Xu
- Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- David Baker
- Department of Biochemistry, University of Washington, USA
- Jianlin Cheng
- Department of Computer Science, University of Missouri, USA
- Chen Keasar
- Departments of Computer Science and Life Sciences, Ben Gurion University of the Negev, Israel
- Michael Levitt
- Department of Structural Biology, School of Medicine, Stanford University, USA
- Zoran Popović
- Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Harold A. Scheraga
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, USA
|
31
|
Chen Y, Shang Y, Xu D. Multi-Dimensional Scaling and MODELLER-Based Evolutionary Algorithms for Protein Model Refinement. PROCEEDINGS OF THE ... CONGRESS ON EVOLUTIONARY COMPUTATION. CONGRESS ON EVOLUTIONARY COMPUTATION 2014; 2014:1038-1045. [PMID: 25844403 PMCID: PMC4380876 DOI: 10.1109/cec.2014.6900443] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/21/2023]
Abstract
Protein structure prediction, i.e., computationally predicting the three-dimensional structure of a protein from its primary sequence, is one of the most important and challenging problems in bioinformatics. Model refinement is a key step in the prediction process, where improved structures are constructed based on a pool of initially generated models. Since the refinement category was added to the biennial Critical Assessment of Structure Prediction (CASP) in 2008, CASP results show that it is a challenge for existing model refinement methods to improve model quality consistently. This paper presents three evolutionary algorithms for protein model refinement, in which multidimensional scaling (MDS), the MODELLER software, and a hybrid of both are used as crossover operators, respectively. The MDS-based method takes a purely geometrical approach and generates a child model by combining the contact maps of multiple parents. The MODELLER-based method takes a statistical and energy minimization approach, and uses the remodeling module in MODELLER program to generate new models from multiple parents. The hybrid method first generates models using the MDS-based method and then runs them through the MODELLER-based method, aiming at combining the strength of both. Promising results have been obtained in experiments using CASP datasets. The MDS-based method improved the best of a pool of predicted models in terms of the global distance test score (GDT-TS) in 9 out of 16 test targets.
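The idea of an MDS-style crossover can be illustrated with a small sketch: inter-residue distance matrices from several parent models are combined, and classical multidimensional scaling turns the combined matrix back into 3D coordinates. This is an illustration under stated assumptions rather than the authors' operator: simple averaging of full distance matrices stands in for whatever contact-map combination the paper uses, and the function names `distance_matrix`, `classical_mds`, and `child_from_parents` are hypothetical.

```python
import numpy as np


def distance_matrix(coords):
    """Pairwise Euclidean distances for an (N, 3) coordinate array."""
    X = np.asarray(coords, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(D, dim=3):
    """Embed a distance matrix in `dim` dimensions (classical/Torgerson MDS)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # Keep the `dim` largest eigenpairs; clip small negatives from noise.
    w, v = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]
    w_top = np.clip(w[idx], 0.0, None)
    return v[:, idx] * np.sqrt(w_top)


def child_from_parents(parents, dim=3):
    """Assumed crossover: average the parents' distance matrices, then embed."""
    D_mean = np.mean([distance_matrix(p) for p in parents], axis=0)
    return classical_mds(D_mean, dim)
```

Distances alone do not fix handedness, so the embedding from `classical_mds` may be the mirror image of a valid protein fold; a practical pipeline would need a chirality check, which this sketch omits.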
Affiliation(s)
- Yan Chen
- Yan Chen, Yi Shang, and Dong Xu are with the Department of Computer Science, University of Missouri, Columbia, MO 65211 USA. Dong Xu is also with the Christopher S. Bond Life Science Center, University of Missouri.
- Yi Shang
- Yan Chen, Yi Shang, and Dong Xu are with the Department of Computer Science, University of Missouri, Columbia, MO 65211 USA. Dong Xu is also with the Christopher S. Bond Life Science Center, University of Missouri.
- Dong Xu
- Yan Chen, Yi Shang, and Dong Xu are with the Department of Computer Science, University of Missouri, Columbia, MO 65211 USA. Dong Xu is also with the Christopher S. Bond Life Science Center, University of Missouri.
|
32
|
Mirjalili V, Noyes K, Feig M. Physics-based protein structure refinement through multiple molecular dynamics trajectories and structure averaging. Proteins 2014; 82 Suppl 2:196-207. [PMID: 23737254 PMCID: PMC4212311 DOI: 10.1002/prot.24336] [Citation(s) in RCA: 92] [Impact Index Per Article: 9.2] [Received: 03/21/2013] [Revised: 04/30/2013] [Accepted: 05/09/2013] [Indexed: 12/26/2022]
Abstract
We used molecular dynamics (MD) simulations for structure refinement of Critical Assessment of Techniques for Protein Structure Prediction 10 (CASP10) targets. Refinement was achieved by selecting structures from the MD-based ensembles followed by structural averaging. The overall performance of this method in CASP10 is described, and specific aspects are analyzed in detail to provide insight into key components. In particular, the use of different restraint types, sampling from multiple short simulations versus a single long simulation, the success of a quality assessment criterion, the application of scoring versus averaging, and the impact of a final refinement step are discussed in detail.
Affiliation(s)
- Vahid Mirjalili
- Department of Mechanical Engineering, Michigan State University, East Lansing, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Keenan Noyes
- Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
|
33
|
Nugent T, Cozzetto D, Jones DT. Evaluation of predictions in the CASP10 model refinement category. Proteins 2014; 82 Suppl 2:98-111. [PMID: 23900810 PMCID: PMC4282348 DOI: 10.1002/prot.24377] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Received: 04/10/2013] [Revised: 06/19/2013] [Accepted: 06/28/2013] [Indexed: 12/24/2022]
Abstract
Here we report on the assessment results of the third experiment to evaluate the state of the art in protein model refinement, where participants were invited to improve the accuracy of initial protein models for 27 targets. Using an array of complementary evaluation measures, we find that five groups performed better than the naïve (null) method—a marked improvement over CASP9, although only three were significantly better. The leading groups also demonstrated the ability to consistently improve both backbone and side chain positioning, while other groups reliably enhanced other aspects of protein physicality. The top-ranked group succeeded in improving the backbone conformation in almost 90% of targets, suggesting a strategy that for the first time in CASP refinement is successful in a clear majority of cases. A number of issues remain unsolved: the majority of groups still fail to improve the quality of the starting models; even successful groups are only able to make modest improvements; and no prediction is more similar to the native structure than to the starting model. Successful refinement attempts also often go unrecognized, as suggested by the relatively larger improvements when predictions not submitted as model 1 are also considered. Proteins 2014; 82(Suppl 2):98–111.
Affiliation(s)
- Timothy Nugent
- Department of Computer Science Bioinformatics Group, University College London, London, WC1E 6BT, United Kingdom
|
34
|
Khoury GA, Tamamis P, Pinnaduwage N, Smadbeck J, Kieslich CA, Floudas CA. Princeton_TIGRESS: protein geometry refinement using simulations and support vector machines. Proteins 2013; 82:794-814. [PMID: 24174311 DOI: 10.1002/prot.24459] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 07/17/2013] [Revised: 10/18/2013] [Accepted: 10/22/2013] [Indexed: 12/30/2022]
Abstract
Protein structure refinement aims to perform a set of operations given a predicted structure to improve model quality and accuracy with respect to the native in a blind fashion. Despite the numerous computational approaches to the protein refinement problem reported in the previous three CASPs, an overwhelming majority of methods degrade models rather than improve them. We initially developed a method tested using blind predictions during CASP10 which was officially ranked in 5th place among all methods in the refinement category. Here, we present Princeton_TIGRESS, which when benchmarked on all CASP 7,8,9, and 10 refinement targets, simultaneously increased GDT_TS 76% of the time with an average improvement of 0.83 GDT_TS points per structure. The method was additionally benchmarked on models produced by top performing three-dimensional structure prediction servers during CASP10. The robustness of the Princeton_TIGRESS protocol was also tested for different random seeds. We make the Princeton_TIGRESS refinement protocol freely available as a web server at http://atlas.princeton.edu/refinement. Using this protocol, one can consistently refine a prediction to help bridge the gap between a predicted structure and the actual native structure.
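The two benchmark statistics quoted in this abstract (how often GDT_TS increases, and by how much on average) can be made concrete with a short sketch. GDT_TS averages the percentages of residues whose Cα atoms lie within 1, 2, 4, and 8 Å of the native positions. The version below is a simplification that assumes a single, fixed superposition has already produced the per-residue distances, whereas the full GDT procedure searches over many superpositions for each cutoff; the function names and inputs are hypothetical.

```python
import numpy as np


def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0-100) from per-residue CA distances in Angstroms.

    Assumes the distances come from one fixed superposition; the real GDT
    algorithm maximises the residue count separately for every cutoff.
    """
    d = np.asarray(distances, dtype=float)
    fractions = [(d <= c).mean() for c in cutoffs]
    return 100.0 * float(np.mean(fractions))


def summarize_refinement(start_scores, refined_scores):
    """Return (fraction of targets improved, mean score change)."""
    delta = np.asarray(refined_scores, dtype=float) - np.asarray(start_scores, dtype=float)
    return float((delta > 0).mean()), float(delta.mean())


# Four residues at 0.5, 1.5, 3.0 and 9.0 A from their native positions.
print(gdt_ts([0.5, 1.5, 3.0, 9.0]))  # 56.25
```

Applied to a benchmark, `summarize_refinement` would take the GDT_TS of each starting model and of its refined counterpart, yielding the kind of "improved X% of the time, by Y points on average" summary reported above.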
Affiliation(s)
- George A Khoury
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, 08540
|
35
|
Khoury GA, Smadbeck J, Kieslich CA, Floudas CA. Protein folding and de novo protein design for biotechnological applications. Trends Biotechnol 2013; 32:99-109. [PMID: 24268901 DOI: 10.1016/j.tibtech.2013.10.008] [Citation(s) in RCA: 101] [Impact Index Per Article: 9.2] [Received: 09/05/2013] [Revised: 10/10/2013] [Accepted: 10/18/2013] [Indexed: 11/19/2022]
Abstract
In the postgenomic era, the medical/biological fields are advancing faster than ever. However, before the power of full-genome sequencing can be fully realized, the connection between amino acid sequence and protein structure, known as the protein folding problem, needs to be elucidated. The protein folding problem remains elusive, with significant difficulties still arising when modeling amino acid sequences lacking an identifiable template. Understanding protein folding will allow for unforeseen advances in protein design; often referred to as the inverse protein folding problem. Despite challenges in protein folding, de novo protein design has recently demonstrated significant success via computational techniques. We review advances and challenges in protein structure prediction and de novo protein design, and highlight their interplay in successful biotechnological applications.
Affiliation(s)
- George A Khoury
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- Chris A Kieslich
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- Christodoulos A Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
|
36
|
Nugent T, Jones DT. Membrane protein orientation and refinement using a knowledge-based statistical potential. BMC Bioinformatics 2013; 14:276. [PMID: 24047460 PMCID: PMC3852961 DOI: 10.1186/1471-2105-14-276] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 05/03/2013] [Accepted: 09/05/2013] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Recent increases in the number of deposited membrane protein crystal structures necessitate the use of automated computational tools to position them within the lipid bilayer. Identifying the correct orientation allows us to study the complex relationship between sequence, structure and the lipid environment, which is otherwise challenging to investigate using experimental techniques due to the difficulty in crystallising membrane proteins embedded within intact membranes. RESULTS We have developed a knowledge-based membrane potential, calculated by the statistical analysis of transmembrane protein structures, coupled with a combination of genetic and direct search algorithms, and demonstrate its use in positioning proteins in membranes, refinement of membrane protein models and in decoy discrimination. CONCLUSIONS Our method is able to quickly and accurately orientate both alpha-helical and beta-barrel membrane proteins within the lipid bilayer, showing closer agreement with experimentally determined values than existing approaches. We also demonstrate both consistent and significant refinement of membrane protein models and the effective discrimination between native and decoy structures. Source code is available under an open source license from http://bioinf.cs.ucl.ac.uk/downloads/memembed/.
Affiliation(s)
- Timothy Nugent
- Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK
- David T Jones
- Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK
|
37
|
Bhattacharya D, Cheng J. i3Drefine software for protein 3D structure refinement and its assessment in CASP10. PLoS One 2013; 8:e69648. [PMID: 23894517 PMCID: PMC3716612 DOI: 10.1371/journal.pone.0069648] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 04/25/2013] [Accepted: 06/13/2013] [Indexed: 12/25/2022] Open
Abstract
Protein structure refinement refers to the process of improving the qualities of protein structures during structure modeling processes to bring them closer to their native states. Structure refinement has been drawing increasing attention in the community-wide Critical Assessment of techniques for Protein Structure prediction (CASP) experiments since its addition in 8th CASP experiment. During the 9th and recently concluded 10th CASP experiments, a consistent growth in number of refinement targets and participating groups has been witnessed. Yet, protein structure refinement still remains a largely unsolved problem with majority of participating groups in CASP refinement category failed to consistently improve the quality of structures issued for refinement. In order to alleviate this need, we developed a completely automated and computationally efficient protein 3D structure refinement method, i3Drefine, based on an iterative and highly convergent energy minimization algorithm with a powerful all-atom composite physics and knowledge-based force fields and hydrogen bonding (HB) network optimization technique. In the recent community-wide blind experiment, CASP10, i3Drefine (as ‘MULTICOM-CONSTRUCT’) was ranked as the best method in the server section as per the official assessment of CASP10 experiment. Here we provide the community with free access to i3Drefine software and systematically analyse the performance of i3Drefine in strict blind mode on the refinement targets issued in CASP10 refinement category and compare with other state-of-the-art refinement methods participating in CASP10. Our analysis demonstrates that i3Drefine is only fully-automated server participating in CASP10 exhibiting consistent improvement over the initial structures in both global and local structural quality metrics. Executable version of i3Drefine is freely available at http://protein.rnet.missouri.edu/i3drefine/.
Affiliation(s)
- Debswapna Bhattacharya
- Department of Computer Science, University of Missouri, Columbia, Missouri, United States of America
- Jianlin Cheng
- Department of Computer Science, Informatics Institute, Bond Life Science Center, University of Missouri, Columbia, Missouri, United States of America
|
38
|
Heo L, Park H, Seok C. GalaxyRefine: Protein structure refinement driven by side-chain repacking. Nucleic Acids Res 2013; 41:W384-8. [PMID: 23737448 PMCID: PMC3692086 DOI: 10.1093/nar/gkt458] [Citation(s) in RCA: 658] [Impact Index Per Article: 59.8] [Indexed: 11/18/2022] Open
Abstract
The quality of model structures generated by contemporary protein structure prediction methods strongly depends on the degree of similarity between the target and available template structures. Therefore, the importance of improving template-based model structures beyond the accuracy available from template information has been emphasized in the structure prediction community. The GalaxyRefine web server, freely available at http://galaxy.seoklab.org/refine, is based on a refinement method that has been successfully tested in CASP10. The method first rebuilds side chains and performs side-chain repacking and subsequent overall structure relaxation by molecular dynamics simulation. According to the CASP10 assessment, this method showed the best performance in improving the local structure quality. The method can improve both global and local structure quality on average, when used for refining the models generated by state-of-the-art protein structure prediction servers.
Affiliation(s)
- Lim Heo
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
|
39
|
Adams PD, Baker D, Brunger AT, Das R, DiMaio F, Read RJ, Richardson DC, Richardson JS, Terwilliger TC. Advances, interactions, and future developments in the CNS, Phenix, and Rosetta structural biology software systems. Annu Rev Biophys 2013; 42:265-87. [PMID: 23451892 DOI: 10.1146/annurev-biophys-083012-130253] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Indexed: 01/15/2023]
Abstract
Advances in our understanding of macromolecular structure come from experimental methods, such as X-ray crystallography, and also computational analysis of the growing number of atomic models obtained from such experiments. The latter analyses have made it possible to develop powerful tools for structure prediction and optimization in the absence of experimental data. In recent years, a synergy between these computational methods for crystallographic structure determination and structure prediction and optimization has begun to be exploited. We review some of the advances in the algorithms used for crystallographic structure determination in the Phenix and Crystallography & NMR System software packages and describe how methods from ab initio structure prediction and refinement in Rosetta have been applied to challenging crystallographic problems. The prospects for future improvement of these methods are discussed.
Affiliation(s)
- Paul D Adams
- Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
|
40
|
Mirjalili V, Feig M. Protein Structure Refinement through Structure Selection and Averaging from Molecular Dynamics Ensembles. J Chem Theory Comput 2013; 9:1294-1303. [PMID: 23526422 PMCID: PMC3603382 DOI: 10.1021/ct300962x] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Indexed: 01/06/2023]
Abstract
A molecular dynamics (MD) simulation based protocol for structure refinement of template-based model predictions is described. The protocol involves the application of restraints, ensemble averaging of selected subsets, interpolation between initial and refined structures, and assessment of refinement success. It is found that sub-microsecond MD-based sampling when combined with ensemble averaging can produce moderate but consistent refinement for most systems in the CASP targets considered here.
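Two of the steps named in this abstract, ensemble averaging of a selected subset and interpolation between the initial and refined structures, can be sketched in a few lines. The sketch assumes that all MD frames have already been superimposed onto a common reference, that a lower score means a better frame, and that the 25% selection fraction and the function names are illustrative choices; none of these details come from the paper.

```python
import numpy as np


def average_selected(frames, scores, keep_fraction=0.25):
    """Average the coordinates of the best-scoring subset of frames.

    frames: (F, N, 3) array of pre-superimposed snapshots (assumption).
    scores: length-F array where lower is better (assumption).
    """
    frames = np.asarray(frames, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n_keep = max(1, int(round(keep_fraction * len(frames))))
    best = np.argsort(scores)[:n_keep]
    return frames[best].mean(axis=0)


def interpolate(initial, refined, weight=0.5):
    """Linear interpolation: weight=0 returns initial, weight=1 returns refined."""
    initial = np.asarray(initial, dtype=float)
    refined = np.asarray(refined, dtype=float)
    return (1.0 - weight) * initial + weight * refined
```

Averaged Cartesian coordinates can contain distorted bond lengths and angles, so a result such as `interpolate(start, average_selected(frames, scores))` would normally be followed by a short energy minimization, which is outside the scope of this sketch.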
Affiliation(s)
- Vahid Mirjalili
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Department of Mechanical Engineering, Michigan State University, East Lansing, MI 48824, USA
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
|
41
|
Bhattacharya D, Cheng J. 3Drefine: consistent protein structure refinement by optimizing hydrogen bonding network and atomic-level energy minimization. Proteins 2013; 81:119-31. [PMID: 22927229 PMCID: PMC3634918 DOI: 10.1002/prot.24167] [Citation(s) in RCA: 122] [Impact Index Per Article: 11.1] [Received: 05/02/2012] [Revised: 07/26/2012] [Accepted: 08/17/2012] [Indexed: 12/27/2022]
Abstract
One of the major limitations of computational protein structure prediction is the deviation of predicted models from their experimentally derived true, native structures. The limitations often hinder the possibility of applying computational protein structure prediction methods in biochemical assignment and drug design that are very sensitive to structural details. Refinement of these low-resolution predicted models to high-resolution structures close to the native state, however, has proven to be extremely challenging. Thus, protein structure refinement remains a largely unsolved problem. Critical assessment of techniques for protein structure prediction (CASP) specifically indicated that most predictors participating in the refinement category still did not consistently improve model quality. Here, we propose a two-step refinement protocol, called 3Drefine, to consistently bring the initial model closer to the native structure. The first step is based on optimization of the hydrogen bonding (HB) network and the second step applies atomic-level energy minimization on the optimized model using a composite physics- and knowledge-based force field. The approach has been evaluated on the CASP benchmark data and it exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine method is also computationally inexpensive, consuming only a few minutes of CPU time to refine a protein of typical length (300 residues). The 3Drefine web server is freely available at http://sysbio.rnet.missouri.edu/3Drefine/.
Affiliation(s)
- Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
- Informatics Institute, University of Missouri, Columbia, MO 65211, USA
- Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA
|
42
|
Kaufmann KW, Meiler J. Using RosettaLigand for small molecule docking into comparative models. PLoS One 2012; 7:e50769. [PMID: 23239984 PMCID: PMC3519832 DOI: 10.1371/journal.pone.0050769] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 07/30/2012] [Accepted: 10/24/2012] [Indexed: 11/18/2022]
Abstract
Computational small molecule docking into comparative models of proteins is widely used to query protein function and in the development of small molecule therapeutics. We benchmark RosettaLigand docking into comparative models for nine proteins built during CASP8 that contain ligands. We supplement the study with 21 additional protein/ligand complexes to cover a wider space of chemotypes. During a full docking run in 21 of the 30 cases, RosettaLigand successfully found a native-like binding mode among the top ten scoring binding modes. From the benchmark cases we find that careful template selection based on ligand occupancy provides the best chance of success while overall sequence identity between template and target does not appear to improve results. We also find that binding energy normalized by atom number is often less than -0.4 in native-like binding modes.
Affiliation(s)
- Kristian W. Kaufmann
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Institute of Chemical Biology, Vanderbilt University, Nashville, Tennessee, United States of America
|
43
|
Olson MA, Lee MS. Structure refinement of protein model decoys requires accurate side-chain placement. Proteins 2012; 81:469-78. [PMID: 23070940 DOI: 10.1002/prot.24204] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 05/22/2012] [Revised: 09/18/2012] [Accepted: 10/02/2012] [Indexed: 11/10/2022]
Abstract
In this study, the application of temperature-based replica-exchange (T-ReX) simulations for structure refinement of decoys taken from the I-TASSER dataset was examined. A set of eight nonredundant proteins was investigated using self-guided Langevin dynamics (SGLD) with a generalized Born implicit solvent model to sample conformational space. For two of the protein test cases, a comparison of the SGLD/T-ReX method with that of a hybrid explicit/implicit solvent molecular dynamics T-ReX simulation model is provided. Additionally, the effect of side-chain placement among the starting decoy structures, using alternative rotamer conformations taken from the SCWRL4 modeling program, was investigated. The simulation results showed that, despite having near-native backbone conformations among the starting decoys, the determinant of their refinement is side-chain packing to a level that satisfies a minimum threshold of native contacts to allow efficient excursions toward the downhill refinement regime on the energy landscape. By repacking using SCWRL4 and by applying the RWplus statistical potential for structure identification, the SGLD/T-ReX simulations achieved refinement to an average of 38% increase in the number of native contacts relative to the original I-TASSER decoy sets and a 25% reduction in values of Cα root-mean-square deviation. The hybrid model succeeded in obtaining a sharper funnel to low-energy states for a modeled target than the implicit solvent SGLD model; yet, structure identification remained roughly the same. Without meeting a threshold of near-native packing of side chains, the T-ReX simulations degrade the accuracy of the decoys, and subsequently, refinement becomes tantamount to the protein folding problem.
Affiliation(s)
- Mark A Olson
- Department of Cell Biology and Biochemistry, USAMRIID, Frederick, Maryland 21702, USA.
|
44
|
Chitsaz M, Mayo SL. GRID: a high-resolution protein structure refinement algorithm. J Comput Chem 2012; 34:445-50. [PMID: 23065773 DOI: 10.1002/jcc.23151] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/12/2012] [Revised: 07/31/2012] [Accepted: 08/27/2012] [Indexed: 12/27/2022]
Abstract
The energy-based refinement of protein structures generated by fold prediction algorithms to atomic-level accuracy remains a major challenge in structural biology. Energy-based refinement is mainly dependent on two components: (1) sufficiently accurate force fields, and (2) efficient conformational space search algorithms. Focusing on the latter, we developed a high-resolution refinement algorithm called GRID. It takes a three-dimensional protein structure as input and, using an all-atom force field, attempts to improve the energy of the structure by systematically perturbing backbone dihedrals and side-chain rotamer conformations. We compare GRID to Backrub, a stochastic algorithm that has been shown to predict a significant fraction of the conformational changes that occur with point mutations. We applied GRID and Backrub to 10 high-resolution (≤ 2.8 Å) crystal structures from the Protein Data Bank and measured the energy improvements obtained and the computation times required to achieve them. GRID resulted in energy improvements that were significantly better than those attained by Backrub while expending about the same amount of computational resources. GRID resulted in relaxed structures that had slightly higher backbone RMSDs compared to Backrub relative to the starting crystal structures. The average RMSD was 0.25 ± 0.02 Å for GRID versus 0.14 ± 0.04 Å for Backrub. These relatively minor deviations indicate that both algorithms generate structures that retain their original topologies, as expected given the nature of the algorithms.
Affiliation(s)
- Mohsen Chitsaz
- Biochemistry and Molecular Biophysics Option, California Institute of Technology, Pasadena, California 91125, USA
|
45
|
Gniewek P, Kolinski A, Jernigan RL, Kloczkowski A. Elastic network normal modes provide a basis for protein structure refinement. J Chem Phys 2012; 136:195101. [PMID: 22612113 DOI: 10.1063/1.4710986] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 01/13/2023]
Abstract
It is well recognized that thermal motions of atoms in the protein native state, the fluctuations about the minimum of the global free energy, are well reproduced by the simple elastic network models (ENMs) such as the anisotropic network model (ANM). Elastic network models represent protein dynamics as vibrations of a network of nodes (usually represented by positions of the heavy atoms or by the Cα atoms only for coarse-grained representations) in which the spatially close nodes are connected by harmonic springs. These models provide a reliable representation of the fluctuational dynamics of proteins and RNA, and explain various conformational changes in protein structures including those important for ligand binding. In the present paper, we study the problem of protein structure refinement by analyzing thermal motions of proteins in non-native states. We represent the conformational space close to the native state by a set of decoys generated by the I-TASSER protein structure prediction server utilizing template-free modeling. The protein substates are selected by hierarchical structure clustering. The main finding is that thermal motions for some substates overlap significantly with the deformations necessary to reach the native state. Additionally, more mobile residues yield higher overlaps with the required deformations than do the less mobile ones. These findings suggest that structural refinement of poorly resolved protein models can be significantly enhanced by reduction of the conformational space to the motions imposed by the dominant normal modes.
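The overlap between a normal mode and the deformation needed to reach the native state can be sketched as a normalized dot product, which is the definition commonly used in normal mode analysis. Whether the paper uses exactly this form is an assumption, and the function name and the flattened 3N-vector layout are illustrative.

```python
import math
from typing import Sequence


def mode_overlap(mode: Sequence[float], deformation: Sequence[float]) -> float:
    """Absolute cosine between a mode vector and a deformation vector.

    Both inputs are flattened 3N-component vectors (x1, y1, z1, x2, ...).
    Returns 1.0 when the mode points exactly along the deformation and
    0.0 when the two are orthogonal.
    """
    if len(mode) != len(deformation):
        raise ValueError("vectors must have the same length")
    dot = sum(m * d for m, d in zip(mode, deformation))
    norm = math.sqrt(sum(m * m for m in mode)) * math.sqrt(
        sum(d * d for d in deformation)
    )
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return abs(dot) / norm
```

The absolute value is taken because the sign of a normal mode is arbitrary: a mode and its negation describe the same vibration.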
Affiliation(s)
- Pawel Gniewek
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
|
46
|
Raval A, Piana S, Eastwood MP, Dror RO, Shaw DE. Refinement of protein structure homology models via long, all-atom molecular dynamics simulations. Proteins 2012; 80:2071-9. [PMID: 22513870 DOI: 10.1002/prot.24098] [Citation(s) in RCA: 183] [Impact Index Per Article: 15.3] [Received: 02/07/2012] [Revised: 04/03/2012] [Accepted: 04/11/2012] [Indexed: 11/07/2022]
Abstract
Accurate computational prediction of protein structure represents a longstanding challenge in molecular biology and structure-based drug design. Although homology modeling techniques are widely used to produce low-resolution models, refining these models to high resolution has proven difficult. With long enough simulations and sufficiently accurate force fields, molecular dynamics (MD) simulations should in principle allow such refinement, but efforts to refine homology models using MD have for the most part yielded disappointing results. It has thus far been unclear whether MD-based refinement is limited primarily by accessible simulation timescales, force field accuracy, or both. Here, we examine MD as a technique for homology model refinement using all-atom simulations, each at least 100 μs long (more than 100 times longer than previous refinement simulations), and a physics-based force field that was recently shown to successfully fold a structurally diverse set of fast-folding proteins. In MD simulations of 24 proteins chosen from the refinement category of recent Critical Assessment of Structure Prediction (CASP) experiments, we find that in most cases, simulations initiated from homology models drift away from the native structure. Comparison with simulations initiated from the native structure suggests that force field accuracy is the primary factor limiting MD-based refinement. This problem can be mitigated to some extent by restricting sampling to the neighborhood of the initial model, leading to structural improvement that, while limited, is roughly comparable to the leading alternative methods.
Affiliation(s)
- Alpan Raval
- D E Shaw Research, New York, New York 10036, USA
|
47
|
Rodrigues JPGLM, Levitt M, Chopra G. KoBaMIN: a knowledge-based minimization web server for protein structure refinement. Nucleic Acids Res 2012; 40:W323-8. [PMID: 22564897 PMCID: PMC3394243 DOI: 10.1093/nar/gks376] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.1] [Indexed: 11/14/2022]
Abstract
The KoBaMIN web server provides an online interface to a simple, consistent and computationally efficient protein structure refinement protocol based on minimization of a knowledge-based potential of mean force. The server can be used to refine either a single protein structure or an ensemble of proteins starting from their unrefined coordinates in PDB format. The refinement method is particularly fast and accurate due to the underlying knowledge-based potential derived from structures deposited in the PDB; as such, the energy function implicitly includes the effects of solvent and the crystal environment. Our server allows for an optional but recommended step that optimizes stereochemistry using the MESHI software. The KoBaMIN server also allows comparison of the refined structures with a provided reference structure to assess the changes brought about by the refinement protocol. The performance of KoBaMIN has been benchmarked widely on a large set of decoys, all models generated at the seventh worldwide experiments on critical assessment of techniques for protein structure prediction (CASP7) and it was also shown to produce top-ranking predictions in the refinement category at both CASP8 and CASP9, yielding consistently good results across a broad range of model quality values. The web server is fully functional and freely available at http://csb.stanford.edu/kobamin.
Affiliation(s)
- João P G L M Rodrigues
- Department of Structural Biology, 299 Campus Dr W, Fairchild Bldg, Room D100, Stanford University, Stanford, CA 94305, USA
|
48
|
Perez A, Yang Z, Bahar I, Dill KA, MacCallum JL. FlexE: Using elastic network models to compare models of protein structure. J Chem Theory Comput 2012; 8:3985-3991. [PMID: 25530735 DOI: 10.1021/ct300148f] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 01/06/2023]
Abstract
It is often valuable to compare protein structures to determine how similar they are. Structure comparison methods such as RMSD and GDT-TS are based solely on fixed geometry and do not take into account the intrinsic flexibility or energy landscape of the protein. We propose a method, which we call FlexE, that is based on a simple elastic network model and uses the deformation energy as a measure of the similarity between two structures. FlexE can distinguish biologically relevant conformational changes from random changes, while existing geometry-based methods cannot. Additionally, FlexE incorporates the concept of thermal energy, which provides a rational way to determine when two models are "the same". FlexE provides a unique measure of the similarity between protein structures that is complementary to existing methods.
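The idea of scoring similarity by deformation energy can be illustrated with a deliberately simplified harmonic spring network. This is not the FlexE energy function itself: the uniform spring constant `k`, the distance `cutoff`, and the function name are all assumptions chosen for illustration.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def deformation_energy(
    reference: Sequence[Coord],
    model: Sequence[Coord],
    cutoff: float = 10.0,  # assumed contact cutoff in angstroms
    k: float = 1.0,        # assumed uniform spring constant
) -> float:
    """Energy needed to deform the reference spring network into the model.

    Springs connect every pair of nodes closer than `cutoff` in the
    reference; each spring is at rest at its reference length.
    """
    if len(reference) != len(model):
        raise ValueError("structures must have the same number of nodes")
    energy = 0.0
    for i, j in combinations(range(len(reference)), 2):
        rest = math.dist(reference[i], reference[j])
        if rest <= cutoff:
            stretched = math.dist(model[i], model[j])
            energy += 0.5 * k * (stretched - rest) ** 2
    return energy
```

Because the energy depends only on inter-node distances, it is unchanged by rigid translation or rotation, so no superposition step is needed. It is asymmetric, though: the spring network is built from whichever structure is passed as `reference`.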
Affiliation(s)
- Alberto Perez
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794-5252
- Zheng Yang
- Department of Computational and Systems Biology, and Clinical & Translational Science Institute, School of Medicine, University of Pittsburgh, 3064 BST3, 3501 Fifth Ave, Pittsburgh, PA 15213
- Ivet Bahar
- Department of Computational and Systems Biology, and Clinical & Translational Science Institute, School of Medicine, University of Pittsburgh, 3064 BST3, 3501 Fifth Ave, Pittsburgh, PA 15213
- Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794-5252
- Justin L MacCallum
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794-5252
|
49
|
Xu D, Zhang Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins 2012; 80:1715-35. [PMID: 22411565 DOI: 10.1002/prot.24065] [Citation(s) in RCA: 590] [Impact Index Per Article: 49.2] [Received: 11/07/2011] [Revised: 01/23/2012] [Accepted: 03/03/2012] [Indexed: 11/09/2022]
Abstract
Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. The QUARK prediction procedure is depicted and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third of cases for short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, the QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field.
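The replica-exchange step mentioned in this abstract rests on a standard Metropolis criterion for swapping configurations between two temperatures. The sketch below shows only that generic criterion. The function names, the reduced units (`k_b = 1`), and the temperatures in the example are assumptions, not QUARK's actual settings.

```python
import math
import random


def swap_probability(
    energy_i: float,
    energy_j: float,
    temp_i: float,
    temp_j: float,
    k_b: float = 1.0,  # reduced units; an assumption for illustration
) -> float:
    """Metropolis acceptance probability for exchanging replicas i and j.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (k_b * T).
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random) -> bool:
    """Return True if the exchange is accepted."""
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)


# A cold replica (T=1) holding the higher energy always swaps with the
# hot replica (T=2); the reverse swap is accepted only occasionally.
print(swap_probability(10.0, 5.0, 1.0, 2.0))
print(swap_probability(5.0, 10.0, 1.0, 2.0))
```

Exchanges of this kind let low-energy conformations found at high temperature migrate to the low-temperature replicas, which is what helps the search escape local minima.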
Affiliation(s)
- Dong Xu
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
|
50
|
Atomic-level protein structure refinement using fragment-guided molecular dynamics conformation sampling. Structure 2012; 19:1784-95. [PMID: 22153501 DOI: 10.1016/j.str.2011.09.022] [Citation(s) in RCA: 248] [Impact Index Per Article: 20.7] [Received: 07/16/2011] [Revised: 09/19/2011] [Accepted: 09/24/2011] [Indexed: 11/22/2022]
Abstract
One of the critical difficulties of molecular dynamics (MD) simulations in protein structure refinement is that the physics-based energy landscape lacks a middle-range funnel to guide nonnative conformations toward near-native states. We propose to use the target model as a probe to identify fragmental analogs from the PDB. The distance maps are then used to reshape the MD energy funnel. The protocol was tested on 181 benchmarking and 26 CASP targets. It was found that structure models of correct folds with TM-score >0.5 can often be pulled closer to native with higher GDT-HA score, but improvements for the models of incorrect folds (TM-score <0.5) are much less pronounced. These data indicate that template-based fragmental distance maps essentially reshaped the MD energy landscape from golf-course-like to funnel-like ones in the successfully refined targets with a radius of TM-score ∼0.5. These results demonstrate a new avenue to improve high-resolution structures by combining knowledge-based template information with physics-based MD simulations.
|