1
Adiyaman R, McGuffin LJ. Using Local Protein Model Quality Estimates to Guide a Molecular Dynamics-Based Refinement Strategy. Methods Mol Biol 2023; 2627:119-140. [PMID: 36959445 DOI: 10.1007/978-1-0716-2974-1_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
The refinement of predicted 3D models aims to bring them closer to the native structure by fixing errors, including unusual bonds and torsion angles and irregular hydrogen bonding patterns. Refinement approaches based on molecular dynamics (MD) simulations using different types of restraints have performed well since CASP10. ReFOLD, developed by the McGuffin group, was one of the many MD-based refinement approaches tested in CASP12. When the performance of the ReFOLD method in CASP12 was evaluated, it was observed that ReFOLD suffered from the absence of a reliable guidance mechanism to achieve consistent improvement in the quality of predicted 3D models, particularly in the case of template-based modelling (TBM) targets. Therefore, here we propose to utilize the local quality assessment score produced by ModFOLD6 to guide the MD-based refinement approach to further increase the accuracy of the predicted 3D models. The relative performance of the new local quality assessment guided MD-based refinement protocol and the original MD-based protocol ReFOLD is compared using many different official scoring methods. By using the per-residue accuracy (or local quality) score to guide the refinement process, we are able to prevent the refined models from undergoing undesired structural deviations, thereby leading to more consistent improvements. This chapter will include a detailed analysis of the performance of the local quality assessment guided MD-based protocol versus that deployed in the original ReFOLD method.
Affiliation(s)
- Recep Adiyaman
- School of Biological Sciences, University of Reading, Reading, UK
- Liam J McGuffin
- School of Biological Sciences, University of Reading, Reading, UK.
2
Voronin A, Schug A. Selection of representative structures from large biomolecular ensembles. J Chem Phys 2022; 156:144102. [DOI: 10.1063/5.0082444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Despite the incredible progress of experimental techniques, protein structure determination remains a challenging task. Due to the rapid improvements in computer technology, simulations are often used to complement or interpret experimental data, in particular for sparse or low-resolution data. Many such in silico methods make it possible to obtain highly accurate models of a protein structure either de novo or via refinement of a physical model with experimental restraints. One crucial question is how to select a representative member or ensemble out of a vast number of computationally generated structures. Here, we introduce such a method. As a representative task, we add co-evolutionary contact pairs as distance restraints to a physical force field and want to select a good characterization of the resulting native-like ensemble. To generate large ensembles, we run replica-exchange molecular dynamics (REMD) on five mid-sized test proteins over a wide temperature range. High temperatures allow energetic barriers to be overcome, while low temperatures perform local searches of native-like conformations. The integrated bias is based on co-evolutionary contact pairs derived from a deep residual neural network to guide the simulation towards native-like conformations. We briefly compare and discuss the achieved model precision of contact-guided REMD for mid-sized proteins. Lastly, we discuss in detail four robust ensemble-selection algorithms that are capable of extracting representative structure models with high certainty. To assess the performance of the selection algorithms, we mimic a "blind scenario", i.e. one where the target structure is unknown, and select a representative structural ensemble of native-like folds.
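The abstract above does not spell out how replicas at different temperatures exchange configurations. As a minimal sketch, assuming the standard textbook Metropolis criterion for temperature replica exchange (not necessarily the authors' exact setup), the swap probability and a geometric temperature ladder can be written as follows. The function names, the geometric spacing, and the example energies are all illustrative assumptions.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption for this sketch.
K_B = 0.0019872041


def temperature_ladder(t_min, t_max, n_replicas):
    """Geometrically spaced temperatures from t_min to t_max (a common, assumed choice)."""
    if n_replicas < 2:
        raise ValueError("need at least two replicas")
    ratio = t_max / t_min
    return [t_min * ratio ** (k / (n_replicas - 1)) for k in range(n_replicas)]


def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis probability of swapping the configurations of replicas i and j.

    e_i, e_j: potential energies of the current configurations (kcal/mol).
    t_i, t_j: replica temperatures (K).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    # A non-negative exponent means the swap is always accepted.
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(e_i, e_j, t_i, t_j, rng):
    """Return True if the swap is accepted; rng is any object with a random() method."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)
```

With this rule, a swap that moves the lower-energy configuration to the colder replica is always accepted, while the reverse swap is accepted only occasionally, which is what lets high-temperature replicas feed barrier-crossing conformations down to the low-temperature local search.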
Affiliation(s)
- Alexander Schug
- Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
3
Maritan M, Autin L, Karr J, Covert MW, Olson AJ, Goodsell DS. Building Structural Models of a Whole Mycoplasma Cell. J Mol Biol 2022; 434:167351. [PMID: 34774566 PMCID: PMC8752489 DOI: 10.1016/j.jmb.2021.167351] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 08/05/2021] [Revised: 11/04/2021] [Accepted: 11/05/2021] [Indexed: 02/01/2023]
Abstract
Building structural models of entire cells has been a long-standing cross-discipline challenge for the research community, as it requires an unprecedented level of integration between multiple sources of biological data and enhanced methods for computational modeling and visualization. Here, we present the first 3D structural models of an entire Mycoplasma genitalium (MG) cell, built using the CellPACK suite of computational modeling tools. Our model recapitulates the data described in recent whole-cell system biology simulations and provides a structural representation for all MG proteins, DNA and RNA molecules, obtained by combining experimental and homology-modeled structures and lattice-based models of the genome. We establish a framework for gathering, curating and evaluating these structures, exposing current weaknesses of modeling methods and the boundaries of MG structural knowledge, and visualization methods to explore functional characteristics of the genome and proteome. We compare two approaches for data gathering, a manually-curated workflow and an automated workflow that uses homologous structures, both of which are appropriate for the analysis of mesoscale properties such as crowding and volume occupancy. Analysis of model quality provides estimates of the regularization that will be required when these models are used as starting points for atomic molecular dynamics simulations.
Affiliation(s)
- Martina Maritan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037 USA. https://twitter.com/MartinaMaritan
- Ludovic Autin
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037 USA. https://twitter.com/grinche
- Jonathan Karr
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Markus W Covert
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Arthur J Olson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037 USA
- David S Goodsell
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037 USA; RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA.
4
Holland J, Grigoryan G. Structure‐conditioned amino‐acid couplings: how contact geometry affects pairwise sequence preferences. Protein Sci 2022; 31:900-917. [PMID: 35060221 PMCID: PMC8927866 DOI: 10.1002/pro.4280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Revised: 01/06/2022] [Accepted: 01/12/2022] [Indexed: 11/11/2022]
Abstract
Relating a protein's sequence to its conformation is a central challenge for both structure prediction and sequence design. Statistical contact potentials, as well as their more descriptive versions that account for side‐chain orientation and other geometric descriptors, have served as simplistic but useful means of representing second‐order contributions in sequence–structure relationships. Here we ask what happens when a pairwise potential is conditioned on the fully defined geometry of interacting backbone fragments. We show that the resulting structure‐conditioned coupling energies more accurately reflect pair preferences as a function of structural context. These structure‐conditioned energies more reliably encode native sequence information and correlate more highly with experimentally determined coupling energies. Clustering a database of interaction motifs by structure results in ensembles of similar energies, and clustering them by energy results in ensembles of similar structures. By comparing many pairs of interaction motifs and showing that structural similarity and energetic similarity go hand‐in‐hand, we provide a tangible link between modular sequence and structure elements. This link is applicable to structural modeling, and we show that scoring CASP models with structure‐conditioned energies results in substantially higher correlation with structural quality than scoring the same models with a contact potential. We conclude that structure‐conditioned coupling energies are a good way to model the impact of interaction geometry on second‐order sequence preferences.
Affiliation(s)
- Jack Holland
- Department of Computer Science, Dartmouth College, Hanover, New Hampshire, USA
- Gevorg Grigoryan
- Department of Computer Science, Dartmouth College, Hanover, New Hampshire, USA
5
Heo L, Janson G, Feig M. Physics-based protein structure refinement in the era of artificial intelligence. Proteins 2021; 89:1870-1887. [PMID: 34156124 PMCID: PMC8616793 DOI: 10.1002/prot.26161] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/07/2021] [Revised: 05/31/2021] [Accepted: 06/08/2021] [Indexed: 12/21/2022]
Abstract
Protein structure refinement is the last step in protein structure prediction pipelines. Physics-based refinement via molecular dynamics (MD) simulations has made significant progress during recent years. During CASP14, we tested a new refinement protocol based on an improved sampling strategy via MD simulations. MD simulations were carried out at an elevated temperature (360 K). An optimized use of biasing restraints and the use of multiple starting models led to enhanced sampling. The new protocol generally improved the model quality. In comparison with our previous protocols, the CASP14 protocol showed clear improvements. Our approach was successful with most initial models, many based on deep learning methods. However, we found that our approach was not able to refine machine-learning models from the AlphaFold2 group, often decreasing already high initial qualities. To better understand the role of refinement given new types of models based on machine-learning, a detailed analysis via MD simulations and Markov state modeling is presented here. We continue to find that MD-based refinement has the potential to improve AI predictions. We also identified several practical issues that make it difficult to realize that potential. Increasingly important is the consideration of inter-domain and oligomeric contacts in simulations; the presence of large kinetic barriers in refinement pathways also continues to present challenges. Finally, we provide a perspective on how physics-based refinement could continue to play a role in the future for improving initial predictions based on machine learning-based methods.
Affiliation(s)
- Lim Heo
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
- Giacomo Janson
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
6
Protein Structure Refinement Using Multi-Objective Particle Swarm Optimization with Decomposition Strategy. Int J Mol Sci 2021; 22:ijms22094408. [PMID: 33922489 PMCID: PMC8122964 DOI: 10.3390/ijms22094408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Revised: 04/16/2021] [Accepted: 04/20/2021] [Indexed: 12/02/2022]
Abstract
Protein structure refinement is a crucial step for more accurate protein structure predictions. Most existing approaches treat it as an energy minimization problem, seeking to improve the quality of initial models by searching for structures with lower energy. Considering that a single energy function cannot reflect the accurate energy landscape of all proteins, our previous AIR 1.0 pipeline uses multiple energy functions to realize a multi-objective particle swarm optimization-based model refinement. It is expected to provide a general, balanced conformation search protocol guided by different energy evaluations. However, AIR 1.0 solves the multi-objective optimization problem as a whole, which does not yield good solution diversity and convergence on some targets. In this study, we report a decomposition-based method, AIR 2.0, which is an updated version of AIR, for protein structure refinement. AIR 2.0 decomposes a multi-objective optimization problem into a number of subproblems and optimizes them simultaneously using a particle swarm optimization algorithm. The solutions yielded by AIR 2.0 show better convergence and diversity compared to its previous version, which increases the chance of finding better structural conformations. The experimental results on CASP13 refinement benchmark targets and blind tests in CASP14 demonstrate the efficacy of AIR 2.0.
7
Hiranuma N, Park H, Baek M, Anishchenko I, Dauparas J, Baker D. Improved protein structure refinement guided by deep learning based accuracy estimation. Nat Commun 2021; 12:1340. [PMID: 33637700 PMCID: PMC7910447 DOI: 10.1038/s41467-021-21511-x] [Citation(s) in RCA: 117] [Impact Index Per Article: 39.0] [Received: 08/11/2020] [Accepted: 01/18/2021] [Indexed: 11/22/2022]
Abstract
We develop a deep learning framework (DeepAccNet) that estimates per-residue accuracy and residue-residue distance signed error in protein models and uses these predictions to guide Rosetta protein structure refinement. The network uses 3D convolutions to evaluate local atomic environments followed by 2D convolutions to provide their global contexts and outperforms other methods that similarly predict the accuracy of protein structure models. Overall accuracy predictions for X-ray and cryoEM structures in the PDB correlate with their resolution, and the network should be broadly useful for assessing the accuracy of both predicted structure models and experimentally determined structures and identifying specific regions likely to be in error. Incorporation of the accuracy predictions at multiple stages in the Rosetta refinement protocol considerably increased the accuracy of the resulting protein structure models, illustrating how deep learning can improve search for global energy minima of biomolecules.
Affiliation(s)
- Naozumi Hiranuma
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Hahnbeom Park
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
- Minkyung Baek
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
- Ivan Anishchenko
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
- Justas Dauparas
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
- David Baker
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
8
Heo L, Arbour CF, Janson G, Feig M. Improved Sampling Strategies for Protein Model Refinement Based on Molecular Dynamics Simulation. J Chem Theory Comput 2021; 17:1931-1943. [PMID: 33562962 DOI: 10.1021/acs.jctc.0c01238] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/30/2022]
Abstract
Protein structures provide valuable information for understanding biological processes. Protein structures can be determined by experimental methods such as X-ray crystallography, nuclear magnetic resonance spectroscopy, or cryogenic electron microscopy. As an alternative, in silico methods can be used to predict protein structures. These methods utilize protein structure databases for structure prediction via template-based modeling or for training machine-learning models to generate predictions. Structure prediction for proteins distant from proteins with known structures often results in lower accuracy with respect to the true physiological structures. Physics-based protein model refinement methods can be applied to improve model accuracy in the predicted models. Refinement methods rely on conformational sampling around the predicted structures, and if structures closer to the native states are sampled, improvements in the model quality become possible. Molecular dynamics simulations have been especially successful for improving model qualities but although consistent refinement can be achieved, the improvements in model qualities are still moderate. To extend the refinement performance of a simulation-based protocol, we explored new schemes that focus on optimized use of biasing functions and the application of increased simulation temperatures. In addition, we tested the use of alternative initial models so that the simulations can explore the conformational space more broadly. Based on the insights of this analysis, we are proposing a new refinement protocol that significantly outperformed previous state-of-the-art molecular dynamics simulation-based protocols in the benchmark tests described here.
Affiliation(s)
- Lim Heo
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Collin F Arbour
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Giacomo Janson
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
9
Bhattacharya D. refineD: improved protein structure refinement using machine learning based restrained relaxation. Bioinformatics 2020; 35:3320-3328. [PMID: 30759180 DOI: 10.1093/bioinformatics/btz101] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 06/13/2018] [Revised: 01/22/2019] [Accepted: 02/11/2019] [Indexed: 12/20/2022]
Abstract
MOTIVATION Protein structure refinement aims to bring moderately accurate template-based protein models closer to the native state through conformational sampling. However, guiding the sampling towards the native state by effectively using restraints remains a major issue in structure refinement.
RESULTS Here, we develop a machine learning based restrained relaxation protocol that uses deep discriminative learning based binary classifiers to predict multi-resolution probabilistic restraints from the starting structure and subsequently integrates these restraints into the Rosetta all-atom energy function as additional scoring terms during structure refinement. We use four restraint resolutions as adopted in GDT-HA (0.5, 1, 2 and 4 Å), centered on the Cα atom of each residue, which are predicted by an ensemble of four deep discriminative classifiers trained using combinations of sequence and structure-derived features as well as several energy terms from the Rosetta centroid scoring function. The proposed method, refineD, has been found to produce consistent and substantial structural refinement through the use of cumulative and non-cumulative restraints on 150 benchmarking targets. refineD outperforms unrestrained relaxation, relaxation restrained to the starting structure using the FastRelax application of Rosetta, the atomic-level energy minimization based ModRefiner method, and the molecular dynamics (MD) simulation based FG-MD protocol. Furthermore, by adjusting restraint resolutions, the method addresses the tradeoff that exists between degree and consistency of refinement. These results demonstrate a promising new avenue for improving the accuracy of template-based protein models by effectively guiding conformational sampling during structure refinement through the use of machine learning based restraints.
AVAILABILITY AND IMPLEMENTATION http://watson.cse.eng.auburn.edu/refineD/.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
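The four restraint resolutions mentioned above follow the distance cutoffs of the GDT-HA score. As a rough illustration of what those cutoffs measure, the sketch below computes a simplified GDT-HA-style value from Cα coordinates. It assumes the model and native structures are already superposed and have one-to-one residue correspondence; the official score additionally searches over many superpositions to maximize each fraction, which is omitted here. The function name and the toy coordinates are hypothetical.

```python
import math

# GDT-HA distance cutoffs in angstroms, as listed in the abstract.
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)


def gdt_ha_presuperposed(model_ca, native_ca, cutoffs=GDT_HA_CUTOFFS):
    """Simplified GDT-HA-style score (0 to 100) for already superposed C-alpha traces.

    model_ca, native_ca: equal-length sequences of (x, y, z) tuples, residue i
    in the model corresponding to residue i in the native structure.
    """
    if len(model_ca) != len(native_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]
    # Fraction of residues within each cutoff, then the mean over the cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(fractions)


# Toy example: four residues displaced by 0.0, 0.75, 1.5 and 3.0 angstroms.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.0, 0.75, 0.0), (2.0, 1.5, 0.0), (3.0, 3.0, 0.0)]
score = gdt_ha_presuperposed(model, native)  # 62.5
```

In the toy example one, two, three and four residues fall within the 0.5, 1, 2 and 4 Å cutoffs respectively, so the mean fraction is 0.625. Tight cutoffs reward small, precise improvements, which is why adjusting the restraint resolution trades off the degree of refinement against its consistency.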
Affiliation(s)
- Debswapna Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
10
Naik B, Gupta N, Ojha R, Singh S, Prajapati VK, Prusty D. High throughput virtual screening reveals SARS-CoV-2 multi-target binding natural compounds to lead instant therapy for COVID-19 treatment. Int J Biol Macromol 2020; 160:1-17. [PMID: 32470577 PMCID: PMC7250083 DOI: 10.1016/j.ijbiomac.2020.05.184] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 03/24/2020] [Revised: 05/21/2020] [Accepted: 05/22/2020] [Indexed: 12/21/2022]
Abstract
The present-day world is suffering severely from the recently emerged SARS-CoV-2. The lack of prescribed drugs for the deadly virus has underlined the need to identify novel inhibitors to alleviate and stop the pandemic. In the present high throughput virtual screening study, we used in silico techniques such as receptor-ligand docking, molecular dynamics (MD), and ADME property prediction to screen natural compounds. It has been documented that many natural compounds display antiviral activities, including an anti–SARS-CoV effect. The present study deals with compounds from the Natural Product Activity and Species Source (NPASS) database with known biological activity that probably impede the activity of six essential enzymes of the virus. Promising drug-like compounds were identified, demonstrating better docking scores and binding energies for each druggable target. After an extensive screening analysis, three novel multi-target natural compounds were predicted to subdue the activity of three or more major drug targets simultaneously. Given the utility of natural compounds in the formulation of many therapies, we propose these compounds as excellent lead candidates for the development of therapeutic drugs against SARS-CoV-2.
Affiliation(s)
- Biswajit Naik
- Department of Biochemistry, School of Life Sciences, Central University of Rajasthan, NH-8, Bandarsindri, Kishangarh, 305817 Ajmer, Rajasthan, India
- Nidhi Gupta
- Department of Biochemistry, School of Life Sciences, Central University of Rajasthan, NH-8, Bandarsindri, Kishangarh, 305817 Ajmer, Rajasthan, India
- Rupal Ojha
- Department of Biochemistry, School of Life Sciences, Central University of Rajasthan, NH-8, Bandarsindri, Kishangarh, 305817 Ajmer, Rajasthan, India
- Satyendra Singh
- Department of Biochemistry, School of Life Sciences, Central University of Rajasthan, NH-8, Bandarsindri, Kishangarh, 305817 Ajmer, Rajasthan, India
- Vijay Kumar Prajapati
- Department of Biochemistry, School of Life Sciences, Central University of Rajasthan, NH-8, Bandarsindri, Kishangarh, 305817 Ajmer, Rajasthan, India
- Dhaneswar Prusty
- Department of Biochemistry, School of Life Sciences, Central University of Rajasthan, NH-8, Bandarsindri, Kishangarh, 305817 Ajmer, Rajasthan, India.
11
Piana S, Robustelli P, Tan D, Chen S, Shaw DE. Development of a Force Field for the Simulation of Single-Chain Proteins and Protein-Protein Complexes. J Chem Theory Comput 2020; 16:2494-2507. [PMID: 31914313 DOI: 10.1021/acs.jctc.9b00251] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Indexed: 12/21/2022]
Abstract
The accuracy of atomistic physics-based force fields for the simulation of biological macromolecules has typically been benchmarked experimentally using biophysical data from simple, often single-chain systems. In the case of proteins, the careful refinement of force field parameters associated with torsion-angle potentials and the use of improved water models have enabled a great deal of progress toward the highly accurate simulation of such monomeric systems in both folded and, more recently, disordered states. In living organisms, however, proteins constantly interact with other macromolecules, such as proteins and nucleic acids, and these interactions are often essential for proper biological function. Here, we show that state-of-the-art force fields tuned to provide an accurate description of both ordered and disordered proteins can be limited in their ability to accurately describe protein-protein complexes. This observation prompted us to perform an extensive reparameterization of one variant of the Amber protein force field. Our objective involved refitting not only the parameters associated with torsion-angle potentials but also the parameters used to model nonbonded interactions, the specification of which is expected to be central to the accurate description of multicomponent systems. The resulting force field, which we call DES-Amber, allows for more accurate simulations of protein-protein complexes, while still providing a state-of-the-art description of both ordered and disordered single-chain proteins. Despite the improvements, calculated protein-protein association free energies still appear to deviate substantially from experiment, a result suggesting that more fundamental changes to the force field, such as the explicit treatment of polarization effects, may simultaneously further improve the modeling of single-chain proteins and protein-protein complexes.
Affiliation(s)
- Stefano Piana
- D. E. Shaw Research, New York, New York 10036, United States
- Paul Robustelli
- D. E. Shaw Research, New York, New York 10036, United States
- Dazhi Tan
- D. E. Shaw Research, New York, New York 10036, United States
- Songela Chen
- D. E. Shaw Research, New York, New York 10036, United States
- David E Shaw
- D. E. Shaw Research, New York, New York 10036, United States; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, United States
12
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XIII. Proteins 2019; 87:1011-1020. [PMID: 31589781 DOI: 10.1002/prot.25823] [Citation(s) in RCA: 290] [Impact Index Per Article: 58.0] [Received: 08/23/2019] [Revised: 09/25/2019] [Accepted: 09/27/2019] [Indexed: 12/24/2022]
Abstract
CASP (critical assessment of structure prediction) assesses the state of the art in modeling protein structure from amino acid sequence. The most recent experiment (CASP13, held in 2018) saw dramatic progress in structure modeling without use of structural templates (historically "ab initio" modeling). Progress was driven by the successful application of deep learning techniques to predict inter-residue distances. In turn, these results drove dramatic improvements in three-dimensional structure accuracy: with the proviso that there are an adequate number of sequences known for the protein family, the new methods essentially solve the long-standing problem of predicting the fold topology of monomeric proteins. Further, the number of sequences required in the alignment has fallen substantially. There is also substantial improvement in the accuracy of template-based models. Other areas (model refinement, accuracy estimation, and the structure of protein assemblies) have again yielded interesting results. CASP13 placed increased emphasis on the use of sparse data together with modeling, and chemical crosslinking, SAXS, and NMR all yielded more mature results. This paper summarizes the key outcomes of CASP13. The special issue of PROTEINS contains papers describing the CASP13 assessments in each modeling category and contributions from the participants.
Affiliation(s)
- Torsten Schwede
- Biozentrum & SIB Swiss Institute of Bioinformatics, University of Basel, Basel, Switzerland
- Maya Topf
- Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, UK
- John Moult
- Institute for Bioscience and Biotechnology Research, Rockville, Maryland; Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
13
Read RJ, Sammito MD, Kryshtafovych A, Croll TI. Evaluation of model refinement in CASP13. Proteins 2019; 87:1249-1262. [PMID: 31365160 PMCID: PMC6851427 DOI: 10.1002/prot.25794] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 05/09/2019] [Revised: 07/03/2019] [Accepted: 07/27/2019] [Indexed: 12/25/2022]
Abstract
Performance in the model refinement category of the 13th round of Critical Assessment of Structure Prediction (CASP13) is assessed, showing that some groups consistently improve most starting models whereas the majority of participants continue to degrade the starting model on average. Using the ranking formula developed for CASP12, it is shown that only 7 of 32 groups perform better than a “naïve predictor” who just submits the starting model. Common features in their approaches include a dependence on physics‐based force fields to judge alternative conformations and the use of molecular dynamics to relax models to local minima, usually with some restraints to prevent excessively large movements. In addition to the traditional CASP metrics that focus largely on the quality of the overall fold, alternative metrics are evaluated, including comparisons of the main‐chain and side‐chain torsion angles, and the utility of the models for solving crystal structures by the molecular replacement method. It is proposed that the introduction of these metrics, as well as consideration of the accuracy of coordinate error estimates, would improve the discrimination between good and very good models.
Affiliation(s)
- Randy J Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Massimo D Sammito
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Tristan I Croll
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
|
14
|
Shuid AN, Kempster R, McGuffin LJ. ReFOLD: a server for the refinement of 3D protein models guided by accurate quality estimates. Nucleic Acids Res 2017; 45:W422-W428. [PMID: 28402475 PMCID: PMC5570150 DOI: 10.1093/nar/gkx249] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/30/2017] [Accepted: 04/03/2017] [Indexed: 12/29/2022]
Abstract
ReFOLD is a novel hybrid refinement server with integrated high performance global and local Accuracy Self Estimates (ASEs). The server attempts to identify and to fix likely errors in user supplied 3D models of proteins via successive rounds of refinement. The server is unique in providing output for multiple alternative refined models in a way that allows users to quickly visualize the key residue locations, which are likely to have been improved. This is important, as global refinement of a full chain model may not always be possible, whereas local regions, or individual domains, can often be much improved. Thus, users may easily compare the specific regions of the alternative refined models in which they are most interested e.g. key interaction sites or domains. ReFOLD was used to generate hundreds of alternative refined models for the CASP12 experiment, boosting our group's performance in the main tertiary structure prediction category. Our successful refinement of initial server models combined with our built-in ASEs were instrumental to our second place ranking on Template Based Modeling (TBM) and Free Modeling (FM)/TBM targets. The ReFOLD server is freely available at: http://www.reading.ac.uk/bioinf/ReFOLD/.
Affiliation(s)
- Ahmad N. Shuid
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK
- These authors contributed equally to this work as first authors
- Robert Kempster
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK
- Lancaster Environment Centre, Lancaster University, LA1 1YQ, UK
- These authors contributed equally to this work as first authors
- Liam J. McGuffin
- School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK
- To whom correspondence should be addressed. Tel: +44 118 378 6332; Fax: +44 118 378 8106;
|
15
|
Methods for the Refinement of Protein Structure 3D Models. Int J Mol Sci 2019; 20:ijms20092301. [PMID: 31075942 PMCID: PMC6539982 DOI: 10.3390/ijms20092301] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 04/02/2019] [Revised: 04/24/2019] [Accepted: 05/07/2019] [Indexed: 12/25/2022]
Abstract
The refinement of predicted 3D protein models is crucial in bringing them closer to experimental accuracy for further computational studies. Refinement approaches can be divided into two main stages: the sampling and scoring stages. Sampling strategies, such as the popular Molecular Dynamics (MD)-based protocols, aim to generate improved 3D models. However, generating 3D models that are closer to the native structure than the initial model remains challenging, as structural deviations from the native basin can be encountered due to force-field inaccuracies. Therefore, different restraint strategies have been applied in order to avoid deviations away from the native structure. For example, the accurate prediction of local errors and/or contacts in the initial models can be used to guide restraints. MD-based protocols, using physics-based force fields and smart restraints, have made significant progress towards a more consistent refinement of 3D models. In the scoring stage, energy functions and Model Quality Assessment Programs (MQAPs) are used to discriminate near-native conformations from non-native conformations. Nevertheless, there are often very small differences among the 3D models generated in refinement pipelines, which makes model discrimination and selection problematic. For this reason, the identification of the most native-like conformations remains a major challenge.
|
16
|
Wang A, Zhang Z, Li G. Higher Accuracy Achieved in the Simulations of Protein Structure Refinement, Protein Folding, and Intrinsically Disordered Proteins Using Polarizable Force Fields. J Phys Chem Lett 2018; 9:7110-7116. [PMID: 30514082 DOI: 10.1021/acs.jpclett.8b03471] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 06/09/2023]
Abstract
The accuracy of molecular mechanics force fields is of vital importance in biomolecular simulations. However, the admittedly more accurate polarizable force fields were recently reported to be less able to reproduce the experimental properties in comparison to additive force fields in some cases. Here, we perform long-time-scale molecular dynamics simulations to systematically evaluate the effect of explicit electronic polarization in polarizable force fields. The results show that the inclusion of electrostatic polarization effect in polarizable force fields can improve their accuracies in protein structure refinement and generate conformational ensembles more approximate to experiments for intrinsically disordered proteins. In contrast, it is difficult for polarizable force fields to approach the native structure, let alone to predict the native state when it is unknown a priori in the real protein structure predictions. We speculate that these effects might be attributed to the preference of protein-water interactions in polarizable force fields.
Affiliation(s)
- Anhui Wang
- Laboratory of Molecular Modeling and Design, State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
- State Key Laboratory of Fine Chemicals, School of Chemistry, Dalian University of Technology, Dalian 116024, China
- Zhichao Zhang
- State Key Laboratory of Fine Chemicals, School of Chemistry, Dalian University of Technology, Dalian 116024, China
- Guohui Li
- Laboratory of Molecular Modeling and Design, State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
|
17
|
Experimental accuracy in protein structure refinement via molecular dynamics simulations. Proc Natl Acad Sci U S A 2018; 115:13276-13281. [PMID: 30530696 DOI: 10.1073/pnas.1811364115] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Indexed: 12/22/2022]
Abstract
Refinement is the last step in protein structure prediction pipelines to convert approximate homology models to experimental accuracy. Protocols based on molecular dynamics (MD) simulations have shown promise, but current methods are limited to moderate levels of consistent refinement. To explore the energy landscape between homology models and native structures and analyze the challenges of MD-based refinement, eight test cases were studied via extensive simulations followed by Markov state modeling. In all cases, native states were found very close to the experimental structures and at the lowest free energies, but refinement was hindered by a rough energy landscape. Transitions from the homology model to the native states require the crossing of significant kinetic barriers on at least microsecond time scales. A significant energetic driving force toward the native state was lacking until its immediate vicinity, and there was significant sampling of off-pathway states competing for productive refinement. The role of recent force field improvements is discussed and transition paths are analyzed in detail to inform which key transitions have to be overcome to achieve successful refinement.
|
18
|
Ma T, Zang T, Wang Q, Ma J. Refining protein structures using enhanced sampling techniques with restraints derived from an ensemble-based model. Protein Sci 2018; 27:1842-1849. [PMID: 30098055 DOI: 10.1002/pro.3486] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/30/2018] [Revised: 07/05/2018] [Accepted: 07/18/2018] [Indexed: 12/12/2022]
Abstract
This paper reports a method for high-accuracy protein structural refinement, which is a direct extension of the method in our recent publication (Zang, J Chem Phys 2018; 149:072319). It combines a parallel continuous simulated tempering (PCST) method with a temperature-dependent restraint and a blind model selection scheme. In this work, the single-reference-based restraint of the previous work was changed to an ensemble-based model (EBM), in which the non-bonded Lennard-Jones term for each contacting atomic pair in the previous restraining potential was replaced by a multi-Gaussian function whose parameters are derived from an ensemble of structures such as the ones from various CASP participating groups. The purpose of EBM is to take advantage of partial "correctness" distributed among members of the structural ensemble. In total, 18 targets from the refinement category of CASP10, CASP11 and CASP12 were refined. In the Top-1 group, 11 out of 18 targets had better models (greater GDT_TS scores) than the CASPR participants. In the Top-5 group, nine out of 18 were better. Our results show that the PCST-EBM method can considerably improve low-accuracy structures.
Affiliation(s)
- Tianqi Ma
- Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas, 77005
- Tianwu Zang
- Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas, 77005
- Qinghua Wang
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, 77030
- Jianpeng Ma
- Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas, 77005; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, 77030
|
19
|
Pfeiffenberger E, Bates PA. Predicting improved protein conformations with a temporal deep recurrent neural network. PLoS One 2018; 13:e0202652. [PMID: 30180164 PMCID: PMC6122789 DOI: 10.1371/journal.pone.0202652] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/10/2018] [Accepted: 08/07/2018] [Indexed: 02/03/2023]
Abstract
Accurate protein structure prediction from amino acid sequence is still an unsolved problem. The most reliable methods centre on template based modelling. However, the accuracy of these models entirely depends on the availability of experimentally resolved homologous template structures. In order to generate more accurate models, extensive physics based molecular dynamics (MD) refinement simulations are performed to sample many different conformations to find improved conformational states. In this study, we propose a deep recurrent network model, called DeepTrajectory, that is able to identify these improved conformational states, with high precision, from a variety of different MD based sampling protocols. The proposed model learns the temporal patterns of features computed from MD trajectory data in order to classify whether each recorded simulation snapshot is an improved quality conformational state, decreased quality conformational state or whether there is no perceivable change in state with respect to the starting conformation. The model was trained and tested on 904 trajectories from 42 different protein systems with a cumulative number of more than 1.7 million snapshots. We show that our model outperforms other state of the art machine-learning algorithms that do not consider temporal dependencies. To our knowledge, DeepTrajectory is the first implementation of a time-dependent deep-learning protocol that is re-trainable and able to adapt to any new MD based sampling procedure, thereby demonstrating how a neural network can be used to learn the latter part of the protein folding funnel.
Affiliation(s)
- Erik Pfeiffenberger
- Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, United Kingdom
- Paul A. Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, United Kingdom
|
20
|
Wong SWK, Liu JS, Kou SC. Exploring the conformational space for protein folding with sequential Monte Carlo. Ann Appl Stat 2018. [DOI: 10.1214/17-aoas1124] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
|
21
|
Deng H, Jia Y, Zhang Y. Protein structure prediction. Int J Mod Phys B 2018; 32:1840009. [PMID: 30853739 PMCID: PMC6407873 DOI: 10.1142/s021797921840009x] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 05/30/2023]
Abstract
Predicting the 3D structure of a protein from its amino acid sequence is one of the most important unsolved problems in biophysics and computational biology. This paper attempts to give a comprehensive introduction to the most recent efforts and progress in protein structure prediction. Following the general flowchart of structure prediction, related concepts and methods are presented and discussed. Moreover, brief introductions are given to several widely used prediction methods and to the community-wide Critical Assessment of protein Structure Prediction (CASP) experiments.
Affiliation(s)
- Haiyou Deng
- College of Science, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Ya Jia
- College of Physical Science and Technology, Central China Normal University, Wuhan 430079, P. R. China
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
|
22
|
Abstract
A half century of studying protein folding in vitro and modeling it in silico has not provided us with a reliable computational method to predict the native conformations of proteins de novo, let alone identify the intermediates on their folding pathways. In this Opinion article, we suggest that the reason for this impasse is the over-reliance on current physical models of protein folding that are based on the assumption that proteins are able to fold spontaneously without assistance. These models arose from studies conducted in vitro on a biased sample of smaller, easier-to-isolate proteins, whose native structures appear to be thermodynamically stable. Meanwhile, the vast empirical data on the majority of larger proteins suggests that once these proteins are completely denatured in vitro, they cannot fold into native conformations without assistance. Moreover, they tend to lose their native conformations spontaneously and irreversibly in vitro, and therefore such conformations must be metastable. We propose a model of protein folding that is based on the notion that the folding of all proteins in the cell is mediated by the actions of the "protein folding machine" that includes the ribosome, various chaperones, and other components involved in co-translational or post-translational formation, maintenance and repair of protein native conformations in vivo. The most important and universal component of the protein folding machine consists of the ribosome in complex with the welcoming committee chaperones. The concerted actions of molecular machinery in the ribosome peptidyl transferase center, in the exit tunnel, and at the surface of the ribosome result in the application of mechanical and other forces to the nascent peptide, reducing its conformational entropy and possibly creating strain in the peptide backbone. 
The resulting high-energy conformation of the nascent peptide allows it to fold very fast and to overcome high kinetic barriers along the folding pathway. The early folding intermediates in vivo are stabilized by interactions with the ribosome and welcoming committee chaperones and would not be able to exist in vitro in the absence of such cellular components. In vitro experiments that unfold proteins by heat or chemical treatment produce denaturation ensembles that are very different from folding intermediates in vivo and therefore have very limited use in reconstructing the in vivo folding pathways. We conclude that computational modeling of protein folding should deemphasize the notion of unassisted thermodynamically controlled folding, and should focus instead on the step-by-step reverse engineering of the folding process as it actually occurs in vivo. REVIEWERS This article was reviewed by Eugene Koonin and Frank Eisenhaber.
|
23
|
Dutagaci B, Heo L, Feig M. Structure refinement of membrane proteins via molecular dynamics simulations. Proteins 2018; 86:738-750. [PMID: 29675899 PMCID: PMC6013386 DOI: 10.1002/prot.25508] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/13/2018] [Revised: 04/09/2018] [Accepted: 04/14/2018] [Indexed: 12/12/2022]
Abstract
A refinement protocol based on physics-based techniques established for water soluble proteins is tested for membrane protein structures. Initial structures were generated by homology modeling and sampled via molecular dynamics simulations in explicit lipid bilayer and aqueous solvent systems. Snapshots from the simulations were selected based on scoring with either knowledge-based or implicit membrane-based scoring functions and averaged to obtain refined models. The protocol resulted in consistent and significant refinement of the membrane protein structures similar to the performance of refinement methods for soluble proteins. Refinement success was similar between sampling in the presence of lipid bilayers and aqueous solvent but the presence of lipid bilayers may benefit the improvement of lipid-facing residues. Scoring with knowledge-based functions (DFIRE and RWplus) was found to be as good as scoring using implicit membrane-based scoring functions suggesting that differences in internal packing are more important than orientations relative to the membrane during the refinement of membrane protein homology models.
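The "select snapshots by score, then average" step mentioned in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' protocol: the function name `average_top_snapshots`, the `fraction` cutoff, and the convention that a lower score is better are all hypothetical, and the snapshots are assumed to be already superimposed on a common frame.

```python
def average_top_snapshots(snapshots, scores, fraction=0.25):
    """Average the coordinates of the best-scoring snapshots.

    snapshots: list of structures, each a list of (x, y, z) tuples
               (assumed pre-aligned, same atom order in every snapshot).
    scores:    one score per snapshot; lower is assumed to be better.
    fraction:  hypothetical share of snapshots to keep.
    """
    if not snapshots or len(snapshots) != len(scores):
        raise ValueError("need one score per snapshot")

    # Keep at least one snapshot, ranked by ascending score.
    n_keep = max(1, int(len(snapshots) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    kept = [snapshots[i] for i in order[:n_keep]]

    # Average each atom's coordinates over the kept snapshots.
    n_atoms = len(kept[0])
    averaged = []
    for atom in range(n_atoms):
        averaged.append(
            tuple(sum(s[atom][dim] for s in kept) / n_keep for dim in range(3))
        )
    return averaged
```

Averaged coordinates generally have distorted local geometry, so a real pipeline would follow this step with some form of structure relaxation; that part is left out here.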
Affiliation(s)
- Bercem Dutagaci
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Lim Heo
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
|
24
|
Park H, Ovchinnikov S, Kim DE, DiMaio F, Baker D. Protein homology model refinement by large-scale energy optimization. Proc Natl Acad Sci U S A 2018; 115:3054-3059. [PMID: 29507254 PMCID: PMC5866580 DOI: 10.1073/pnas.1719115115] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Indexed: 12/20/2022]
Abstract
Proteins fold to their lowest free-energy structures, and hence the most straightforward way to increase the accuracy of a partially incorrect protein structure model is to search for the lowest-energy nearby structure. This direct approach has met with little success for two reasons: first, energy function inaccuracies can lead to false energy minima, resulting in model degradation rather than improvement; and second, even with an accurate energy function, the search problem is formidable because the energy only drops considerably in the immediate vicinity of the global minimum, and there are a very large number of degrees of freedom. Here we describe a large-scale energy optimization-based refinement method that incorporates advances in both search and energy function accuracy that can substantially improve the accuracy of low-resolution homology models. The method refined low-resolution homology models into correct folds for 50 of 84 diverse protein families and generated improved models in recent blind structure prediction experiments. Analyses of the basis for these improvements reveal contributions from both the improvements in conformational sampling techniques and the energy function.
Affiliation(s)
- Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, WA 98105
- Institute for Protein Design, University of Washington, Seattle, WA 98105
- Sergey Ovchinnikov
- Department of Biochemistry, University of Washington, Seattle, WA 98105
- Institute for Protein Design, University of Washington, Seattle, WA 98105
- Molecular and Cellular Biology Program, University of Washington, Seattle, WA 98105
- David E Kim
- Institute for Protein Design, University of Washington, Seattle, WA 98105
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105
- Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA 98105
- Institute for Protein Design, University of Washington, Seattle, WA 98105
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98105
- Institute for Protein Design, University of Washington, Seattle, WA 98105
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105
|
25
|
Heo L, Feig M. PREFMD: a web server for protein structure refinement via molecular dynamics simulations. Bioinformatics 2018; 34:1063-1065. [PMID: 29126101 PMCID: PMC5860225 DOI: 10.1093/bioinformatics/btx726] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 07/10/2017] [Revised: 10/04/2017] [Accepted: 11/07/2017] [Indexed: 11/13/2022]
Abstract
Summary: Refinement of protein structure models is a long-standing problem in structural bioinformatics. Molecular dynamics-based methods have emerged as an avenue to achieve consistent refinement. The PREFMD web server implements an optimized protocol based on the method successfully tested in CASP11. Validation with recent CASP refinement targets shows consistent and more significant improvement in global structure accuracy over other state-of-the-art servers. Availability and implementation: PREFMD is freely available as a web server at http://feiglab.org/prefmd. Scripts for running PREFMD as a stand-alone package are available at https://github.com/feiglab/prefmd.git. Contact: feig@msu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lim Heo
- Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Michael Feig
- Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
|
26
|
Heo L, Feig M. What makes it difficult to refine protein models further via molecular dynamics simulations? Proteins 2018; 86 Suppl 1:177-188. [PMID: 28975670 PMCID: PMC5820117 DOI: 10.1002/prot.25393] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 08/01/2017] [Revised: 09/11/2017] [Accepted: 09/29/2017] [Indexed: 01/20/2023]
Abstract
Protein structure refinement remains a challenging yet important problem as it has the potential to bring already accurate template-based models to near-native resolution. Refinement based on molecular dynamics simulations has been a highly promising approach and the performance of MD-based refinement in the Feig group during CASP12 is described here. During CASP12, sampling was extended well into the microsecond scale, an improved force field was applied, and new protocol variations were tested. Progress over previous rounds of CASP was found to be limited which is analyzed in terms of the quality of the initial models and dependency on the amount of sampling and refinement protocol variations. As current MD-based refinement protocols appear to be reaching a plateau, detailed analysis is presented to provide new insight into the major challenges towards more extensive structure refinement, focusing in particular on sampling with and without restraints.
Affiliation(s)
- Lim Heo
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
|
27
|
Moult J, Fidelis K, Kryshtafovych A, Schwede T, Tramontano A. Critical assessment of methods of protein structure prediction (CASP)-Round XII. Proteins 2018; 86 Suppl 1:7-15. [PMID: 29082672 PMCID: PMC5897042 DOI: 10.1002/prot.25415] [Citation(s) in RCA: 245] [Impact Index Per Article: 40.8] [Received: 08/21/2017] [Revised: 10/25/2017] [Accepted: 10/27/2017] [Indexed: 12/24/2022]
Abstract
This article reports the outcome of the 12th round of Critical Assessment of Structure Prediction (CASP12), held in 2016. CASP is a community experiment to determine the state of the art in modeling protein structure from amino acid sequence. Participants are provided sequence information and in turn provide protein structure models and related information. Analysis of the submitted structures by independent assessors provides a comprehensive picture of the capabilities of current methods, and allows progress to be identified. This was again an exciting round of CASP, with significant advances in 4 areas: (i) The use of new methods for predicting three-dimensional contacts led to a two-fold improvement in contact accuracy. (ii) As a consequence, model accuracy for proteins where no template was available improved dramatically. (iii) Models based on a structural template showed overall improvement in accuracy. (iv) Methods for estimating the accuracy of a model continued to improve. CASP continued to develop new areas: (i) Assessing methods for building quaternary structure models, including an expansion of the collaboration between CASP and CAPRI. (ii) Modeling with the aid of experimental data was extended to include SAXS data, as well as again using chemical cross-linking information. (iii) A team of assessors evaluated the suitability of models for a range of applications, including mutation interpretation, analysis of ligand binding properties, and identification of interfaces. This article describes the experiment and summarizes the results. The rest of this special issue of PROTEINS contains papers describing CASP12 results and assessments in more detail.
Affiliation(s)
- John Moult
- Institute for Bioscience and Biotechnology Research and Department of Cell Biology and Molecular Genetics, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
- Krzysztof Fidelis
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Anna Tramontano
- Department of Physics and Istituto Pasteur - Fondazione Cenci Bolognetti, Sapienza University of Rome, P.le Aldo Moro, 5, 00185 Rome, Italy
|
28
|
Hovan L, Oleinikovas V, Yalinca H, Kryshtafovych A, Saladino G, Gervasio FL. Assessment of the model refinement category in CASP12. Proteins 2017; 86 Suppl 1:152-167. [PMID: 29071750 DOI: 10.1002/prot.25409] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 06/19/2017] [Revised: 10/03/2017] [Accepted: 10/24/2017] [Indexed: 01/07/2023]
Abstract
We here report on the assessment of the model refinement predictions submitted to the 12th Experiment on the Critical Assessment of Protein Structure Prediction (CASP12). This is the fifth refinement experiment since CASP8 (2008) and, as with the previous experiments, the predictors were invited to refine selected server models received in the regular (nonrefinement) stage of the CASP experiment. We assessed the submitted models using a combination of standard CASP measures. The coefficients for the linear combination of Z-scores (the CASP12 score) have been obtained by a machine learning algorithm trained on the results of visual inspection. We identified eight groups that improve both the backbone conformation and the side chain positioning for the majority of targets. Albeit the top methods adopted distinctively different approaches, their overall performance was almost indistinguishable, with each of them excelling in different scores or target subsets. What is more, there were a few novel approaches that, while doing worse than average in most cases, provided the best refinements for a few targets, showing significant latitude for further innovation in the field.
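As a rough illustration of the scoring idea described in this abstract, the sketch below standardizes each metric across groups and sums the resulting Z-scores with per-metric weights. The metric names, the weights, and the example values are hypothetical placeholders, not the coefficients fitted for the CASP12 score, and the outlier handling usually applied when computing CASP Z-scores is omitted. Every metric is assumed to be oriented so that higher is better.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores across groups: (x - mean) / std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All groups tied on this metric: it cannot discriminate.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def combined_score(per_metric_scores, weights):
    """Weighted sum of per-metric Z-scores, one total per group.

    per_metric_scores: {metric name: [raw score for each group]}
    weights:           {metric name: weight} (hypothetical values)
    """
    n_groups = len(next(iter(per_metric_scores.values())))
    totals = [0.0] * n_groups
    for metric, values in per_metric_scores.items():
        for i, z in enumerate(z_scores(values)):
            totals[i] += weights[metric] * z
    return totals


# Hypothetical scores for three groups on two metrics.
scores = {
    "GDT_HA": [50.0, 55.0, 60.0],
    "RMSD_improvement": [0.1, 0.0, 0.2],
}
weights = {"GDT_HA": 0.6, "RMSD_improvement": 0.4}
totals = combined_score(scores, weights)
best_group = totals.index(max(totals))
```

With these made-up numbers the third group ranks first because it leads on both metrics; changing the weights only matters when the metrics disagree about the ordering.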
Affiliation(s)
- Ladislav Hovan
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
- Havva Yalinca
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
- Giorgio Saladino
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
- Francesco Luigi Gervasio
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom; Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, United Kingdom
|
29
|
Lee GR, Heo L, Seok C. Simultaneous refinement of inaccurate local regions and overall structure in the CASP12 protein model refinement experiment. Proteins 2017; 86 Suppl 1:168-176. [PMID: 29044810 DOI: 10.1002/prot.25404] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/03/2017] [Revised: 10/09/2017] [Accepted: 10/11/2017] [Indexed: 12/15/2022]
Abstract
Advances in protein model refinement techniques are required as diverse sources of protein structure information are available from low-resolution experiments or informatics-based computations such as cryo-EM, NMR, homology models, or predicted residue contacts. Given semi-reliable or incomplete structural information, structure quality of a protein model has to be improved by ab initio methods such as energy-based simulation. In this study, we describe a new automatic refinement server method designed to improve locally inaccurate regions and overall structure simultaneously. Locally inaccurate regions may occur in protein structures due to non-convergent or missing information in template structures used in homology modeling or due to intrinsic structural flexibilities not resolved by experimental techniques. However, such variable or dynamic regions often play important functional roles by participating in interactions with other biomolecules or in transitions between different functional states. The new refinement method introduced here utilizes diverse types of geometric operators which drive both local and global changes, and the effect of structure changes and relaxations are accumulated. This resulted in consistent refinement of both local and global structural features. Performance of this method in CASP12 is discussed.
Affiliation(s)
- Gyu Rie Lee
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Lim Heo
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
30
|
Terashi G, Kihara D. Protein structure model refinement in CASP12 using short and long molecular dynamics simulations in implicit solvent. Proteins 2017; 86 Suppl 1:189-201. [PMID: 28833585 DOI: 10.1002/prot.25373] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/21/2017] [Revised: 08/01/2017] [Accepted: 08/18/2017] [Indexed: 12/21/2022]
Abstract
Protein structure prediction methods have matured over the years, particularly those which use structure templates for building a model. They can build a model with correct overall conformation in cases where appropriate templates are available. Models with the correct topology can be practically useful for limited purposes that need residue-level accuracy, but further improvement of the models can allow the models to be used in tasks that need detailed structures, such as molecular replacement in X-ray crystallography or structure-based drug screening. Thus, model refinement is an important final step in protein structure prediction to bridge predictions to real-life applications. Model refinement is one of the categories in recent rounds of critical assessment of techniques in protein structure prediction (CASP) and has recently been drawing more attention due to its realized importance. Here we report our group's performance in the refinement category in CASP12. Our method is based on inexpensive short molecular dynamics (MD) simulations in implicit solvent. Our performance in CASP12 was among the top, which was consistent with the previous round, CASP11. Our method with short MD runs achieved comparable performance with other methods that used longer simulations. Detailed analyses found that improvements typically occurred in entire regions of a structure rather than only in flexible loop regions. The remaining challenge in the structure refinement includes large conformational refinement which involves substantial motions of secondary structure elements or domains.
Affiliation(s)
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907; Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907
31
|
Feig M. Computational protein structure refinement: Almost there, yet still so far to go. Wiley Interdisciplinary Reviews: Computational Molecular Science 2017; 7:e1307. [PMID: 30613211 PMCID: PMC6319934 DOI: 10.1002/wcms.1307] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 12/21/2022]
Abstract
Protein structures are essential in modern biology yet experimental methods are far from being able to catch up with the rapid increase in available genomic data. Computational protein structure prediction methods aim to fill the gap while the role of protein structure refinement is to take approximate initial template-based models and bring them closer to the true native structure. Current methods for computational structure refinement rely on molecular dynamics simulations, related sampling methods, or iterative structure optimization protocols. The best methods are able to achieve moderate degrees of refinement but consistent refinement that can reach near-experimental accuracy remains elusive. Key issues revolve around the accuracy of the energy function, the inability to reliably rank multiple models, and the use of restraints that keep sampling close to the native state but also limit the degree of possible refinement. A different aspect is the question of what exactly the target of high-resolution refinement should be as experimental structures are affected by experimental conditions and different biological questions require varying levels of accuracy. While improvement of the global protein structure is a difficult problem, high-resolution refinement methods that improve local structural quality such as favorable stereochemistry and the avoidance of atomic clashes are much more successful.
Affiliation(s)
- Michael Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, 603 Wilson Rd., Room 218 BCH, East Lansing, MI, USA; 517-432-7439
32
|
Variability of Protein Structure Models from Electron Microscopy. Structure 2017; 25:592-602.e2. [PMID: 28262392 DOI: 10.1016/j.str.2017.02.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/19/2016] [Revised: 01/10/2017] [Accepted: 02/11/2017] [Indexed: 11/23/2022]
Abstract
An increasing number of biomolecular structures are solved by electron microscopy (EM). However, the quality of structure models determined from EM maps varies substantially. To understand to what extent structure models are supported by information embedded in EM maps, we used two computational structure refinement methods to examine how much structures can be refined using a dataset of 49 maps with accompanying structure models. The extent of structure modification as well as the disagreement between refinement models produced by the two computational methods scaled inversely with the global and the local map resolutions. A general quantitative estimation of deviations of structures for particular map resolutions is provided. Our results indicate that the observed discrepancy between the deposited map and the refined models is due to the lack of structural information present in EM maps and thus these annotations must be used with caution for further applications.
33
|
Gadzała M, Kalinowska B, Banach M, Konieczny L, Roterman I. Determining protein similarity by comparing hydrophobic core structure. Heliyon 2017; 3:e00235. [PMID: 28217749 PMCID: PMC5300504 DOI: 10.1016/j.heliyon.2017.e00235] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/01/2016] [Revised: 12/06/2016] [Accepted: 01/19/2017] [Indexed: 12/19/2022] Open
Abstract
Formal assessment of structural similarity is, next to protein structure prediction, arguably the most important unsolved problem in proteomics. In this paper we propose a similarity criterion based on commonalities between the proteins' hydrophobic cores. The hydrophobic core emerges as a result of conformational changes through which each residue reaches its intended position in the protein body. A quantitative criterion based on this phenomenon has been proposed in the framework of the CASP challenge. The structure of the hydrophobic core, including the placement and scope of any deviations from the idealized model, may indirectly point to areas of importance from the point of view of the protein's biological function. Our analysis focuses on an arbitrarily selected target from the CASP11 challenge. The proposed measure, while compliant with CASP criteria (70-80% correlation), involves certain adjustments which acknowledge the presence of factors other than simple spatial arrangement of solids.
Affiliation(s)
- M. Gadzała
  - AGH - Academic Computer Center − Cyfronet, Nawojki 11, Kraków 30-950, Poland
- B. Kalinowska
  - Faculty of Physics, Astronomy, Applied Computer Science − Jagiellonian University, Łojasiewicza 11, Kraków 30-348, Poland
- M. Banach
  - Department of Bioinformatics and Telemedicine, Jagiellonian University − Medical College, Łazarza 16, Kraków 31-530, Poland
- L. Konieczny
  - Chair of Medical Biochemistry, Jagiellonian University − Medical College, Kopernika 7, Kraków 31-034, Poland
- I. Roterman
  - Department of Bioinformatics and Telemedicine, Jagiellonian University − Medical College, Łazarza 16, Kraków 31-530, Poland
34
|
Skolnick J, Zhou H. Why Is There a Glass Ceiling for Threading Based Protein Structure Prediction Methods? J Phys Chem B 2016; 121:3546-3554. [PMID: 27748116 DOI: 10.1021/acs.jpcb.6b09517] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/29/2022]
Abstract
Despite their different implementations, comparison of the best threading approaches to the prediction of evolutionarily distant protein structures reveals that they tend to succeed or fail on the same protein targets. This is true despite the fact that the structural template library has good templates for all cases. Thus, a key question is why certain protein structures are threadable while others are not. Comparison with threading results on a set of artificial sequences selected for stability further argues that the failure of threading is due to the nature of the protein structures themselves. Using a new contact-map-based alignment algorithm, we demonstrate that certain folds are highly degenerate in that they can have very similar coarse-grained fractions of native contacts aligned and yet differ significantly from the native structure. For threadable proteins, this is not the case. Thus, contemporary threading approaches appear to have reached a plateau, and new approaches to structure prediction are required.
Affiliation(s)
- Jeffrey Skolnick
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive Northwest, Atlanta, Georgia 30318, United States
- Hongyi Zhou
  - Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive Northwest, Atlanta, Georgia 30318, United States
35
|
Heo L, Lee H, Seok C. GalaxyRefineComplex: Refinement of protein-protein complex model structures driven by interface repacking. Sci Rep 2016; 6:32153. [PMID: 27535582 PMCID: PMC4989233 DOI: 10.1038/srep32153] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Received: 06/01/2016] [Accepted: 08/03/2016] [Indexed: 12/13/2022] Open
Abstract
Protein-protein docking methods have been widely used to gain an atomic-level understanding of protein interactions. However, docking methods that employ low-resolution energy functions are popular because of computational efficiency. Low-resolution docking tends to generate protein complex structures that are not fully optimized. GalaxyRefineComplex takes such low-resolution docking structures and refines them to improve model accuracy in terms of both interface contact and inter-protein orientation. This refinement method allows flexibility at the protein interface and in the overall docking structure to capture conformational changes that occur upon binding. Symmetric refinement is also provided for symmetric homo-complexes. This method was validated by refining models produced by available docking programs, including ZDOCK and M-ZDOCK, and was successfully applied to CAPRI targets in a blind fashion. An example of using the refinement method with an existing docking method for ligand binding mode prediction of a drug target is also presented. A web server that implements the method is freely available at http://galaxy.seoklab.org/refinecomplex.
Affiliation(s)
- Lim Heo
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Hasup Lee
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea