1
Adiyaman R, McGuffin LJ. Using Local Protein Model Quality Estimates to Guide a Molecular Dynamics-Based Refinement Strategy. Methods Mol Biol 2023; 2627:119-140. [PMID: 36959445 DOI: 10.1007/978-1-0716-2974-1_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
The refinement of predicted 3D models aims to bring them closer to the native structure by fixing errors including unusual bonds and torsion angles and irregular hydrogen bonding patterns. Refinement approaches based on molecular dynamics (MD) simulations using different types of restraints have performed well since CASP10. ReFOLD, developed by the McGuffin group, was one of the many MD-based refinement approaches tested in CASP12. When the performance of the ReFOLD method in CASP12 was evaluated, it was observed that ReFOLD suffered from the absence of a reliable guidance mechanism to reach consistent improvement in the quality of predicted 3D models, particularly in the case of template-based modelling (TBM) targets. Therefore, here we propose to utilize the local quality assessment score produced by ModFOLD6 to guide the MD-based refinement approach to further increase the accuracy of the predicted 3D models. The relative performance of the new local quality assessment guided MD-based refinement protocol and the original MD-based protocol ReFOLD is compared utilizing many different official scoring methods. By using the per-residue accuracy (or local quality) score to guide the refinement process, we are able to prevent the refined models from undesired structural deviations, thereby leading to more consistent improvements. This chapter will include a detailed analysis of the performance of the local quality assessment guided MD-based protocol versus that deployed in the original ReFOLD method.
Affiliation(s)
- Recep Adiyaman
- School of Biological Sciences, University of Reading, Reading, UK
- Liam J McGuffin
- School of Biological Sciences, University of Reading, Reading, UK.
2
Bhattacharya D. refineD: improved protein structure refinement using machine learning based restrained relaxation. Bioinformatics 2020; 35:3320-3328. [PMID: 30759180 DOI: 10.1093/bioinformatics/btz101] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 06/13/2018] [Revised: 01/22/2019] [Accepted: 02/11/2019] [Indexed: 12/20/2022]
Abstract
MOTIVATION Protein structure refinement aims to bring moderately accurate template-based protein models closer to the native state through conformational sampling. However, guiding the sampling towards the native state by effectively using restraints remains a major issue in structure refinement.
RESULTS Here, we develop a machine learning based restrained relaxation protocol that uses deep discriminative learning based binary classifiers to predict multi-resolution probabilistic restraints from the starting structure and subsequently converts these restraints to be integrated into the Rosetta all-atom energy function as additional scoring terms during structure refinement. We use four restraint resolutions as adopted in GDT-HA (0.5, 1, 2 and 4 Å), centered on the Cα atom of each residue, that are predicted by an ensemble of four deep discriminative classifiers trained using combinations of sequence and structure-derived features as well as several energy terms from the Rosetta centroid scoring function. The proposed method, refineD, has been found to produce consistent and substantial structural refinement through the use of cumulative and non-cumulative restraints on 150 benchmarking targets. refineD outperforms an unrestrained relaxation strategy, relaxation that is restrained to starting structures using the FastRelax application of Rosetta, the atomic-level energy minimization based ModRefiner method, and the molecular dynamics (MD) simulation based FG-MD protocol. Furthermore, by adjusting restraint resolutions, the method addresses the tradeoff that exists between degree and consistency of refinement. These results demonstrate a promising new avenue for improving the accuracy of template-based protein models by effectively guiding conformational sampling during structure refinement through the use of machine learning based restraints.
AVAILABILITY AND IMPLEMENTATION http://watson.cse.eng.auburn.edu/refineD/.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
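As a rough illustration of the four GDT-HA distance cutoffs mentioned in the abstract above, the following Python sketch scores a model against a native structure. It is a simplification and not the refineD or CASP implementation: it assumes the two structures are already superimposed and residue-aligned, whereas the official GDT-HA calculation also searches over superpositions. The function name and the toy coordinates are hypothetical.

```python
import math

# Distance cutoffs in angstroms, as listed in the abstract for GDT-HA.
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)


def gdt_ha_fixed_superposition(model_ca, native_ca, cutoffs=GDT_HA_CUTOFFS):
    """Approximate GDT-HA (0-100) for pre-superimposed C-alpha coordinates.

    Assumption: both inputs are equal-length sequences of (x, y, z) tuples
    for the same residues, already placed in a common frame.
    """
    if len(model_ca) != len(native_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and equal length")

    # Per-residue C-alpha deviation between model and native.
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]

    # Fraction of residues within each cutoff, then averaged over cutoffs.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues with deviations of 0.3, 0.8, 1.5 and 5.0 angstroms.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.3, 0.0), (1.0, 0.8, 0.0), (2.0, 1.5, 0.0), (3.0, 5.0, 0.0)]
score = gdt_ha_fixed_superposition(model, native)
print(score)
```

In this toy case one, two, three and three of the four residues fall within the 0.5, 1, 2 and 4 Å cutoffs respectively, so tighter cutoffs reward only the most accurately placed residues.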
Affiliation(s)
- Debswapna Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
3
Methods for the Refinement of Protein Structure 3D Models. Int J Mol Sci 2019; 20:ijms20092301. [PMID: 31075942 PMCID: PMC6539982 DOI: 10.3390/ijms20092301] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 04/02/2019] [Revised: 04/24/2019] [Accepted: 05/07/2019] [Indexed: 12/25/2022]
Abstract
The refinement of predicted 3D protein models is crucial in bringing them closer towards experimental accuracy for further computational studies. Refinement approaches can be divided into two main stages: the sampling and scoring stages. Sampling strategies, such as the popular Molecular Dynamics (MD)-based protocols, aim to generate improved 3D models. However, generating 3D models that are closer to the native structure than the initial model remains challenging, as structural deviations from the native basin can be encountered due to force-field inaccuracies. Therefore, different restraint strategies have been applied in order to avoid deviations away from the native structure. For example, the accurate prediction of local errors and/or contacts in the initial models can be used to guide restraints. MD-based protocols, using physics-based force fields and smart restraints, have made significant progress towards a more consistent refinement of 3D models. In the scoring stage, energy functions and Model Quality Assessment Programs (MQAPs) are used to discriminate near-native conformations from non-native conformations. Nevertheless, there are often very small differences among generated 3D models in refinement pipelines, which makes model discrimination and selection problematic. For this reason, the identification of the most native-like conformations remains a major challenge.
4
Experimental accuracy in protein structure refinement via molecular dynamics simulations. Proc Natl Acad Sci U S A 2018; 115:13276-13281. [PMID: 30530696 DOI: 10.1073/pnas.1811364115] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Indexed: 12/22/2022]
Abstract
Refinement is the last step in protein structure prediction pipelines to convert approximate homology models to experimental accuracy. Protocols based on molecular dynamics (MD) simulations have shown promise, but current methods are limited to moderate levels of consistent refinement. To explore the energy landscape between homology models and native structures and analyze the challenges of MD-based refinement, eight test cases were studied via extensive simulations followed by Markov state modeling. In all cases, native states were found very close to the experimental structures and at the lowest free energies, but refinement was hindered by a rough energy landscape. Transitions from the homology model to the native states require the crossing of significant kinetic barriers on at least microsecond time scales. A significant energetic driving force toward the native state was lacking until its immediate vicinity, and there was significant sampling of off-pathway states competing for productive refinement. The role of recent force field improvements is discussed and transition paths are analyzed in detail to inform which key transitions have to be overcome to achieve successful refinement.
5
Feig M. Computational protein structure refinement: Almost there, yet still so far to go. Wiley Interdiscip Rev Comput Mol Sci 2017; 7:e1307. [PMID: 30613211 PMCID: PMC6319934 DOI: 10.1002/wcms.1307] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Indexed: 12/21/2022]
Abstract
Protein structures are essential in modern biology, yet experimental methods are far from being able to catch up with the rapid increase in available genomic data. Computational protein structure prediction methods aim to fill the gap, while the role of protein structure refinement is to take approximate initial template-based models and bring them closer to the true native structure. Current methods for computational structure refinement rely on molecular dynamics simulations, related sampling methods, or iterative structure optimization protocols. The best methods are able to achieve moderate degrees of refinement, but consistent refinement that can reach near-experimental accuracy remains elusive. Key issues revolve around the accuracy of the energy function, the inability to reliably rank multiple models, and the use of restraints that keep sampling close to the native state but also limit the degree of possible refinement. A different aspect is the question of what exactly the target of high-resolution refinement should be, as experimental structures are affected by experimental conditions and different biological questions require varying levels of accuracy. While improvement of the global protein structure is a difficult problem, high-resolution refinement methods that improve local structural quality, such as favorable stereochemistry and the avoidance of atomic clashes, are much more successful.
Affiliation(s)
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, 603 Wilson Rd., Room 218 BCH, East Lansing, MI, USA; 517-432-7439
6
Pang YP. FF12MC: A revised AMBER forcefield and new protein simulation protocol. Proteins 2016; 84:1490-516. [PMID: 27348292 PMCID: PMC5129589 DOI: 10.1002/prot.25094] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 04/11/2016] [Revised: 06/16/2016] [Accepted: 06/18/2016] [Indexed: 12/25/2022]
Abstract
Specialized to simulate proteins in molecular dynamics (MD) simulations with explicit solvation, FF12MC is a combination of a new protein simulation protocol employing atomic masses uniformly reduced tenfold and a revised AMBER forcefield FF99 with (i) shortened CH bonds, (ii) removal of torsions involving a nonperipheral sp³ atom, and (iii) reduced 1-4 interaction scaling factors of torsions ϕ and ψ. This article reports that in multiple, distinct, independent, unrestricted, unbiased, isobaric-isothermal, and classical MD simulations FF12MC can (i) simulate the experimentally observed flipping between left- and right-handed configurations for C14-C38 of BPTI in solution, (ii) autonomously fold chignolin, CLN025, and Trp-cage with folding times that agree with the experimental values, (iii) simulate subsequent unfolding and refolding of these miniproteins, and (iv) achieve a robust Z score of 1.33 for refining protein models TMR01, TMR04, and TMR07. By comparison, the latest general-purpose AMBER forcefield FF14SB locks the C14-C38 bond to the right-handed configuration in solution under the same protein simulation conditions. Statistical survival analysis shows that FF12MC folds chignolin and CLN025 in isobaric-isothermal MD simulations 2-4 times faster than FF14SB under the same protein simulation conditions. These results suggest that FF12MC may be used for protein simulations to study kinetics and thermodynamics of miniprotein folding as well as protein structure and dynamics.
Affiliation(s)
- Yuan-Ping Pang
- Computer-Aided Molecular Design Laboratory, Mayo Clinic, Rochester, MN, 55905, USA.
7
Brookes DH, Head-Gordon T. Experimental Inferential Structure Determination of Ensembles for Intrinsically Disordered Proteins. J Am Chem Soc 2016; 138:4530-8. [PMID: 26967199 DOI: 10.1021/jacs.6b00351] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Indexed: 01/08/2023]
Abstract
We develop a Bayesian approach to determine the most probable structural ensemble model from candidate structures for intrinsically disordered proteins (IDPs) that takes full advantage of NMR chemical shifts and J-coupling data, their known errors and variances, and the quality of the theoretical back-calculation from structure to experimental observables. Our approach differs from previous formulations in the optimization of experimental and back-calculation nuisance parameters that are treated as random variables with known distributions, as opposed to structural or ensemble weight optimization or use of a reference ensemble. The resulting experimental inferential structure determination (EISD) method is size extensive with O(N) scaling, with N = number of structures, that allows for the rapid ranking of large ensemble data comprising tens of thousands of conformations. We apply the EISD approach on singular folded proteins and a corresponding set of ∼25 000 misfolded states to illustrate the problems that can arise using Boltzmann weighted priors. We then apply the EISD method to rank IDP ensembles most consistent with the NMR data and show that the primary error for ranking or creating good IDP ensembles resides in the poor back-calculation from structure to simulated experimental observable. We show that a reduction by a factor of 3 in the uncertainty of the back-calculation error can improve the discrimination among qualitatively different IDP ensembles for the amyloid-beta peptide.
Affiliation(s)
- David H Brookes
- Department of Chemistry, Department of Bioengineering, Department of Chemical and Biomolecular Engineering, Chemical Sciences Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, California 94720, United States
- Teresa Head-Gordon
- Department of Chemistry, Department of Bioengineering, Department of Chemical and Biomolecular Engineering, Chemical Sciences Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, California 94720, United States
8
Bhowmick A, Sharma SC, Honma H, Head-Gordon T. The role of side chain entropy and mutual information for improving the de novo design of Kemp eliminases KE07 and KE70. Phys Chem Chem Phys 2016; 18:19386-96. [DOI: 10.1039/c6cp03622h] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 11/21/2022]
Abstract
Side chain entropy and mutual entropy information between residue pairs have been calculated for two de novo designed Kemp eliminase enzymes, KE07 and KE70, and for their most improved versions at the end of laboratory directed evolution (LDE).
Affiliation(s)
- Asmit Bhowmick
- Department of Chemical and Biomolecular Engineering, University of California Berkeley, Berkeley, USA
- Sudhir C. Sharma
- Department of Chemistry, University of California Berkeley, Berkeley, USA
- Hallie Honma
- Department of Bioengineering, University of California Berkeley, Berkeley, USA
- Teresa Head-Gordon
- Department of Chemical and Biomolecular Engineering, University of California Berkeley, Berkeley, USA
- Department of Chemistry
9
Xun S, Jiang F, Wu YD. Significant Refinement of Protein Structure Models Using a Residue-Specific Force Field. J Chem Theory Comput 2015; 11:1949-56. [PMID: 26574396 DOI: 10.1021/acs.jctc.5b00029] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 01/12/2023]
Abstract
An important application of all-atom explicit-solvent molecular dynamics (MD) simulations is the refinement of protein structures from low-resolution experiments or template-based modeling. A critical requirement is that the native structure is stable with the force field. We have applied a recently developed residue-specific force field, RSFF1, to a set of 30 refinement targets from recent CASP experiments. Starting from their experimental structures, 1.0 μs unrestrained simulations at 298 K retain most of the native structures quite well except for a few flexible terminals and long internal loops. Starting from each homology model, a 150 ns MD simulation at 380 K generates the best RMSD improvement of 0.85 Å on average. The structural improvements roughly correlate with the RMSD of the initial homology models, indicating possible consistent structure refinement. Finally, targets TR614 and TR624 have been subjected to long-time replica-exchange MD simulations. Significant structural improvements are generated, with RMSD of 1.91 and 1.36 Å with respect to their crystal structures. Thus, it is possible to achieve realistic refinement of protein structure models to near-experimental accuracy, using accurate force field with sufficient conformational sampling.
Affiliation(s)
- Sangni Xun
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Fan Jiang
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Yun-Dong Wu
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China; College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
10
Bhowmick A, Head-Gordon T. A Monte Carlo Method for Generating Side Chain Structural Ensembles. Structure 2015; 23:44-55. [DOI: 10.1016/j.str.2014.10.011] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 02/24/2014] [Revised: 10/20/2014] [Accepted: 10/21/2014] [Indexed: 11/29/2022]
11
Jayaram B, Dhingra P, Mishra A, Kaushik R, Mukherjee G, Singh A, Shekhar S. Bhageerath-H: a homology/ab initio hybrid server for predicting tertiary structures of monomeric soluble proteins. BMC Bioinformatics 2014; 15 Suppl 16:S7. [PMID: 25521245 PMCID: PMC4290660 DOI: 10.1186/1471-2105-15-s16-s7] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 02/06/2023]
Abstract
BACKGROUND The advent of the human genome sequencing project has led to a spurt in the number of protein sequences in the databanks. Success of structure based drug discovery severely hinges on the availability of structures. Despite significant progress in the area of experimental protein structure determination, the sequence-structure gap is continually widening. Data driven homology based computational methods have proved successful in predicting tertiary structures for sequences sharing medium to high sequence similarities. With dwindling similarities of query sequences, advanced homology/ab initio hybrid approaches are being explored to solve the structure prediction problem. Here we describe Bhageerath-H, a homology/ab initio hybrid software/server for predicting protein tertiary structures with advancing drug design attempts as one of the goals.
RESULTS The Bhageerath-H web-server was validated on 75 CASP10 targets, which showed TM-scores ≥ 0.5 in 91% of the cases and Cα RMSDs ≤ 5 Å from the native in 58% of the targets, which is well above the CASP10 water mark. Comparison with some leading servers demonstrated the uniqueness of the hybrid methodology in effectively sampling conformational space, scoring best decoys and refining low resolution models to high and medium resolution.
CONCLUSION The Bhageerath-H methodology is web enabled for the scientific community as a freely accessible web server. The methodology is fielded in the on-going CASP11 experiment.
12
Ryu H, Kim TR, Ahn S, Ji S, Lee J. Protein NMR structures refined without NOE data. PLoS One 2014; 9:e108888. [PMID: 25279564 PMCID: PMC4184813 DOI: 10.1371/journal.pone.0108888] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/24/2014] [Accepted: 09/04/2014] [Indexed: 12/31/2022]
Abstract
The refinement of low-quality structures is an important challenge in protein structure prediction. Many studies have been conducted on protein structure refinement; the refinement of structures derived from NMR spectroscopy has been especially intensively studied. In this study, we generated flat-bottom distance potential instead of NOE data because NOE data have ambiguity and uncertainty. The potential was derived from distance information from given structures and prevented structural dislocation during the refinement process. A simulated annealing protocol was used to minimize the potential energy of the structure. The protocol was tested on 134 NMR structures in the Protein Data Bank (PDB) that also have X-ray structures. Among them, 50 structures were used as a training set to find the optimal "width" parameter in the flat-bottom distance potential functions. In the validation set (the other 84 structures), most of the 12 quality assessment scores of the refined structures were significantly improved (total score increased from 1.215 to 2.044). Moreover, the secondary structure similarity of the refined structure was improved over that of the original structure. Finally, we demonstrate that the combination of two energy potentials, statistical torsion angle potential (STAP) and the flat-bottom distance potential, can drive the refinement of NMR structures.
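The flat-bottom distance potential described in the abstract above can be sketched as follows. The abstract does not give the functional form, so this sketch assumes a common choice: zero energy while a distance stays within `width` of its reference value, and a harmonic penalty outside that range. The function name, the force constant `k`, and the example distances are hypothetical and are not taken from the paper.

```python
def flat_bottom_energy(distance, reference, width, k=1.0):
    """Flat-bottom restraint energy for one atom pair (illustrative form).

    Assumption: the penalty is zero inside [reference - width,
    reference + width] and grows quadratically with the excess outside it.
    """
    excess = abs(distance - reference) - width
    if excess <= 0.0:
        return 0.0  # inside the flat region: no restraint force
    return k * excess * excess  # harmonic wall beyond the flat region


# A pair whose reference distance (taken from the input structure) is 5.0 Å.
inside = flat_bottom_energy(5.8, 5.0, width=1.0)          # within the flat bottom
outside = flat_bottom_energy(7.0, 5.0, width=1.0, k=2.0)  # 1.0 Å past the wall
print(inside, outside)
```

A wider flat region lets the structure move further before the restraint acts, which is the trade-off the abstract's "width" parameter search addresses.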
Affiliation(s)
- Hyojung Ryu
- Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology, Daejeon, The Republic of Korea
- Department of Bioinformatics, University of Science and Technology, Daejeon, The Republic of Korea
- Tae-Rae Kim
- Department of Chemistry, Seoul National University, Seoul, The Republic of Korea
- SeonJoo Ahn
- Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology, Daejeon, The Republic of Korea
- Sunyoung Ji
- Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology, Daejeon, The Republic of Korea
- Department of Bioinformatics, University of Science and Technology, Daejeon, The Republic of Korea
- Jinhyuk Lee
- Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology, Daejeon, The Republic of Korea
- Department of Bioinformatics, University of Science and Technology, Daejeon, The Republic of Korea
13
Nugent T, Cozzetto D, Jones DT. Evaluation of predictions in the CASP10 model refinement category. Proteins 2014; 82 Suppl 2:98-111. [PMID: 23900810 PMCID: PMC4282348 DOI: 10.1002/prot.24377] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Received: 04/10/2013] [Revised: 06/19/2013] [Accepted: 06/28/2013] [Indexed: 12/24/2022]
Abstract
Here we report on the assessment results of the third experiment to evaluate the state of the art in protein model refinement, where participants were invited to improve the accuracy of initial protein models for 27 targets. Using an array of complementary evaluation measures, we find that five groups performed better than the naïve (null) method, a marked improvement over CASP9, although only three were significantly better. The leading groups also demonstrated the ability to consistently improve both backbone and side chain positioning, while other groups reliably enhanced other aspects of protein physicality. The top-ranked group succeeded in improving the backbone conformation in almost 90% of targets, suggesting a strategy that for the first time in CASP refinement is successful in a clear majority of cases. A number of issues remain unsolved: the majority of groups still fail to improve the quality of the starting models; even successful groups are only able to make modest improvements; and no prediction is more similar to the native structure than to the starting model. Successful refinement attempts also often go unrecognized, as suggested by the relatively larger improvements when predictions not submitted as model 1 are also considered.
Affiliation(s)
- Timothy Nugent
- Department of Computer Science Bioinformatics Group, University College London, London, WC1E 6BT, United Kingdom
14
Petrella RJ. Optimization bias in energy-based structure prediction. J Theor Comput Chem 2013; 12:1341014. [PMID: 25552783 PMCID: PMC4278582 DOI: 10.1142/s0219633613410149] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
Physics-based computational approaches to predicting the structure of macromolecules such as proteins are gaining increased use, but there are remaining challenges. In the current work, it is demonstrated that in energy-based prediction methods, the degree of optimization of the sampled structures can influence the prediction results. In particular, discrepancies in the degree of local sampling can bias the predictions in favor of the oversampled structures by shifting the local probability distributions of the minimum sampled energies. In simple systems, it is shown that the magnitude of the errors can be calculated from the energy surface, and for certain model systems, derived analytically. Further, it is shown that for energy wells whose forms differ only by a randomly assigned energy shift, the optimal accuracy of prediction is achieved when the sampling around each structure is equal. Energy correction terms can be used in cases of unequal sampling to reproduce the total probabilities that would occur under equal sampling, but optimal corrections only partially restore the prediction accuracy lost to unequal sampling. For multiwell systems, the determination of the correction terms is a multibody problem; it is shown that the involved cross-correlation multiple integrals can be reduced to simpler integrals. The possible implications of the current analysis for macromolecular structure prediction are discussed.
Affiliation(s)
- Robert J. Petrella
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
- Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
15
Mirjalili V, Feig M. Protein Structure Refinement through Structure Selection and Averaging from Molecular Dynamics Ensembles. J Chem Theory Comput 2013; 9:1294-1303. [PMID: 23526422 PMCID: PMC3603382 DOI: 10.1021/ct300962x] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Indexed: 01/06/2023]
Abstract
A molecular dynamics (MD) simulation based protocol for structure refinement of template-based model predictions is described. The protocol involves the application of restraints, ensemble averaging of selected subsets, interpolation between initial and refined structures, and assessment of refinement success. It is found that sub-microsecond MD-based sampling when combined with ensemble averaging can produce moderate but consistent refinement for most systems in the CASP targets considered here.
Affiliation(s)
- Vahid Mirjalili
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Department of Mechanical Engineering, Michigan State University, East Lansing, MI 48824, USA
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
16
Chitsaz M, Mayo SL. GRID: a high-resolution protein structure refinement algorithm. J Comput Chem 2012; 34:445-50. [PMID: 23065773 DOI: 10.1002/jcc.23151] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/12/2012] [Revised: 07/31/2012] [Accepted: 08/27/2012] [Indexed: 12/27/2022]
Abstract
The energy-based refinement of protein structures generated by fold prediction algorithms to atomic-level accuracy remains a major challenge in structural biology. Energy-based refinement is mainly dependent on two components: (1) sufficiently accurate force fields, and (2) efficient conformational space search algorithms. Focusing on the latter, we developed a high-resolution refinement algorithm called GRID. It takes a three-dimensional protein structure as input and, using an all-atom force field, attempts to improve the energy of the structure by systematically perturbing backbone dihedrals and side-chain rotamer conformations. We compare GRID to Backrub, a stochastic algorithm that has been shown to predict a significant fraction of the conformational changes that occur with point mutations. We applied GRID and Backrub to 10 high-resolution (≤ 2.8 Å) crystal structures from the Protein Data Bank and measured the energy improvements obtained and the computation times required to achieve them. GRID resulted in energy improvements that were significantly better than those attained by Backrub while expending about the same amount of computational resources. GRID resulted in relaxed structures that had slightly higher backbone RMSDs compared to Backrub relative to the starting crystal structures. The average RMSD was 0.25 ± 0.02 Å for GRID versus 0.14 ± 0.04 Å for Backrub. These relatively minor deviations indicate that both algorithms generate structures that retain their original topologies, as expected given the nature of the algorithms.
Affiliation(s)
- Mohsen Chitsaz
- Biochemistry and Molecular Biophysics Option, California Institute of Technology, Pasadena, California 91125, USA
17
Raval A, Piana S, Eastwood MP, Dror RO, Shaw DE. Refinement of protein structure homology models via long, all-atom molecular dynamics simulations. Proteins 2012; 80:2071-9. [PMID: 22513870 DOI: 10.1002/prot.24098] [Citation(s) in RCA: 184] [Impact Index Per Article: 14.2] [Received: 02/07/2012] [Revised: 04/03/2012] [Accepted: 04/11/2012] [Indexed: 11/07/2022]
Abstract
Accurate computational prediction of protein structure represents a longstanding challenge in molecular biology and structure-based drug design. Although homology modeling techniques are widely used to produce low-resolution models, refining these models to high resolution has proven difficult. With long enough simulations and sufficiently accurate force fields, molecular dynamics (MD) simulations should in principle allow such refinement, but efforts to refine homology models using MD have for the most part yielded disappointing results. It has thus far been unclear whether MD-based refinement is limited primarily by accessible simulation timescales, force field accuracy, or both. Here, we examine MD as a technique for homology model refinement using all-atom simulations, each at least 100 μs long (more than 100 times longer than previous refinement simulations), and a physics-based force field that was recently shown to successfully fold a structurally diverse set of fast-folding proteins. In MD simulations of 24 proteins chosen from the refinement category of recent Critical Assessment of Structure Prediction (CASP) experiments, we find that in most cases, simulations initiated from homology models drift away from the native structure. Comparison with simulations initiated from the native structure suggests that force field accuracy is the primary factor limiting MD-based refinement. This problem can be mitigated to some extent by restricting sampling to the neighborhood of the initial model, leading to structural improvement that, while limited, is roughly comparable to the leading alternative methods.
Affiliation(s)
- Alpan Raval
- D E Shaw Research, New York, New York 10036, USA