1. Can docking scoring functions guarantee success in virtual screening? Virtual Screening and Drug Docking 2022. [DOI: 10.1016/bs.armc.2022.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
2. Feig M. Computational protein structure refinement: Almost there, yet still so far to go. Wiley Interdisciplinary Reviews: Computational Molecular Science 2017; 7:e1307. [PMID: 30613211 PMCID: PMC6319934 DOI: 10.1002/wcms.1307] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 12/21/2022]
Abstract
Protein structures are essential in modern biology, yet experimental methods are far from being able to catch up with the rapid increase in available genomic data. Computational protein structure prediction methods aim to fill the gap, while the role of protein structure refinement is to take approximate initial template-based models and bring them closer to the true native structure. Current methods for computational structure refinement rely on molecular dynamics simulations, related sampling methods, or iterative structure optimization protocols. The best methods are able to achieve moderate degrees of refinement, but consistent refinement that can reach near-experimental accuracy remains elusive. Key issues revolve around the accuracy of the energy function, the inability to reliably rank multiple models, and the use of restraints that keep sampling close to the native state but also limit the degree of possible refinement. A different aspect is the question of what exactly the target of high-resolution refinement should be, as experimental structures are affected by experimental conditions and different biological questions require varying levels of accuracy. While improvement of the global protein structure is a difficult problem, high-resolution refinement methods that improve local structural quality, such as favorable stereochemistry and the avoidance of atomic clashes, are much more successful.
Affiliation(s)
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, 603 Wilson Rd., Room 218 BCH, East Lansing, MI, USA; 517-432-7439
3. Xun S, Jiang F, Wu YD. Significant Refinement of Protein Structure Models Using a Residue-Specific Force Field. J Chem Theory Comput 2015; 11:1949-56. [PMID: 26574396 DOI: 10.1021/acs.jctc.5b00029] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 01/12/2023]
Abstract
An important application of all-atom explicit-solvent molecular dynamics (MD) simulations is the refinement of protein structures from low-resolution experiments or template-based modeling. A critical requirement is that the native structure is stable with the force field. We have applied a recently developed residue-specific force field, RSFF1, to a set of 30 refinement targets from recent CASP experiments. Starting from their experimental structures, 1.0 μs unrestrained simulations at 298 K retain most of the native structures quite well except for a few flexible termini and long internal loops. Starting from each homology model, a 150 ns MD simulation at 380 K generates the best RMSD improvement of 0.85 Å on average. The structural improvements roughly correlate with the RMSD of the initial homology models, indicating possible consistent structure refinement. Finally, targets TR614 and TR624 have been subjected to long-time replica-exchange MD simulations. Significant structural improvements are generated, with RMSD of 1.91 and 1.36 Å with respect to their crystal structures. Thus, it is possible to achieve realistic refinement of protein structure models to near-experimental accuracy, using an accurate force field with sufficient conformational sampling.
Affiliation(s)
- Sangni Xun
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Fan Jiang
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Yun-Dong Wu
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China; College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
4.
Abstract
By focusing on essential features, while averaging over less important details, coarse-grained (CG) models provide significant computational and conceptual advantages with respect to more detailed models. Consequently, despite dramatic advances in computational methodologies and resources, CG models enjoy surging popularity and are becoming increasingly equal partners to atomically detailed models. This perspective surveys the rapidly developing landscape of CG models for biomolecular systems. In particular, this review seeks to provide a balanced, coherent, and unified presentation of several distinct approaches for developing CG models, including top-down, network-based, native-centric, knowledge-based, and bottom-up modeling strategies. The review summarizes their basic philosophies, theoretical foundations, typical applications, and recent developments. Additionally, the review identifies fundamental inter-relationships among the diverse approaches and discusses outstanding challenges in the field. When carefully applied and assessed, current CG models provide highly efficient means for investigating the biological consequences of basic physicochemical principles. Moreover, rigorous bottom-up approaches hold great promise for further improving the accuracy and scope of CG models for biomolecular systems.
Affiliation(s)
- W G Noid
- Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
5. Mirjalili V, Noyes K, Feig M. Physics-based protein structure refinement through multiple molecular dynamics trajectories and structure averaging. Proteins 2014; 82 Suppl 2:196-207. [PMID: 23737254 PMCID: PMC4212311 DOI: 10.1002/prot.24336] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Received: 03/21/2013] [Revised: 04/30/2013] [Accepted: 05/09/2013] [Indexed: 12/26/2022]
Abstract
We used molecular dynamics (MD) simulations for structure refinement of Critical Assessment of Techniques for Protein Structure Prediction 10 (CASP10) targets. Refinement was achieved by selecting structures from the MD-based ensembles followed by structural averaging. The overall performance of this method in CASP10 is described, and specific aspects are analyzed in detail to provide insight into key components. In particular, the use of different restraint types, sampling from multiple short simulations versus a single long simulation, the success of a quality assessment criterion, the application of scoring versus averaging, and the impact of a final refinement step are discussed in detail.
Affiliation(s)
- Vahid Mirjalili
- Department of Mechanical Engineering, Michigan State University, East Lansing, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Keenan Noyes
- Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
- Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
6. Khoury GA, Thompson JP, Smadbeck J, Kieslich CA, Floudas CA. Forcefield_PTM: Ab Initio Charge and AMBER Forcefield Parameters for Frequently Occurring Post-Translational Modifications. J Chem Theory Comput 2013; 9:5653-5674. [PMID: 24489522 PMCID: PMC3904396 DOI: 10.1021/ct400556v] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.2] [Indexed: 12/16/2022]
Abstract
In this work, we introduce Forcefield_PTM, a set of AMBER forcefield parameters consistent with ff03 for 32 common post-translational modifications. Partial charges were calculated through ab initio calculations and a two-stage RESP-fitting procedure in an ether-like implicit solvent environment. The charges were found to be generally consistent with others previously reported for phosphorylated amino acids, and trimethyllysine, using different parameterization methods. Pairs of modified and their corresponding unmodified structures were curated from the PDB for both single and multiple modifications. Background structural similarity was assessed in the context of secondary and tertiary structures from the global dataset. Next, the charges derived for Forcefield_PTM were tested on a macroscopic scale using unrestrained all-atom Langevin molecular dynamics simulations in AMBER for 34 (17 pairs of modified/unmodified) systems in implicit solvent. Assessment was performed in the context of secondary structure preservation, stability in energies, and correlations between the modified and unmodified structure trajectories on the aggregate. As an illustration of their utility, the parameters were used to compare the structural stability of the phosphorylated and dephosphorylated forms of OdhI. Microscopic comparisons between quantum and AMBER single point energies along key χ torsions on several PTMs were performed and corrections to improve their agreement in terms of mean squared errors and squared correlation coefficients were parameterized. This forcefield for post-translational modifications in condensed-phase simulations can be applied to a number of biologically relevant and timely applications including protein structure prediction, protein and peptide design, docking, and to study the effect of PTMs on folding and dynamics. 
We make the derived parameters and an associated interactive webtool capable of performing post-translational modifications on proteins using Forcefield_PTM available at http://selene.princeton.edu/FFPTM.
Affiliation(s)
- George A. Khoury
- Department of Chemical and Biological Engineering, Princeton, NJ, USA
- Jeff P. Thompson
- Department of Chemical and Biological Engineering, Princeton, NJ, USA
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton, NJ, USA
- Chris A. Kieslich
- Department of Chemical and Biological Engineering, Princeton, NJ, USA
7. Khoury GA, Tamamis P, Pinnaduwage N, Smadbeck J, Kieslich CA, Floudas CA. Princeton_TIGRESS: protein geometry refinement using simulations and support vector machines. Proteins 2013; 82:794-814. [PMID: 24174311 DOI: 10.1002/prot.24459] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 07/17/2013] [Revised: 10/18/2013] [Accepted: 10/22/2013] [Indexed: 12/30/2022]
Abstract
Protein structure refinement aims to perform a set of operations given a predicted structure to improve model quality and accuracy with respect to the native in a blind fashion. Despite the numerous computational approaches to the protein refinement problem reported in the previous three CASPs, an overwhelming majority of methods degrade models rather than improve them. We initially developed a method tested using blind predictions during CASP10, which was officially ranked in 5th place among all methods in the refinement category. Here, we present Princeton_TIGRESS, which, when benchmarked on all CASP7, 8, 9, and 10 refinement targets, simultaneously increased GDT_TS 76% of the time with an average improvement of 0.83 GDT_TS points per structure. The method was additionally benchmarked on models produced by top-performing three-dimensional structure prediction servers during CASP10. The robustness of the Princeton_TIGRESS protocol was also tested for different random seeds. We make the Princeton_TIGRESS refinement protocol freely available as a web server at http://atlas.princeton.edu/refinement. Using this protocol, one can consistently refine a prediction to help bridge the gap between a predicted structure and the actual native structure.
Affiliation(s)
- George A Khoury
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, 08540
8. Mutation induced structural variation in membrane proteins. Chem Res Chin Univ 2013. [DOI: 10.1007/s40242-013-2427-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
9. Bhattacharya D, Cheng J. i3Drefine software for protein 3D structure refinement and its assessment in CASP10. PLoS One 2013; 8:e69648. [PMID: 23894517 PMCID: PMC3716612 DOI: 10.1371/journal.pone.0069648] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 04/25/2013] [Accepted: 06/13/2013] [Indexed: 12/25/2022] [Open access]
Abstract
Protein structure refinement refers to the process of improving the quality of protein structures during structure modeling to bring them closer to their native states. Structure refinement has been drawing increasing attention in the community-wide Critical Assessment of techniques for Protein Structure prediction (CASP) experiments since its addition in the 8th CASP experiment. During the 9th and recently concluded 10th CASP experiments, a consistent growth in the number of refinement targets and participating groups has been witnessed. Yet protein structure refinement remains a largely unsolved problem, with the majority of participating groups in the CASP refinement category failing to consistently improve the quality of structures issued for refinement. To address this need, we developed a completely automated and computationally efficient protein 3D structure refinement method, i3Drefine, based on an iterative and highly convergent energy minimization algorithm with powerful all-atom composite physics and knowledge-based force fields and a hydrogen bonding (HB) network optimization technique. In the recent community-wide blind experiment, CASP10, i3Drefine (as ‘MULTICOM-CONSTRUCT’) was ranked as the best method in the server section as per the official assessment of the CASP10 experiment. Here we provide the community with free access to the i3Drefine software, systematically analyse the performance of i3Drefine in strict blind mode on the refinement targets issued in the CASP10 refinement category, and compare it with other state-of-the-art refinement methods participating in CASP10. Our analysis demonstrates that i3Drefine is the only fully automated server participating in CASP10 exhibiting consistent improvement over the initial structures in both global and local structural quality metrics. An executable version of i3Drefine is freely available at http://protein.rnet.missouri.edu/i3drefine/.
Affiliation(s)
- Debswapna Bhattacharya
- Department of Computer Science, University of Missouri, Columbia, Missouri, United States of America
- Jianlin Cheng
- Department of Computer Science, Informatics Institute, Bond Life Science Center, University of Missouri, Columbia, Missouri, United States of America
10. Atomic-level protein structure refinement using fragment-guided molecular dynamics conformation sampling. Structure 2012; 19:1784-95. [PMID: 22153501 DOI: 10.1016/j.str.2011.09.022] [Citation(s) in RCA: 248] [Impact Index Per Article: 20.7] [Received: 07/16/2011] [Revised: 09/19/2011] [Accepted: 09/24/2011] [Indexed: 11/22/2022]
Abstract
One of the critical difficulties of molecular dynamics (MD) simulations in protein structure refinement is that the physics-based energy landscape lacks a middle-range funnel to guide nonnative conformations toward near-native states. We propose to use the target model as a probe to identify fragmental analogs from the PDB. The distance maps are then used to reshape the MD energy funnel. The protocol was tested on 181 benchmarking and 26 CASP targets. It was found that structure models of correct folds with TM-score >0.5 can often be pulled closer to native with a higher GDT-HA score, but improvements for models of incorrect folds (TM-score <0.5) are much less pronounced. These data indicate that template-based fragmental distance maps essentially reshaped the MD energy landscape from golf-course-like to funnel-like in the successfully refined targets, with a radius of TM-score ∼0.5. These results demonstrate a new avenue to improve high-resolution structures by combining knowledge-based template information with physics-based MD simulations.
11. Gront D, Kmiecik S, Blaszczyk M, Ekonomiuk D, Koliński A. Optimization of protein models. Wiley Interdisciplinary Reviews: Computational Molecular Science 2012. [DOI: 10.1002/wcms.1090] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Affiliation(s)
- Dominik Gront
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Sebastian Kmiecik
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Maciej Blaszczyk
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Dariusz Ekonomiuk
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Andrzej Koliński
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
12. Du S, Harano Y, Kinoshita M, Sakurai M. A scoring function based on solvation thermodynamics for protein structure prediction. Biophysics (Nagoya-shi) 2012; 8:127-38. [PMID: 27493529 PMCID: PMC4629643 DOI: 10.2142/biophysics.8.127] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/27/2012] [Accepted: 07/31/2012] [Indexed: 12/01/2022] [Open access]
Abstract
We predict protein structure using our recently developed free energy function for describing protein stability, which is focused on solvation thermodynamics. The function is combined with the current most reliable sampling methods, i.e., fragment assembly (FA) and comparative modeling (CM). The prediction is tested using 11 small proteins for which high-resolution crystal structures are available. For 8 of these proteins, sequence similarities are found in the database, and the prediction is performed with CM. Fairly accurate models with average Cα root mean square deviation (RMSD) ∼ 2.0 Å are successfully obtained for all cases. For the rest of the target proteins, we perform the prediction following FA protocols. For 2 cases, we obtain predicted models with an RMSD ∼ 3.0 Å as the best-scored structures. For the other case, the RMSD remains larger than 7 Å. For all the 11 target proteins, our scoring function identifies the experimentally determined native structure as the best structure. Starting from the predicted structure, replica exchange molecular dynamics is performed to further refine the structures. However, we are unable to improve its RMSD toward the experimental structure. The exhaustive sampling by coarse-grained normal mode analysis around the native structures reveals that our function has a linear correlation with RMSDs < 3.0 Å. These results suggest that the function is quite reliable for the protein structure prediction while the sampling method remains one of the major limiting factors in it. The aspects through which the methodology could further be improved are discussed.
Affiliation(s)
- Shiqiao Du
- Center for Biological Resources and Informatics, Tokyo Institute of Technology, Yokohama 226-8501, Japan
- Yuichi Harano
- Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Masahiro Kinoshita
- Institute of Advanced Energy, Kyoto University, Uji, Kyoto 611-0011, Japan
- Minoru Sakurai
- Center for Biological Resources and Informatics, Tokyo Institute of Technology, Yokohama 226-8501, Japan
13. Danielson ML, Lill MA. Predicting flexible loop regions that interact with ligands: the challenge of accurate scoring. Proteins 2011; 80:246-60. [PMID: 22072600 DOI: 10.1002/prot.23199] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 04/20/2011] [Revised: 09/06/2011] [Accepted: 09/13/2011] [Indexed: 01/12/2023]
Abstract
Flexible loop regions play a critical role in the biological function of many proteins and have been shown to be involved in ligand binding. In the context of structure-based drug design, using or predicting an incorrect loop configuration can be detrimental to the study if the loop is capable of interacting with the ligand. Three protein systems, each with at least one flexible loop region in close proximity to the known binding site, were selected for loop prediction using the CorLps program; a six residue loop region from phosphoribosylglycinamide formyltransferase (GART), two nine residue loop regions from cytochrome P450 (CYP) 119, and an 11 residue loop region from enolase were selected for loop prediction. The results of this study indicate that the statistically based DFIRE scoring function implemented in the CorLps program did not accurately rank native-like predicted loop configurations in any protein system. In an attempt to improve the ranking of the native-like predicted loop configurations, the MM/GBSA and the optimized MM/GBSA-dsr scoring functions were used to re-rank the predicted loops with and without bound ligand. In general, single snapshot MM/GBSA scoring provided the best ranking of native-like loop configurations. Based on the scoring function analyses presented, the optimal ranking of native-like loop configurations is still a difficult challenge and the choice of the "best" scoring function appears to be system dependent.
Affiliation(s)
- Matthew L Danielson
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN 47907, USA
14. MacCallum JL, Pérez A, Schnieders MJ, Hua L, Jacobson MP, Dill KA. Assessment of protein structure refinement in CASP9. Proteins 2011; 79 Suppl 10:74-90. [PMID: 22069034 DOI: 10.1002/prot.23131] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Received: 03/31/2011] [Revised: 06/15/2011] [Accepted: 07/03/2011] [Indexed: 11/06/2022]
Abstract
We assess performance in the structure refinement category in CASP9. Two years after CASP8, the performance of the best groups has not improved. There are few groups that improve any of our assessment scores with statistical significance. Some predictors, however, are able to consistently improve the physicality of the models. Although we cannot identify any clear bottleneck in improving refinement, several points arise: (1) The refinement portion of CASP has too few targets to make many statistically meaningful conclusions. (2) Predictors are usually very conservative, limiting the possibility of large improvements in models. (3) No group is actually able to correctly rank their five submissions, indicating that potentially better models may be discarded. (4) Different sampling strategies work better for different refinement problems; there is no single strategy that works on all targets. In general, conservative strategies do better, while the greatest improvements come from more adventurous sampling, at the cost of consistency. Comparison with experimental data reveals aspects not captured by comparison to a single structure. In particular, we show that improvement in backbone geometry does not always mean better agreement with experimental data. Finally, we demonstrate that even given the current challenges facing refinement, the refined models are useful for solving the crystallographic phase problem through molecular replacement.
Affiliation(s)
- Justin L MacCallum
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA.
15. Olson MA, Chaudhury S, Lee MS. Comparison between self-guided Langevin dynamics and molecular dynamics simulations for structure refinement of protein loop conformations. J Comput Chem 2011; 32:3014-22. [PMID: 21793008 DOI: 10.1002/jcc.21883] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 05/16/2011] [Accepted: 06/19/2011] [Indexed: 11/09/2022]
Abstract
This article presents a comparative analysis of two replica-exchange simulation methods for the structure refinement of protein loop conformations, starting from low-resolution predictions. The methods are self-guided Langevin dynamics (SGLD) and molecular dynamics (MD) with a Nosé-Hoover thermostat. We investigated a small dataset of 8- and 12-residue loops, with the shorter loops placed initially from a coarse-grained lattice model and the longer loops from an enumeration assembly method (the Loopy program). The CHARMM22 + CMAP force field with a generalized Born implicit solvent model (molecular-surface parameterized GBSW2) was used to explore conformational space. We also assessed two empirical scoring methods to detect nativelike conformations from decoys: the all-atom distance-scaled ideal-gas reference state (DFIRE-AA) statistical potential and the Rosetta energy function. Among the eight-residue loop targets, SGLD outperformed MD in all cases, with a median of 0.48 Å reduction in global root-mean-square deviation (RMSD) of the loop backbone coordinates from the native structure. Among the more challenging 12-residue loop targets, SGLD improved the prediction accuracy over MD by a median of 1.31 Å, representing a substantial improvement. The overall median RMSD for SGLD simulations of 12-residue loops was 0.91 Å, yielding refinement of a median 2.70 Å from initial loop placement. Results from DFIRE-AA and the Rosetta model applied to rescoring conformations failed to improve the overall detection calculated from the CHARMM force field. We illustrate the advantage of SGLD over the MD simulation model by presenting potential-energy landscapes for several loop predictions. Our results demonstrate that SGLD significantly outperforms traditional MD in the generation and populating of nativelike loop conformations and that the CHARMM force field performs comparably to other empirical force fields in identifying these conformations from the resulting ensembles.
Affiliation(s)
- Mark A Olson
- Department of Cell Biology and Biochemistry, US Army Medical Research Institute of Infectious Diseases, Frederick, Maryland 21702, USA
16. Li DW, Brüschweiler R. Iterative Optimization of Molecular Mechanics Force Fields from NMR Data of Full-Length Proteins. J Chem Theory Comput 2011; 7:1773-82. [PMID: 26596440 DOI: 10.1021/ct200094b] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.2] [Indexed: 01/05/2023]
Abstract
High-quality molecular mechanics force fields of proteins are key for the quantitative interpretation of experimental data and the predictive understanding of protein function based on computer simulations. A strategy is presented for the optimization of protein force fields based on full-length proteins in their native environment that is guided by experimental NMR chemical shifts and residual dipolar couplings (RDCs). An energy-based reweighting approach is applied to a long molecular dynamics trajectory, performed with a parent force field, to efficiently screen a large number of trial force fields. The force field that yields the best agreement with the experimental data is then used as the new parent force field, and the procedure is repeated until no further improvement is obtained. This method is demonstrated for the optimization of the backbone φ,ψ dihedral angle potential of the Amber ff99SB force field using six trial proteins and another 17 proteins for cross-validation using ¹³C chemical shifts with and without backbone RDCs. The φ,ψ dihedral angle potential is systematically improved by the inclusion of correlation effects through the addition of up to 24 bivariate Gaussian functions of variable height, width, and tilt angle. The resulting force fields, termed ff99SB_φψ(g24;CS) and ff99SB_φψ(g8;CS,RDC), perform significantly better than their parent force field in terms of both NMR data reproduction and Cartesian coordinate root-mean-square deviations between the MD trajectories and the X-ray crystal structures. The strategy introduced here represents a powerful addition to force field optimization approaches by overcoming shortcomings of methods that are solely based on quantum-chemical calculations of small molecules and protein fragments in the gas phase.
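The energy-based reweighting idea in this abstract can be illustrated with a short sketch. This is not the authors' implementation: it only shows the standard Boltzmann reweighting of frames sampled under a parent energy function so that averages approximate those of a trial energy function. The function names, the value of kT, and the toy numbers in the usage note are all assumptions made for illustration.

```python
import numpy as np

KT_KCAL_PER_MOL = 0.592  # assumed thermal energy near 298 K, in kcal/mol


def reweight(e_parent, e_trial, kt=KT_KCAL_PER_MOL):
    """Return normalized weights w_i proportional to exp(-(E_trial_i - E_parent_i) / kT)."""
    delta = (np.asarray(e_trial, dtype=float) - np.asarray(e_parent, dtype=float)) / kt
    delta -= delta.min()  # constant shift for numerical stability; cancels on normalization
    weights = np.exp(-delta)
    return weights / weights.sum()


def reweighted_average(observable, weights):
    """Ensemble average of a per-frame observable under the trial energy function."""
    return float(np.dot(weights, np.asarray(observable, dtype=float)))


def effective_sample_size(weights):
    """Kish effective sample size; small values mean the trial ensemble is poorly sampled."""
    weights = np.asarray(weights, dtype=float)
    return float(1.0 / np.sum(weights ** 2))
```

As a usage example, `reweight([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])` returns uniform weights, so the reweighted average equals the plain trajectory mean. A low effective sample size signals that the trial force field differs too much from the parent for the existing trajectory to be reused reliably.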
Affiliation(s)
- Da-Wei Li
- Chemical Sciences Laboratory, Department of Chemistry and Biochemistry and National High Magnetic Field Laboratory, Florida State University, Tallahassee, Florida 32306, United States
- Rafael Brüschweiler
- Chemical Sciences Laboratory, Department of Chemistry and Biochemistry and National High Magnetic Field Laboratory, Florida State University, Tallahassee, Florida 32306, United States
17. Shirota M, Ishida T, Kinoshita K. Absolute quality evaluation of protein model structures using statistical potentials with respect to the native and reference states. Proteins 2011; 79:1550-63. [PMID: 21365682 DOI: 10.1002/prot.22982] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/15/2010] [Revised: 11/19/2010] [Accepted: 12/19/2010] [Indexed: 11/06/2022]
Abstract
In protein structure prediction, it is crucial to evaluate the degree of native-likeness of given model structures. Statistical potentials extracted from protein structure data sets are widely used for such quality assessment problems, but they are only applicable for comparing different models of the same protein. Although various other methods, such as machine learning approaches, were developed to predict the absolute similarity of model structures to the native ones, they required a set of decoy structures in addition to the model structures. In this paper, we tried to reformulate the statistical potentials as absolute quality scores, without using the information from decoy structures. For this purpose, we regarded the native state and the reference state, which are necessary components of statistical potentials, as the good and bad standard states, respectively, and first showed that the statistical potentials can be regarded as the state functions, which relate a model structure to the native and reference states. Then, we proposed a standardized measure of protein structure, called native-likeness, by interpolating the score of a model structure between the native and reference state scores defined for each protein. The native-likeness correlated with the similarity to the native structures and discriminated the native structures from the models, with better accuracy than the raw score. Our results show that statistical potentials can quantify the native-like properties of protein structures, if they fully utilize the statistical information obtained from the data set.
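The interpolation described in this abstract can be sketched as a simple normalization. The abstract does not give the formula, so the linear form below is an assumption, as are the function name and the convention that lower scores are better: a model scoring like the native state maps to 1 and a model scoring like the reference state maps to 0.

```python
def native_likeness(model_score, native_score, reference_score):
    """Linearly interpolate a model's score between reference (0) and native (1).

    Assumed form, not taken from the paper:
        (reference_score - model_score) / (reference_score - native_score)
    """
    span = reference_score - native_score
    if span == 0:
        raise ValueError("native and reference scores must differ")
    return (reference_score - model_score) / span
```

For example, with a hypothetical native score of -100 and reference score of 0, a model scoring -50 sits halfway between the two states. Because the value is normalized per protein, it can be compared across different targets, which a raw statistical-potential score does not allow.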
Affiliation(s)
- Matsuyuki Shirota
- Department of Applied Information Sciences, Graduate School of Information Science, Tohoku University, 6-3-09, Aoba, Aramaki, Aoba-Ku, Sendai, Miyagi 980-8579, Japan
|
18
|
Betancourt MR. Optimization of Monte Carlo trial moves for protein simulations. J Chem Phys 2011; 134:014104. [PMID: 21218994 DOI: 10.1063/1.3515960] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
Closed rigid-body rotations of residue segments under bond-angle restraints are simple and effective Monte Carlo moves for searching the conformational space of proteins. The efficiency of these moves is examined here as a function of the number of moving residues and the magnitude of their displacement. It is found that the efficiency of folding and equilibrium simulations can be significantly improved by tailoring the distribution of the number of moving residues to the simulation temperature. In general, simulations exploring compact conformations are more efficient when the average number of moving residues is smaller. It is also demonstrated that the moves do not require additional restrictions on the magnitude of the rotation displacements and perform much better than other rotation moves that do not restrict the bond angles a priori. As an example, these results are applied to the replica exchange method. By assigning distributions that generate a smaller number of moving residues to lower temperature replicas, the simulation times are decreased as long as the higher temperature replicas are effective.
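For readers unfamiliar with the replica exchange method mentioned at the end of this abstract, the following is a minimal sketch of the standard Metropolis criterion for swapping configurations between two replicas. It is the textbook acceptance rule, not the move-optimization scheme of the paper; the function name and the sample values are illustrative assumptions.

```python
from math import exp


def swap_probability(energy_i, energy_j, beta_i, beta_j):
    """Acceptance probability for exchanging configurations of replicas i and j.

    Standard replica exchange rule: min(1, exp((beta_i - beta_j) * (E_i - E_j))),
    where beta = 1 / (k_B * T).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else exp(delta)


# Cold replica (beta = 2.0) holds the higher energy: the swap is always accepted.
print(swap_probability(-5.0, -10.0, 2.0, 1.0))
# Cold replica already holds the lower energy: the swap is accepted only sometimes.
print(swap_probability(-10.0, -5.0, 2.0, 1.0))
```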
Affiliation(s)
- Marcos R Betancourt
- Department of Physics, Indiana University Purdue University Indianapolis, 402 N. Blackford St., LD156-J Indianapolis, Indiana 46202, USA.
|
19
|
Xu T, Zhang L, Wang X, Wei D. A novel protocol of energy optimisation for predicted protein structures built by homology modelling. Molecular Simulation 2010. [DOI: 10.1080/08927022.2010.513771] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]
|
20
|
Lin MS, Head-Gordon T. Reliable protein structure refinement using a physical energy function. J Comput Chem 2010; 32:709-17. [DOI: 10.1002/jcc.21664] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 02/28/2010] [Revised: 08/02/2010] [Accepted: 08/07/2010] [Indexed: 11/10/2022]
|
21
|
Brylinski M, Lee SY, Zhou H, Skolnick J. The utility of geometrical and chemical restraint information extracted from predicted ligand-binding sites in protein structure refinement. J Struct Biol 2010; 173:558-69. [PMID: 20850544 DOI: 10.1016/j.jsb.2010.09.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/13/2010] [Revised: 09/08/2010] [Accepted: 09/10/2010] [Indexed: 01/01/2023]
Abstract
Exhaustive exploration of molecular interactions at the level of complete proteomes requires efficient and reliable computational approaches to protein function inference. Ligand docking and ranking techniques show considerable promise in their ability to quantify the interactions between proteins and small molecules. Despite the advances in the development of docking approaches and scoring functions, the genome-wide application of many ligand docking/screening algorithms is limited by the quality of the binding sites in theoretical receptor models constructed by protein structure prediction. In this study, we describe a new template-based method for the local refinement of ligand-binding regions in protein models using remotely related templates identified by threading. We designed a Support Vector Regression (SVR) model that selects correct binding site geometries in a large ensemble of multiple receptor conformations. The SVR model employs several scoring functions that impose geometrical restraints on the Cα positions, account for the specific chemical environment within a binding site and optimize the interactions with putative ligands. The SVR score is well correlated with the RMSD from the native structure; in 47% (70%) of the cases, the Pearson's correlation coefficient is >0.5 (>0.3). When applied to weakly homologous models, the average heavy atom, local RMSD from the native structure of the top-ranked (best of top five) binding site geometries is 3.1Å (2.9Å) for roughly half of the targets; this represents a 0.1 (0.3)Å average improvement over the original predicted structure. Focusing on the subset of strongly conserved residues, the average heavy atom RMSD is 2.6Å (2.3Å). Furthermore, we estimate the upper bound of template-based binding site refinement using only weakly related proteins to be ∼2.6Å RMSD. This value also corresponds to the plasticity of the ligand-binding regions in distant homologues. 
The Binding Site Refinement (BSR) approach is available to the scientific community as a web server that can be accessed at http://cssb.biology.gatech.edu/bsr/.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, Georgia Institute of Technology, Atlanta, GA 30318, USA
|
22
|
Danielson ML, Lill MA. New computational method for prediction of interacting protein loop regions. Proteins 2010; 78:1748-59. [PMID: 20186974 DOI: 10.1002/prot.22690] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/26/2022]
Abstract
Flexible loop regions of proteins play a crucial role in many biological functions such as protein-ligand recognition, enzymatic catalysis, and protein-protein association. To date, most computational methods that predict the conformational states of loops only focus on individual loop regions. However, loop regions are often spatially in close proximity to one another and their mutual interactions stabilize their conformations. We have developed a new method, titled CorLps, capable of simultaneously predicting such interacting loop regions. First, an ensemble of individual loop conformations is generated for each loop region. The members of the individual ensembles are combined and are accepted or rejected based on a steric clash filter. After a subsequent side-chain optimization step, the resulting conformations of the interacting loops are ranked by the statistical scoring function DFIRE that originated from protein structure prediction. Our results show that predicting interacting loops with CorLps is superior to sequential prediction of the two interacting loop regions, and our method is comparable in accuracy to single loop predictions. Furthermore, improved predictive accuracy of the top-ranked solution is achieved for 12-residue length loop regions by diversifying the initial pool of individual loop conformations using a quality threshold clustering algorithm.
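The combine-and-filter step described in this abstract can be sketched as follows. This is a simplified illustration, not the CorLps implementation: loop conformations are represented as lists of (x, y, z) atom coordinates, the 3.0 Å cutoff is a hypothetical value, and the later side-chain optimization and DFIRE ranking steps are omitted.

```python
from itertools import product
from math import dist


def has_clash(loop_a, loop_b, cutoff=3.0):
    """Return True if any atom of loop_a lies within `cutoff` of any atom of loop_b."""
    return any(dist(p, q) < cutoff for p in loop_a for q in loop_b)


def combine_loops(ensemble_a, ensemble_b, cutoff=3.0):
    """Pair every conformation of loop A with every conformation of loop B,
    keeping only the pairs that pass the steric clash filter."""
    return [
        (a, b)
        for a, b in product(ensemble_a, ensemble_b)
        if not has_clash(a, b, cutoff)
    ]


# Toy ensembles with one atom per conformation (hypothetical coordinates).
ensemble_a = [[(0.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]]
ensemble_b = [[(1.0, 0.0, 0.0)]]
print(len(combine_loops(ensemble_a, ensemble_b)))
```

The number of candidate pairs grows as the product of the ensemble sizes, which is one reason a cheap geometric filter is applied before any more expensive scoring.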
Affiliation(s)
- Matthew L Danielson
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, USA
|
23
|
Application of biasing-potential replica-exchange simulations for loop modeling and refinement of proteins in explicit solvent. Proteins 2010; 78:2809-19. [DOI: 10.1002/prot.22796] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
|
24
|
Yang YD, Spratt P, Chen H, Park C, Kihara D. Sub-AQUA: real-value quality assessment of protein structure models. Protein Eng Des Sel 2010; 23:617-32. [PMID: 20525730 DOI: 10.1093/protein/gzq030] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 12/22/2022] Open
Abstract
Computational protein tertiary structure prediction has made significant progress over the past years. However, most of the existing structure prediction methods are not equipped with functionality to predict the accuracy of constructed models. Knowing the accuracy of a structure model is crucial for its practical use since the accuracy determines potential applications of the model. Here we have developed quality assessment methods, which predict the real value of the global and local quality of protein structure models. The global quality of a model is defined as the root mean square deviation (RMSD) and the LGA score to its native structure. The local quality is defined as the distance between the corresponding Cα positions of a model and its native structure when they are superimposed. Three regression methods are employed to combine different types of quality assessment measures of models, including alignment-level scores, residue-position level scores, atomic-detailed structure level scores and composite scores. The regression models were tested on a large benchmark data set of template-based protein structure models of various qualities. In predicting RMSD and the LGA score, a combination of two terms, length-normalized SPAD, a score that assesses alignment stability by considering suboptimal alignments, and Verify3D normalized by the square of the model length shows a significant performance, achieving 97.1 and 83.6% accuracy in identifying models with an RMSD of <2 and 6 Å, respectively. For predicting the local quality of models, we find that a two-step approach, in which the global RMSD predicted in the first step is further combined with the other terms, can dramatically increase the accuracy. Finally, the developed regression equations are applied to assess the quality of structure models of the whole E. coli proteome.
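The two quality targets defined in this abstract, global RMSD and per-residue distance between corresponding alpha-carbon positions, can be computed as in the sketch below. It assumes the model and native coordinates have already been superimposed and are listed in the same residue order; the superposition step itself and the regression models of the paper are not shown, and the function names are illustrative.

```python
from math import dist, sqrt


def local_distances(model_ca, native_ca):
    """Per-residue distance between corresponding, already superimposed positions."""
    if len(model_ca) != len(native_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    return [dist(m, n) for m, n in zip(model_ca, native_ca)]


def rmsd(model_ca, native_ca):
    """Root mean square deviation over all corresponding positions."""
    distances = local_distances(model_ca, native_ca)
    return sqrt(sum(d * d for d in distances) / len(distances))


# Two-residue toy example with hypothetical coordinates (in angstroms).
model = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
native = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(local_distances(model, native))
print(rmsd(model, native))
```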
Affiliation(s)
- Yifeng David Yang
- Department of Biological Sciences, College of Science, Purdue University, West Lafayette, IN 47907, USA
|
25
|
Vorobjev YN. Blind docking method combining search of low-resolution binding sites with ligand pose refinement by molecular dynamics-based global optimization. J Comput Chem 2010; 31:1080-92. [PMID: 19821514 DOI: 10.1002/jcc.21394] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 12/28/2022]
Abstract
This study describes the development of a new blind hierarchical docking method, bhDock, its implementation, and accuracy assessment. The bhDock method uses a two-step algorithm. First, a comprehensive set of low-resolution binding sites is determined by analyzing the entire protein surface and ranked by a simple score function. Second, the ligand position is determined via a molecular dynamics-based method of global optimization starting from a small set of highly ranked low-resolution binding sites. The refinement of the ligand binding pose starts from uniformly distributed multiple initial ligand orientations and uses simulated annealing molecular dynamics coupled with guided force-field deformation of protein-ligand interactions to find the global minimum. Assessment of the bhDock method on a set of 37 protein-ligand complexes has shown a prediction success rate of 78%, which is better than the rate reported for the most cited docking methods, such as AutoDock, DOCK, GOLD, and FlexX, on the same set of complexes.
Affiliation(s)
- Yury N Vorobjev
- Institute of Chemical Biology and Fundamental Medicine of the Siberian Branch of the Russian Academy of Science, Novosibirsk, Russia.
|
26
|
Anishkin A, Milac AL, Guy HR. Symmetry-restrained molecular dynamics simulations improve homology models of potassium channels. Proteins 2010; 78:932-49. [PMID: 19902533 DOI: 10.1002/prot.22618] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 01/05/2023]
Abstract
Most crystallized homo-oligomeric ion channels are highly symmetric, which dramatically decreases conformational space and facilitates building homology models (HMs). However, in molecular dynamics (MD) simulations channels deviate from ideal symmetry and accumulate thermal defects, which complicate the refinement of HMs using MD. In this work we evaluate the ability of symmetry-constrained MD simulations to improve HM accuracy, using an approach conceptually similar to the Critical Assessment of techniques for protein Structure Prediction (CASP) competition: build HMs of channels with known structure and evaluate the efficiency of proposed methods in improving HM accuracy (measured as deviation from experimental structure). Results indicate that unrestrained MD does not improve the accuracy of HMs, instantaneous symmetrization improves accuracy but not stability of HMs during subsequent unrestrained MD, while gradually imposing symmetry constraints improves both accuracy (by 5-50%) and stability of HMs. Moreover, accuracy and stability are strongly correlated, making stability a reliable criterion in predicting the accuracy of new HMs.
Affiliation(s)
- Andriy Anishkin
- Department of Biology, University of Maryland, College Park, Maryland 20742, USA
|
27
|
MacCallum JL, Hua L, Schnieders MJ, Pande VS, Jacobson MP, Dill KA. Assessment of the protein-structure refinement category in CASP8. Proteins 2010; 77 Suppl 9:66-80. [PMID: 19714776 DOI: 10.1002/prot.22538] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.6] [Indexed: 12/26/2022]
Abstract
Here, we summarize the assessment of protein structure refinement in CASP8. Twenty-four groups refined a total of 12 target proteins. Averaging over all groups and all proteins, there was no net improvement over the original starting models. However, there are now some individual research groups who consistently do improve protein structures relative to a starting model. We compare various measures of quality assessment, including (i) standard backbone-based methods, (ii) new methods from the Richardson group, and (iii) ensemble-based methods for comparing experimental structures, such as NMR NOE violations and the suitability of the predicted models to serve as templates for molecular replacement. On the whole, there is a general correlation among various measures. However, there are interesting differences. Sometimes a structure that is in better agreement with the experimental data is judged to be slightly worse by GDT-TS. This suggests that for comparing protein structures that are already quite close to the native, it may be preferable to use ensemble-based experimentally derived measures of quality, in addition to single-structure-based methods such as GDT-TS.
Affiliation(s)
- Justin L MacCallum
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California 94158, USA
|
28
|
Subramani A, DiMaggio PA, Floudas CA. Selecting high quality protein structures from diverse conformational ensembles. Biophys J 2009; 97:1728-36. [PMID: 19751678 DOI: 10.1016/j.bpj.2009.06.046] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 04/13/2009] [Revised: 06/15/2009] [Accepted: 06/30/2009] [Indexed: 01/01/2023] Open
Abstract
Protein structure prediction encompasses two major challenges: 1), the generation of a large ensemble of high resolution structures for a given amino-acid sequence; and 2), the identification of the structure closest to the native structure for a blind prediction. In this article, we address the second challenge, by proposing what is, to our knowledge, a novel iterative traveling-salesman problem-based clustering method to identify the structures of a protein, in a given ensemble, which are closest to the native structure. The method consists of an iterative procedure, which aims at eliminating clusters of structures at each iteration, which are unlikely to be of similar fold to the native, based on a statistical analysis of cluster density and average spherical radius. The method, denoted as ICON, has been tested on four data sets: 1), 1400 proteins with high resolution decoys; 2), medium-to-low resolution decoys from Decoys 'R' Us; 3), medium-to-low resolution decoys from the first-principles approach, ASTRO-FOLD; and 4), selected targets from CASP8. The extensive tests demonstrate that ICON can identify high-quality structures in each ensemble, regardless of the resolution of conformers. In a total of 1454 proteins, with an average of 1051 conformers per protein, the conformers selected by ICON are, on an average, in the top 3.5% of the conformers in the ensemble.
Affiliation(s)
- Ashwin Subramani
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey, USA
|
29
|
Aloy P, Oliva B. Splitting statistical potentials into meaningful scoring functions: testing the prediction of near-native structures from decoy conformations. BMC Structural Biology 2009; 9:71. [PMID: 19917096 PMCID: PMC2783033 DOI: 10.1186/1472-6807-9-71] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 03/24/2009] [Accepted: 11/16/2009] [Indexed: 11/20/2022]
Abstract
Background: Recent advances in high-throughput technologies have produced a vast amount of protein sequences, while the number of high-resolution structures has seen a limited increase. This has impelled the production of many strategies to build protein structures from their sequences, generating a considerable amount of alternative models. The selection of the closest model to the native conformation has thus become crucial for structure prediction. Several methods have been developed to score protein models by energies, knowledge-based potentials and combinations of both. Results: Here, we present and demonstrate a theory to split the knowledge-based potentials into biologically meaningful scoring terms and to combine them in new scores to predict near-native structures. Our strategy allows circumventing the problem of defining the reference state. In this approach we give the proof for a simple and linear application that can be further improved by optimizing the combination of Zscores. Using the simplest composite score we obtained predictions similar to state-of-the-art methods. Besides, our approach has the advantage of identifying the most relevant terms involved in the stability of the protein structure. Finally, we also use the composite Zscores to assess the conformation of models and to detect local errors. Conclusion: We have introduced a method to split knowledge-based potentials and to solve the problem of defining a reference state. The new scores have detected near-native structures as accurately as state-of-the-art methods and have been successful in identifying wrongly modeled regions of many near-native conformations.
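The idea of combining per-term Zscores into a composite score can be illustrated with a minimal sketch. It assumes an unweighted sum of Z-scores computed across a decoy set, with lower values treated as better; the term names, the equal weighting, and the handling of constant terms are assumptions for illustration and do not reproduce the composite score defined in the paper.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one scoring term across all decoys of a protein."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Assumption: a term that is constant across decoys carries no signal.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_z(term_values):
    """Sum the Z-scores of every term, decoy by decoy (equal weights assumed)."""
    per_term = [z_scores(vals) for vals in term_values.values()]
    return [sum(column) for column in zip(*per_term)]


def best_decoy(term_values):
    """Index of the decoy with the lowest composite Z-score."""
    totals = composite_z(term_values)
    return min(range(len(totals)), key=totals.__getitem__)


# Hypothetical term energies for three decoys.
terms = {"pair": [-3.0, -1.0, -2.0], "solvation": [-6.0, -2.0, -4.0]}
print(best_decoy(terms))
```

Standardizing each term before summing keeps a term with a large numerical range from dominating the composite, which is also what makes the individual contributions comparable.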
Affiliation(s)
- Patrick Aloy
- Institut de Recerca Biomèdica and Barcelona Supercomputing Center, 10-12 08028 Barcelona, Catalonia, Spain.
|
30
|
Arnautova YA, Vorobjev YN, Vila JA, Scheraga HA. Identifying native-like protein structures with scoring functions based on all-atom ECEPP force fields, implicit solvent models and structure relaxation. Proteins 2009; 77:38-51. [PMID: 19384995 DOI: 10.1002/prot.22414] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/07/2022]
Abstract
Availability of energy functions which can discriminate native-like from non-native protein conformations is crucial for theoretical protein structure prediction and refinement of low-resolution protein models. This article reports the results of benchmark tests for scoring functions based on two all-atom ECEPP force fields, that is, ECEPP/3 and ECEPP05, and two implicit solvent models for a large set of protein decoys. The following three scoring functions are considered: (i) ECEPP05 plus a solvent-accessible surface area model with the parameters optimized with a set of protein decoys (ECEPP05/SA); (ii) ECEPP/3 plus the solvent-accessible surface area model of Ooi et al. (Proc Natl Acad Sci USA 1987;84:3086-3090) (ECEPP3/OONS); and (iii) ECEPP05 plus an implicit solvent model based on a solution of the Poisson equation with an optimized Fast Adaptive Multigrid Boundary Element (FAMBEpH) method (ECEPP05/FAMBEpH). Short Monte Carlo-with-Minimization (MCM) simulations, following local energy minimization, are used as a scoring method with ECEPP05/SA and ECEPP3/OONS potentials, whereas energy calculation is used with ECEPP05/FAMBEpH. The performance of each scoring function is evaluated by examining its ability to distinguish between native-like and non-native protein structures. The results of the tests show that the new ECEPP05/SA scoring function represents a significant improvement over the earlier ECEPP3/OONS version of the force field. Thus, it is able to rank native-like structures with Cα root-mean-square-deviations below 3.5 Å as lowest-energy conformations for 76% and within the top 10 for 87% of the proteins tested, compared with 69 and 80%, respectively, for ECEPP3/OONS. The use of the FAMBEpH solvation model, which provides a more accurate description of the protein-solvent interactions, improves the discriminative ability of the scoring function to 89%.
All failed tests in which the native-like structures cannot be discriminated as those with low energy, are due to omission of protein-protein interactions. The results of this study represent a benchmark in force-field development, and may be useful for evaluation of the performance of different force fields.
Affiliation(s)
- Yelena A Arnautova
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca New York 14853-1301, USA
|
31
|
|
32
|
Zhang Y. Protein structure prediction: when is it useful? Curr Opin Struct Biol 2009; 19:145-55. [PMID: 19327982 PMCID: PMC2673339 DOI: 10.1016/j.sbi.2009.02.005] [Citation(s) in RCA: 191] [Impact Index Per Article: 12.7] [Received: 11/14/2008] [Revised: 02/18/2009] [Accepted: 02/19/2009] [Indexed: 10/21/2022]
Abstract
Computationally predicted three-dimensional structures of protein molecules have demonstrated their usefulness in many areas of biomedicine, ranging from approximate family assignments to precise drug screening. For nearly 40 years, however, the accuracy of the predicted models has been dictated by the availability of close structural templates. Progress has recently been achieved in refining low-resolution models closer to the native ones; this has been made possible by combining knowledge-based information from multiple sources of structural templates as well as by improving the energy funnel of physics-based force fields. Unfortunately, there has been no essential progress in the development of techniques for detecting remotely homologous templates and for predicting novel protein structures.
Affiliation(s)
- Yang Zhang
- Center for Bioinformatics and Department of Molecular Biosciences, University of Kansas, 2030 Becker Drive, Lawrence, KS 66047, USA.
|
33
|
Arnautova YA, Scheraga HA. Use of decoys to optimize an all-atom force field including hydration. Biophys J 2008; 95:2434-49. [PMID: 18502794 PMCID: PMC2517034 DOI: 10.1529/biophysj.108.133587] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 03/14/2008] [Accepted: 05/07/2008] [Indexed: 11/18/2022] Open
Abstract
A novel method of parameter optimization is proposed. It makes use of large sets of decoys generated for six nonhomologous proteins with different architecture. Parameter optimization is achieved by creating a free energy gap between sets of nativelike and nonnative conformations. The method is applied to optimize the parameters of a physics-based scoring function consisting of the all-atom ECEPP05 force field coupled with an implicit solvent model (a solvent-accessible surface area model). The optimized force field is able to discriminate near-native from nonnative conformations of the six training proteins when used either for local energy minimization or for short Monte Carlo simulated annealing runs after local energy minimization. The resulting force field is validated with an independent set of six nonhomologous proteins, and appears to be transferable to proteins not included in the optimization; i.e., for five out of the six test proteins, decoys with 1.7- to 4.0-Å all-heavy-atom root mean-square deviations emerge as those with the lowest energy. In addition, we examined the set of misfolded structures created by Park and Levitt using a four-state reduced model. The results from these additional calculations confirm the good discriminative ability of the optimized force field obtained with our decoy sets.
Affiliation(s)
- Yelena A Arnautova
- Department of Chemistry and Chemical Biology, Baker Laboratory, Cornell University, Ithaca, New York 14853-1301, USA
|
34
|
Protein model refinement using an optimized physics-based all-atom force field. Proc Natl Acad Sci U S A 2008; 105:8268-73. [PMID: 18550813 DOI: 10.1073/pnas.0800054105] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022] Open
Abstract
One of the greatest challenges in protein structure prediction is the refinement of low-resolution predicted models to high-resolution structures that are close to the native state. Although contemporary structure prediction methods can assemble the correct topology for a large fraction of protein domains, such approximate models are often not of the resolution required for many important applications, including studies of reaction mechanisms and virtual ligand screening. Thus, the development of a method that could bring those structures closer to the native state is of great importance. We recently optimized the relative weights of the components of the Amber ff03 potential on a large set of decoy structures to create a funnel-shaped energy landscape with the native structure at the global minimum. Such an energy function might be able to drive proteins toward their native structure. In this work, for a test set of 47 proteins, with 100 decoy structures per protein that have a range of structural similarities to the native state, we demonstrate that our optimized potential can drive protein models closer to their native structure. Comparing the lowest-energy structure from each trajectory with the starting decoy, structural improvement is seen for 70% of the models on average. The ability to do such systematic structural refinements by using a physics-based all-atom potential represents a promising approach to high-resolution structure prediction.
|