Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Qiu J, Elber R. Atomically detailed potentials to recognize native and approximate protein structures. Proteins 2006;61:44-55. [PMID: 16080157 DOI: 10.1002/prot.20585] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

For:	Qiu J, Elber R. Atomically detailed potentials to recognize native and approximate protein structures. Proteins 2006;61:44-55. [PMID: 16080157 DOI: 10.1002/prot.20585] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Number

Cited by Other Article(s)

Binette V, Mousseau N, Tuffery P. A Generalized Attraction-Repulsion Potential and Revisited Fragment Library Improves PEP-FOLD Peptide Structure Prediction. J Chem Theory Comput 2022;18:2720-2736. [PMID: 35298162 DOI: 10.1021/acs.jctc.1c01293] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Abstract

Fast and accurate structure prediction is essential to the study of peptide function, molecular targets, and interactions and has been the subject of considerable efforts in the past decade. In this work, we present improvements to the popular simplified PEP-FOLD technique for small peptide structure prediction. PEP-FOLD originality is threefold: (i) it uses a predetermined structural alphabet, (ii) it uses a sequential algorithm to reconstruct the tridimensional structures of these peptides in a discrete space using a fragment library, and (iii) it assesses the energy of these structures using a coarse-grained representation in which all of the backbone atoms but the α-hydrogen are present, and the side chain corresponds to a unique bead. In former versions of PEP-FOLD, a van der Waals formulation was used for non-bonded interactions, with each side chain being associated with a fixed radius. Here, we explore the relevance of using instead a generalized formulation in which not only the optimal distance of interaction and the energy at this distance are parameters but also the distance at which the potential is zero. This allows each side chain to be associated with a different radius and potential energy shape, depending on its interaction partner, and in principle to make more effective the coarse-grained representation. In addition, the new PEP-FOLD version is associated with an updated library of fragments. We show that these modifications lead to important improvements for many of the problematic targets identified with the former PEP-FOLD version while maintaining already correct predictions. The improvement is in terms of both model ranking and model accuracy. We also compare the PEP-FOLD enhanced version to state-of-the-art techniques for both peptide and structure predictions: APPTest, RaptorX, and AlphaFold2. We find that the new predictions are superior, in particular with respect to the prediction of small β-targets, to those of APPTest and RaptorX and bring, with its original approach, additional understanding on folded structures, even when less precise than AlphaFold2. With their strong physical influence, the revised structural library and coarse-grained potential offer, however, the means for a deeper understanding of the nature of folding and open a solid basis for studying flexibility and other dynamical properties not accessible to IA structure prediction approaches.

Collapse

Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018;53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]

Convex-PL: a novel knowledge-based potential for protein-ligand interactions deduced from structural databases using convex optimization. J Comput Aided Mol Des 2017;31:943-958. [DOI: 10.1007/s10822-017-0068-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2017] [Accepted: 09/08/2017] [Indexed: 12/16/2022]

Neveu E, Ritchie DW, Popov P, Grudinin S. PEPSI-Dock: a detailed data-driven protein-protein interaction potential accelerated by polar Fourier correlation. Bioinformatics 2017;32:i693-i701. [PMID: 27587691 DOI: 10.1093/bioinformatics/btw443] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023] Open

Abstract

MOTIVATION

Docking prediction algorithms aim to find the native conformation of a complex of proteins from knowledge of their unbound structures. They rely on a combination of sampling and scoring methods, adapted to different scales. Polynomial Expansion of Protein Structures and Interactions for Docking (PEPSI-Dock) improves the accuracy of the first stage of the docking pipeline, which will sharpen up the final predictions. Indeed, PEPSI-Dock benefits from the precision of a very detailed data-driven model of the binding free energy used with a global and exhaustive rigid-body search space. As well as being accurate, our computations are among the fastest by virtue of the sparse representation of the pre-computed potentials and FFT-accelerated sampling techniques. Overall, this is the first demonstration of a FFT-accelerated docking method coupled with an arbitrary-shaped distance-dependent interaction potential.

RESULTS

First, we present a novel learning process to compute data-driven distant-dependent pairwise potentials, adapted from our previous method used for rescoring of putative protein-protein binding poses. The potential coefficients are learned by combining machine-learning techniques with physically interpretable descriptors. Then, we describe the integration of the deduced potentials into a FFT-accelerated spherical sampling provided by the Hex library. Overall, on a training set of 163 heterodimers, PEPSI-Dock achieves a success rate of 91% mid-quality predictions in the top-10 solutions. On a subset of the protein docking benchmark v5, it achieves 44.4% mid-quality predictions in the top-10 solutions when starting from bound structures and 20.5% when starting from unbound structures. The method runs in 5-15 min on a modern laptop and can easily be extended to other types of interactions.

AVAILABILITY AND IMPLEMENTATION

https://team.inria.fr/nano-d/software/PEPSI-Dock

CONTACT

sergei.grudinin@inria.fr.

Collapse

Jing X, Dong Q. MQAPRank: improved global protein model quality assessment by learning-to-rank. BMC Bioinformatics 2017;18:275. [PMID: 28545390 PMCID: PMC5445322 DOI: 10.1186/s12859-017-1691-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2017] [Accepted: 05/16/2017] [Indexed: 11/10/2022] Open

Grudinin S, Kadukova M, Eisenbarth A, Marillet S, Cazals F. Predicting binding poses and affinities for protein - ligand complexes in the 2015 D3R Grand Challenge using a physical model with a statistical parameter estimation. J Comput Aided Mol Des 2016;30:791-804. [DOI: 10.1007/s10822-016-9976-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2016] [Accepted: 09/19/2016] [Indexed: 12/14/2022]

Jing X, Wang K, Lu R, Dong Q. Sorting protein decoys by machine-learning-to-rank. Sci Rep 2016;6:31571. [PMID: 27530967 PMCID: PMC4987638 DOI: 10.1038/srep31571] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2016] [Accepted: 07/26/2016] [Indexed: 11/18/2022] Open

Grudinin S, Popov P, Neveu E, Cheremovskiy G. Predicting Binding Poses and Affinities in the CSAR 2013–2014 Docking Exercises Using the Knowledge-Based Convex-PL Potential. J Chem Inf Model 2015;56:1053-62. [DOI: 10.1021/acs.jcim.5b00339] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023]

Popov P, Grudinin S. Knowledge of Native Protein–Protein Interfaces Is Sufficient To Construct Predictive Models for the Selection of Binding Candidates. J Chem Inf Model 2015;55:2242-55. [DOI: 10.1021/acs.jcim.5b00372] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Chae MH, Krull F, Knapp EW. Optimized distance-dependent atom-pair-based potential DOOP for protein structure prediction. Proteins 2015;83:881-90. [PMID: 25693513 DOI: 10.1002/prot.24782] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2014] [Revised: 02/06/2015] [Accepted: 02/10/2015] [Indexed: 12/20/2022]

Abstract

The DOcking decoy-based Optimized Potential (DOOP) energy function for protein structure prediction is based on empirical distance-dependent atom-pair interactions. To optimize the atom-pair interactions, native protein structures are decomposed into polypeptide chain segments that correspond to structural motives involving complete secondary structure elements. They constitute near native ligand-receptor systems (or just pairs). Thus, a total of 8609 ligand-receptor systems were prepared from 954 selected proteins. For each of these hypothetical ligand-receptor systems, 1000 evenly sampled docking decoys with 0-10 Å interface root-mean-square-deviation (iRMSD) were generated with a method used before for protein-protein docking. A neural network-based optimization method was applied to derive the optimized energy parameters using these decoys so that the energy function mimics the funnel-like energy landscape for the interaction between these hypothetical ligand-receptor systems. Thus, our method hierarchically models the overall funnel-like energy landscape of native protein structures. The resulting energy function was tested on several commonly used decoy sets for native protein structure recognition and compared with other statistical potentials. In combination with a torsion potential term which describes the local conformational preference, the atom-pair-based potential outperforms other reported statistical energy functions in correct ranking of native protein structures for a variety of decoy sets. This is especially the case for the most challenging ROSETTA decoy set, although it does not take into account side chain orientation-dependence explicitly. The DOOP energy function for protein structure prediction, the underlying database of protein structures with hypothetical ligand-receptor systems and their decoys are freely available at http://agknapp.chemie.fu-berlin.de/doop/.

Collapse

Huang SY, Zou X. ITScorePro: an efficient scoring program for evaluating the energy scores of protein structures for structure prediction. Methods Mol Biol 2014;1137:71-81. [PMID: 24573475 PMCID: PMC11121506 DOI: 10.1007/978-1-4939-0366-5_6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/25/2023]

Dong GQ, Fan H, Schneidman-Duhovny D, Webb B, Sali A. Optimized atomic statistical potentials: assessment of protein interfaces and loops. Bioinformatics 2013;29:3158-66. [PMID: 24078704 PMCID: PMC3842762 DOI: 10.1093/bioinformatics/btt560] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2013] [Revised: 08/13/2013] [Accepted: 09/22/2013] [Indexed: 01/16/2023] Open

Abstract

MOTIVATION

Statistical potentials have been widely used for modeling whole proteins and their parts (e.g. sidechains and loops) as well as interactions between proteins, nucleic acids and small molecules. Here, we formulate the statistical potentials entirely within a statistical framework, avoiding questionable statistical mechanical assumptions and approximations, including a definition of the reference state.

RESULTS

We derive a general Bayesian framework for inferring statistically optimized atomic potentials (SOAP) in which the reference state is replaced with data-driven 'recovery' functions. Moreover, we restrain the relative orientation between two covalent bonds instead of a simple distance between two atoms, in an effort to capture orientation-dependent interactions such as hydrogen bonds. To demonstrate this general approach, we computed statistical potentials for protein-protein docking (SOAP-PP) and loop modeling (SOAP-Loop). For docking, a near-native model is within the top 10 scoring models in 40% of the PatchDock benchmark cases, compared with 23 and 27% for the state-of-the-art ZDOCK and FireDock scoring functions, respectively. Similarly, for modeling 12-residue loops in the PLOP benchmark, the average main-chain root mean square deviation of the best scored conformations by SOAP-Loop is 1.5 Å, close to the average root mean square deviation of the best sampled conformations (1.2 Å) and significantly better than that selected by Rosetta (2.1 Å), DFIRE (2.3 Å), DOPE (2.5 Å) and PLOP scoring functions (3.0 Å). Our Bayesian framework may also result in more accurate statistical potentials for additional modeling applications, thus affording better leverage of the experimentally determined protein structures.

AVAILABILITY AND IMPLEMENTATION

SOAP-PP and SOAP-Loop are available as part of MODELLER (http://salilab.org/modeller).

Collapse

Kauffman C, Karypis G. Coarse- and fine-grained models for proteins: Evaluation by decoy discrimination. Proteins 2013. [DOI: 10.1002/prot.24222] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]

Røgen P, Koehl P. Extracting knowledge from protein structure geometry. Proteins 2013;81:841-51. [PMID: 23280479 DOI: 10.1002/prot.24242] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2012] [Revised: 11/28/2012] [Accepted: 12/08/2012] [Indexed: 11/06/2022]

Viswanath S, Ravikant DVS, Elber R. Improving ranking of models for protein complexes with side chain modeling and atomic potentials. Proteins 2012. [PMID: 23180599 DOI: 10.1002/prot.24214] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Subramani A, Wei Y, Floudas CA. ASTRO-FOLD 2.0: an Enhanced Framework for Protein Structure Prediction. AIChE J 2012;58:1619-1637. [PMID: 23049093 DOI: 10.1002/aic.12669] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]

Ravikant DVS, Elber R. Energy design for protein-protein interactions. J Chem Phys 2012;135:065102. [PMID: 21842951 DOI: 10.1063/1.3615722] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Fan H, Schneidman-Duhovny D, Irwin JJ, Dong G, Shoichet BK, Sali A. Statistical potential for modeling and ranking of protein-ligand interactions. J Chem Inf Model 2011;51:3078-92. [PMID: 22014038 DOI: 10.1021/ci200377u] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]

Abstract

Applications in structural biology and medicinal chemistry require protein-ligand scoring functions for two distinct tasks: (i) ranking different poses of a small molecule in a protein binding site and (ii) ranking different small molecules by their complementarity to a protein site. Using probability theory, we developed two atomic distance-dependent statistical scoring functions: PoseScore was optimized for recognizing native binding geometries of ligands from other poses and RankScore was optimized for distinguishing ligands from nonbinding molecules. Both scores are based on a set of 8,885 crystallographic structures of protein-ligand complexes but differ in the values of three key parameters. Factors influencing the accuracy of scoring were investigated, including the maximal atomic distance and non-native ligand geometries used for scoring, as well as the use of protein models instead of crystallographic structures for training and testing the scoring function. For the test set of 19 targets, RankScore improved the ligand enrichment (logAUC) and early enrichment (EF(1)) scores computed by DOCK 3.6 for 13 and 14 targets, respectively. In addition, RankScore performed better at rescoring than each of seven other scoring functions tested. Accepting both the crystal structure and decoy geometries with all-atom root-mean-square errors of up to 2 Å from the crystal structure as correct binding poses, PoseScore gave the best score to a correct binding pose among 100 decoys for 88% of all cases in a benchmark set containing 100 protein-ligand complexes. PoseScore accuracy is comparable to that of DrugScore(CSD) and ITScore/SE and superior to 12 other tested scoring functions. Therefore, RankScore can facilitate ligand discovery, by ranking complexes of the target with different small molecules; PoseScore can be used for protein-ligand complex structure prediction, by ranking different conformations of a given protein-ligand pair. The statistical potentials are available through the Integrative Modeling Platform (IMP) software package (http://salilab.org/imp) and the LigScore Web server (http://salilab.org/ligscore/).

Collapse

Huang SY, Zou X. Statistical mechanics-based method to extract atomic distance-dependent potentials from protein structures. Proteins 2011;79:2648-61. [PMID: 21732421 PMCID: PMC11108592 DOI: 10.1002/prot.23086] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2011] [Revised: 04/21/2011] [Accepted: 05/09/2011] [Indexed: 12/25/2022]

Dong Q, Zhou S. Novel nonlinear knowledge-based mean force potentials based on machine learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011;8:476-486. [PMID: 20820079 DOI: 10.1109/tcbb.2010.86] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]

Abstract

The prediction of 3D structures of proteins from amino acid sequences is one of the most challenging problems in molecular biology. An essential task for solving this problem with coarse-grained models is to deduce effective interaction potentials. The development and evaluation of new energy functions is critical to accurately modeling the properties of biological macromolecules. Knowledge-based mean force potentials are derived from statistical analysis of proteins of known structures. Current knowledge-based potentials are almost in the form of weighted linear sum of interaction pairs. In this study, a class of novel nonlinear knowledge-based mean force potentials is presented. The potential parameters are obtained by nonlinear classifiers, instead of relative frequencies of interaction pairs against a reference state or linear classifiers. The support vector machine is used to derive the potential parameters on data sets that contain both native structures and decoy structures. Five knowledge-based mean force Boltzmann-based or linear potentials are introduced and their corresponding nonlinear potentials are implemented. They are the DIH potential (single-body residue-level Boltzmann-based potential), the DFIRE-SCM potential (two-body residue-level Boltzmann-based potential), the FS potential (two-body atom-level Boltzmann-based potential), the HR potential (two-body residue-level linear potential), and the T32S3 potential (two-body atom-level linear potential). Experiments are performed on well-established decoy sets, including the LKF data set, the CASP7 data set, and the Decoys “R”Us data set. The evaluation metrics include the energy Z score and the ability of each potential to discriminate native structures from a set of decoy structures. Experimental results show that all nonlinear potentials significantly outperform the corresponding Boltzmann-based or linear potentials, and the proposed discriminative framework is effective in developing knowledge-based mean force potentials. The nonlinear potentials can be widely used for ab initio protein structure prediction, model quality assessment, protein docking, and other challenging problems in computational biology.

Collapse

Rykunov D, Fiser A. New statistical potential for quality assessment of protein models and a survey of energy functions. BMC Bioinformatics 2010;11:128. [PMID: 20226048 PMCID: PMC2853469 DOI: 10.1186/1471-2105-11-128] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2009] [Accepted: 03/12/2010] [Indexed: 11/30/2022] Open

Buck PM, Bystroff C. Simulating protein folding initiation sites using an alpha-carbon-only knowledge-based force field. Proteins 2010;76:331-42. [PMID: 19137613 DOI: 10.1002/prot.22348] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Abstract

Protein folding is a hierarchical process where structure forms locally first, then globally. Some short sequence segments initiate folding through strong structural preferences that are independent of their three-dimensional context in proteins. We have constructed a knowledge-based force field in which the energy functions are conditional on local sequence patterns, as expressed in the hidden Markov model for local structure (HMMSTR). Carbon-alpha force field (CALF) builds sequence specific statistical potentials based on database frequencies for alpha-carbon virtual bond opening and dihedral angles, pair-wise contacts and hydrogen bond donor-acceptor pairs, and simulates folding via Brownian dynamics. We introduce hydrogen bond donor and acceptor potentials as alpha-carbon probability fields that are conditional on the predicted local sequence. Constant temperature simulations were carried out using 27 peptides selected as putative folding initiation sites, each 12 residues in length, representing several different local structure motifs. Each 0.6 micros trajectory was clustered based on structure. Simulation convergence or representativeness was assessed by subdividing trajectories and comparing clusters. For 21 of the 27 sequences, the largest cluster made up more than half of the total trajectory. Of these 21 sequences, 14 had cluster centers that were at most 2.6 A root mean square deviation (RMSD) from their native structure in the corresponding full-length protein. To assess the adequacy of the energy function on nonlocal interactions, 11 full length native structures were relaxed using Brownian dynamics simulations. Equilibrated structures deviated from their native states but retained their overall topology and compactness. A simple potential that folds proteins locally and stabilizes proteins globally may enable a more realistic understanding of hierarchical folding pathways.

Collapse

Subramani A, DiMaggio PA, Floudas CA. Selecting high quality protein structures from diverse conformational ensembles. Biophys J 2009;97:1728-36. [PMID: 19751678 DOI: 10.1016/j.bpj.2009.06.046] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2009] [Revised: 06/15/2009] [Accepted: 06/30/2009] [Indexed: 01/01/2023] Open

Vallat BK, Pillardy J, Májek P, Meller J, Blom T, Cao B, Elber R. Building and assessing atomic models of proteins from structural templates: learning and benchmarks. Proteins 2009;76:930-45. [PMID: 19326457 PMCID: PMC2719020 DOI: 10.1002/prot.22401] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Cohen M, Potapov V, Schreiber G. Four distances between pairs of amino acids provide a precise description of their interaction. PLoS Comput Biol 2009;5:e1000470. [PMID: 19680437 PMCID: PMC2715887 DOI: 10.1371/journal.pcbi.1000470] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2009] [Accepted: 07/15/2009] [Indexed: 11/18/2022] Open

Maupetit J, Tuffery P, Derreumaux P. A coarse-grained protein force field for folding and structure prediction. Proteins 2009;69:394-408. [PMID: 17600832 DOI: 10.1002/prot.21505] [Citation(s) in RCA: 164] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

Betancourt MR. Another look at the conditions for the extraction of protein knowledge-based potentials. Proteins 2009;76:72-85. [PMID: 19089977 DOI: 10.1002/prot.22320] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Abstract

Protein knowledge-based potentials are effective free energies obtained from databases of known protein structures. They are used to parameterize coarse-grained protein models in many folding simulation and structure prediction methods. Two common approaches are used in the derivation of knowledge-based potentials. One assumes that the energy parameters optimize the native structure stability. The other assumes that interaction events are related to their energies according to the Boltzmann distribution, and that they are distributed independently of other events, that is, the quasi-chemical approximation. Here, these assumptions are systematically tested by extracting contact energies from artificial databases of lattice proteins with predefined pairwise contact energies. Databases of protein sequences are designed to either satisfy the Boltzmann distribution at high or low temperatures, or to simultaneously optimize the native stability and folding kinetics. It is found that the quasi-chemical approximation, with the ideal reference state, accurately reproduce the true energies for high temperature Boltzmann distributed sequences (weakly interacting residues), but less accurately at low temperatures, where the sequences correspond to energy minima and the residues are strongly interacting. To overcome this problem, an iterative procedure for Boltzmann distributed sequences is introduced, which accounts for interacting residue correlations and eliminates the need for the quasi-chemical approximation. In this case, the energies are accurately reproduced at any ensemble temperature. However, when the database of sequences designed for optimal stability and kinetics is used, the energy correlation is less than optimal using either method, exhibiting random and systematic deviations from linearity. Therefore, the assumption that native structures are maximally stable or that sequences are determined according to the Boltzmann distribution seems to be inadequate for obtaining accurate energies. The limited number of sequences in the database and the inhomogeneous concentration of amino acids from one structure to another do not seem to be major obstacles for improving the quality of the extracted pairwise energies, with the exception of repulsive interactions.

Collapse

Gu J, Li H, Jiang H, Wang X. A simple Calpha-SC potential with higher accuracy for protein fold recognition. Biochem Biophys Res Commun 2009;379:610-5. [PMID: 19121621 DOI: 10.1016/j.bbrc.2008.12.131] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2008] [Accepted: 12/20/2008] [Indexed: 11/18/2022]

Felts AK, Gallicchio E, Chekmarev D, Paris KA, Friesner RA, Levy RM. Prediction of Protein Loop Conformations using the AGBNP Implicit Solvent Model and Torsion Angle Sampling. J Chem Theory Comput 2008;4:855-868. [PMID: 18787648 DOI: 10.1021/ct800051k] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Rajgaria R, McAllister SR, Floudas CA. Distance dependent centroid to centroid force fields using high resolution decoys. Proteins 2008;70:950-70. [PMID: 17847088 DOI: 10.1002/prot.21561] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Lee SY, Skolnick J. Development and benchmarking of TASSER(iter) for the iterative improvement of protein structure predictions. Proteins 2007;68:39-47. [PMID: 17469193 DOI: 10.1002/prot.21440] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]

Abstract

To improve the accuracy of TASSER models especially in the limit where threading provided template alignments are of poor quality, we have developed the TASSER(iter) algorithm which uses the templates and contact restraints from TASSER generated models for iterative structure refinement. We apply TASSER(iter) to a large benchmark set of 2,773 nonhomologous single domain proteins that are < or = 200 in length and that cover the PDB at the level of 35% pairwise sequence identity. Overall, TASSER(iter) models have a smaller global average RMSD of 5.48 A compared to 5.81 A RMSD of the original TASSER models. Classifying the targets by the level of prediction difficulty (where Easy targets have a good template with a corresponding good threading alignment, Medium targets have a good template but a poor alignment, and Hard targets have an incorrectly identified template), TASSER(iter) (TASSER) models have an average RMSD of 4.15 A (4.35 A) for the Easy set and 9.05 A (9.52 A) for the Hard set. The largest reduction of average RMSD is for the Medium set where the TASSER(iter) models have an average global RMSD of 5.67 A compared to 6.72 A of the TASSER models. Seventy percent of the Medium set TASSER(iter) models have a smaller RMSD than the TASSER models, while 63% of the Easy and 60% of the Hard TASSER models are improved by TASSER(iter). For the foldable cases, where the targets have a RMSD to the native <6.5 A, TASSER(iter) shows obvious improvement over TASSER models: For the Medium set, it improves the success rate from 57.0 to 67.2%, followed by the Hard targets where the success rate improves from 32.0 to 34.8%, with the smallest improvement in the Easy targets from 82.6 to 84.0%. These results suggest that TASSER(iter) can provide more reliable predictions for targets of Medium difficulty, a range that had resisted improvement in the quality of protein structure predictions.

Collapse

Qiu J, Sheffler W, Baker D, Noble WS. Ranking predicted protein structures with support vector regression. Proteins 2007;71:1175-82. [PMID: 18004754 DOI: 10.1002/prot.21809] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]

Wu Y, Lu M, Chen M, Li J, Ma J. OPUS-Ca: a knowledge-based potential function requiring only Calpha positions. Protein Sci 2007;16:1449-63. [PMID: 17586777 PMCID: PMC2206690 DOI: 10.1110/ps.072796107] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Cheng J, Pei J, Lai L. A free-rotating and self-avoiding chain model for deriving statistical potentials based on protein structures. Biophys J 2007;92:3868-77. [PMID: 17351015 PMCID: PMC1868969 DOI: 10.1529/biophysj.106.102152] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Shen MY, Sali A. Statistical potential for assessment and prediction of protein structures. Protein Sci 2007;15:2507-24. [PMID: 17075131 PMCID: PMC2242414 DOI: 10.1110/ps.062416606] [Citation(s) in RCA: 1765] [Impact Index Per Article: 103.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]

Dong Q, Wang X, Lin L. Novel knowledge-based mean force potential at the profile level. BMC Bioinformatics 2006;7:324. [PMID: 16803615 PMCID: PMC1534065 DOI: 10.1186/1471-2105-7-324] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2006] [Accepted: 06/27/2006] [Indexed: 11/10/2022] Open

Abstract

Background

The development and testing of functions for the modeling of protein energetics is an important part of current research aimed at understanding protein structure and function. Knowledge-based mean force potentials are derived from statistical analyses of interacting groups in experimentally determined protein structures. Current knowledge-based mean force potentials are developed at the atom or amino acid level. The evolutionary information contained in the profiles is not investigated. Based on these observations, a class of novel knowledge-based mean force potentials at the profile level has been presented, which uses the evolutionary information of profiles for developing more powerful statistical potentials.

Results

The frequency profiles are directly calculated from the multiple sequence alignments outputted by PSI-BLAST and converted into binary profiles with a probability threshold. As a result, the protein sequences are represented as sequences of binary profiles rather than sequences of amino acids. Similar to the knowledge-based potentials at the residue level, a class of novel potentials at the profile level is introduced. We develop four types of profile-level statistical potentials including distance-dependent, contact, Φ/Ψ dihedral angle and accessible surface statistical potentials. These potentials are first evaluated by the fold assessment between the correct and incorrect models generated by comparative modeling from our own and other groups. They are then used to recognize the native structures from well-constructed decoy sets. Experimental results show that all the knowledge-base mean force potentials at the profile level outperform those at the residue level. Significant improvements are obtained for the distance-dependent and accessible surface potentials (5–6%). The contact and Φ/Ψ dihedral angle potential only get a slight improvement (1–2%). Decoy set evaluation results show that the distance-dependent profile-level potentials even outperform other atom-level potentials. We also demonstrate that profile-level statistical potentials can improve the performance of threading.

Conclusion

The knowledge-base mean force potentials at the profile level can provide better discriminatory ability than those at the residue level, so they will be useful for protein structure prediction and model refinement.

Collapse

Skolnick J. In quest of an empirical potential for protein structure prediction. Curr Opin Struct Biol 2006;16:166-71. [PMID: 16524716 DOI: 10.1016/j.sbi.2006.02.004] [Citation(s) in RCA: 112] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2005] [Revised: 02/10/2006] [Accepted: 02/23/2006] [Indexed: 11/19/2022]