1
All-Atom Four-Body Knowledge-Based Statistical Potentials to Distinguish Native Protein Structures from Nonnative Folds. BIOMED RESEARCH INTERNATIONAL 2017; 2017:5760612. [PMID: 29119109 PMCID: PMC5651141 DOI: 10.1155/2017/5760612] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/27/2017] [Revised: 08/13/2017] [Accepted: 08/23/2017] [Indexed: 02/05/2023]
Abstract
Recent advances in understanding protein folding have benefitted from coarse-grained representations of protein structures. Empirical energy functions derived from these techniques occasionally succeed in distinguishing native structures from their corresponding ensembles of nonnative folds or decoys which display varying degrees of structural dissimilarity to the native proteins. Here we utilized atomic coordinates of single protein chains, comprising a large diverse training set, to develop and evaluate twelve all-atom four-body statistical potentials obtained by exploring alternative values for a pair of inherent parameters. Delaunay tessellation was performed on the atomic coordinates of each protein to objectively identify all quadruplets of interacting atoms, and atomic potentials were generated via statistical analysis of the data and implementation of the inverted Boltzmann principle. Our potentials were evaluated using benchmarking datasets from Decoys-'R'-Us, and comparisons were made with twelve other physics- and knowledge-based potentials. Ranking 3rd, our best potential tied CHARMM19 and surpassed AMBER force field potentials. We illustrate how a generalized version of our potential can be used to empirically calculate binding energies for target-ligand complexes, using HIV-1 protease-inhibitor complexes for a practical application. The combined results suggest an accurate and efficient atomic four-body statistical potential for protein structure prediction and assessment.
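The inverted Boltzmann step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the atom quadruplets have already been identified (for example by a Delaunay tessellation) and are passed in as tuples of atom-type labels. The multinomial reference state, the natural logarithm, and the names `four_body_potential`, `quadruplets`, and `atom_counts` are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from math import factorial, log, prod


def four_body_potential(quadruplets, atom_counts):
    """Return {quadruplet type: -ln(f_obs / p_exp)} (illustrative sketch).

    quadruplets: iterable of 4-tuples of atom-type labels (hypothetical input,
                 e.g. the vertices of each Delaunay tetrahedron).
    atom_counts: {atom type: number of atoms of that type in the training set}.
    """
    total_atoms = sum(atom_counts.values())
    freq = {a: n / total_atoms for a, n in atom_counts.items()}

    # Order within a quadruplet is irrelevant, so sort the labels.
    observed = Counter(tuple(sorted(q)) for q in quadruplets)
    n_quads = sum(observed.values())

    potential = {}
    for quad, count in observed.items():
        f_obs = count / n_quads
        # Assumed reference state: multinomial probability of drawing this
        # composition at random from the overall atom-type frequencies.
        multiplicity = factorial(4)
        for k in Counter(quad).values():
            multiplicity //= factorial(k)
        p_exp = multiplicity * prod(freq[a] for a in quad)
        # Inverted Boltzmann: over-represented quadruplets get negative scores.
        potential[quad] = -log(f_obs / p_exp)
    return potential
```

Under these assumptions, a structure could be scored by summing the entries for every quadruplet it contains, with lower totals read as more native-like.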
2
Kmiecik S, Gront D, Kolinski M, Wieteska L, Dawid AE, Kolinski A. Coarse-Grained Protein Models and Their Applications. Chem Rev 2016; 116:7898-936. [DOI: 10.1021/acs.chemrev.6b00163] [Citation(s) in RCA: 555] [Impact Index Per Article: 69.4] [Indexed: 02/07/2023]
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Dominik Gront
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Michal Kolinski
- Bioinformatics Laboratory, Mossakowski Medical Research Center of the Polish Academy of Sciences, Pawinskiego 5, 02-106 Warsaw, Poland
- Lukasz Wieteska
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Department of Medical Biochemistry, Medical University of Lodz, Mazowiecka 6/8, 92-215 Lodz, Poland
- Andrzej Kolinski
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
3
SAKAE YOSHITAKE, OKAMOTO YUKO. PROTEIN FORCE-FIELD PARAMETERS OPTIMIZED WITH THE PROTEIN DATA BANK I: FORCE-FIELD OPTIMIZATIONS. JOURNAL OF THEORETICAL & COMPUTATIONAL CHEMISTRY 2011. [DOI: 10.1142/s0219633604001082] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
Abstract
We optimized five existing sets of force-field parameters for protein systems by our recently proposed method. The five force fields are AMBER parm94, AMBER parm96, AMBER parm99, CHARMM version 22, and OPLS-AA. The method consists of minimizing the sum of the square of the force acting on each atom in the proteins with the structures from the Protein Data Bank (PDB). We selected the partial-charge and backbone torsion-energy parameters for this optimization, and 100 molecules from the PDB were used. We gave detailed comparisons of the optimized force fields and found that there is a tendency of convergence towards the same function for the torsion-energy term.
Affiliation(s)
- YOSHITAKE SAKAE
- Department of Functional Molecular Science, The Graduate University for Advanced Studies, Okazaki, Aichi 444-8585, Japan
- Department of Theoretical Studies, Institute for Molecular Science, Okazaki, Aichi 444-8585, Japan
- YUKO OKAMOTO
- Department of Functional Molecular Science, The Graduate University for Advanced Studies, Okazaki, Aichi 444-8585, Japan
- Department of Theoretical Studies, Institute for Molecular Science, Okazaki, Aichi 444-8585, Japan
4
Zhang J, Chen R, Liang J. Potential function of simplified protein models for discriminating native proteins from decoys: combining contact interaction and local sequence-dependent geometry. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2007; 2004:2976-9. [PMID: 17270903 DOI: 10.1109/iembs.2004.1403844] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/13/2023]
Abstract
An effective potential function is critical for protein structure prediction and folding simulation. For simplified models of proteins where coordinates of only Cα atoms need to be specified, an accurate potential function is important. Such a simplified model is essential for efficient search of conformational space. In this work, we present a formulation of a potential function for simplified representations of protein structures. It is based on the combination of descriptors derived from residue-residue contact and sequence-dependent local geometry. The optimal weight coefficients for contact and local geometry are obtained through optimization by maximizing margins between native and decoy structures. The latter are generated by chain growth and by gapless threading. The performance of the potential function in a blind test of discriminating native protein structures from decoys is evaluated using several benchmark decoy sets. This potential function has comparable or better performance than several residue-based potential functions that additionally require coordinates of side chain centers or coordinates of all side chain atoms.
5
Zhang J, Chen R, Liang J. Empirical potential function for simplified protein models: combining contact and local sequence-structure descriptors. Proteins 2006; 63:949-60. [PMID: 16477624 DOI: 10.1002/prot.20809] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Indexed: 11/07/2022]
Abstract
An effective potential function is critical for protein structure prediction and folding simulation. Simplified protein models such as those requiring only Cα or backbone atoms are attractive because they enable efficient search of the conformational space. We show residue-specific reduced discrete-state models can represent the backbone conformations of proteins with small RMSD values. However, no potential functions exist that are designed for such simplified protein models. In this study, we develop optimal potential functions by combining contact interaction descriptors and local sequence-structure descriptors. The form of the potential function is a weighted linear sum of all descriptors, and the optimal weight coefficients are obtained through optimization using both native and decoy structures. The performance of the potential function in a test of discriminating native protein structures from decoys is evaluated using several benchmark decoy sets. Our potential function, requiring only backbone atoms or Cα atoms, has comparable or better performance than several residue-based potential functions that require additional coordinates of side-chain centers or coordinates of all side-chain atoms. By reducing the residue alphabets down to size 10 for contact descriptors, the performance of the potential function can be further improved. Our results also suggest that local sequence-structure correlation may play an important role in reducing the entropic cost of protein folding.
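The weighted linear sum described in this abstract can be illustrated with a toy sketch. The perceptron-style update below is a stand-in chosen for brevity, not the optimization method used in the paper, and the names `linear_score`, `fit_weights`, `margin`, and `learning_rate` are hypothetical. The convention assumed here is that a lower score means a more native-like structure.

```python
def linear_score(weights, descriptors):
    """Potential as a weighted linear sum of descriptor values."""
    return sum(w * d for w, d in zip(weights, descriptors))


def fit_weights(training_sets, n_features, margin=1.0, learning_rate=0.1, epochs=100):
    """Adjust weights until each native scores below its decoys by `margin`.

    training_sets: list of (native_descriptors, [decoy_descriptors, ...]).
    This is a simple perceptron-style rule used only for illustration.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        updated = False
        for native, decoys in training_sets:
            for decoy in decoys:
                gap = linear_score(weights, decoy) - linear_score(weights, native)
                if gap < margin:
                    # Move the weights so the decoy scores higher than the native.
                    weights = [
                        w + learning_rate * (d - n)
                        for w, d, n in zip(weights, decoy, native)
                    ]
                    updated = True
        if not updated:
            break  # every native/decoy pair already satisfies the margin
    return weights
```

Because the score is linear in the weights, each native/decoy pair contributes one linear inequality, which is what makes margin-based fitting straightforward for this functional form.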
Affiliation(s)
- Jinfeng Zhang
- Department of Bioengineering, University of Illinois, Chicago, Illinois, USA
6
Li X, Liang J. Geometric cooperativity and anticooperativity of three-body interactions in native proteins. Proteins 2005; 60:46-65. [PMID: 15849756 DOI: 10.1002/prot.20438] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 11/10/2022]
Abstract
Characterizing multibody interactions of hydrophobic, polar, and ionizable residues in proteins is important for understanding the stability of protein structures. We introduce a geometric model for quantifying 3-body interactions in native proteins. With this model, empirical propensity values for many types of 3-body interactions can be reliably estimated from a database of native protein structures, despite the overwhelming presence of pairwise contacts. In addition, we define a nonadditive coefficient that characterizes cooperativity and anticooperativity of residue interactions in native proteins by measuring the deviation of 3-body interactions from 3 independent pairwise interactions. It compares the 3-body propensity value with what would be expected if only pairwise interactions were considered, and highlights the distinction between propensity and cooperativity of 3-body interactions. Based on the geometric model, and what can be inferred from statistical analysis of such a model, we find that hydrophobic interactions and hydrogen-bonding interactions make nonadditive contributions to protein stability, but the nonadditive nature depends on whether such interactions are located in the protein interior or on the protein surface. When located in the interior, many hydrophobic interactions such as those involving alkyl residues are anticooperative. Salt-bridge and regular hydrogen-bonding interactions, such as those involving ionizable residues and polar residues, are cooperative. When located on the protein surface, these salt-bridge and regular hydrogen-bonding interactions are anticooperative, and hydrophobic interactions involving alkyl residues become cooperative. We show with examples that incorporating 3-body interactions improves discrimination of protein native structures against decoy conformations. In addition, analysis of cooperative 3-body interactions may reveal spatial motifs that can suggest specific protein functions.
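One way to picture the nonadditive coefficient is as a ratio between the 3-body propensity and the product of the three pairwise propensities. The abstract only says the coefficient measures deviation from three independent pairwise interactions, so the ratio form, the tolerance threshold, and the names `nonadditive_coefficient` and `classify` below are assumptions for illustration.

```python
from math import log


def nonadditive_coefficient(p_ijk, p_ij, p_ik, p_jk):
    """Ratio of the 3-body propensity to the product of pairwise propensities.

    Assumed form: a value of 1 means the triplet is fully explained by its
    three pairwise interactions.
    """
    return p_ijk / (p_ij * p_ik * p_jk)


def classify(coefficient, tolerance=0.05):
    """Label a coefficient; `tolerance` (in log units) is a hypothetical cutoff."""
    deviation = log(coefficient)
    if deviation > tolerance:
        return "cooperative"
    if deviation < -tolerance:
        return "anticooperative"
    return "additive"
```

For instance, a triplet with propensity 3.0 whose pairs have propensities 1.5, 1.0, and 1.0 gives a coefficient of 2.0, which this sketch labels cooperative.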
Affiliation(s)
- Xiang Li
- Department of Bioengineering, SEO, MC-063, University of Illinois at Chicago, Chicago, Illinois 60607-7052, USA
7
8
9
Li X, Hu C, Liang J. Simplicial edge representation of protein structures and alpha contact potential with confidence measure. Proteins 2003; 53:792-805. [PMID: 14635122 DOI: 10.1002/prot.10442] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
Protein representation and potential function are two important ingredients for studying protein folding, equilibrium thermodynamics, and sequence design. We introduce a novel geometric representation of protein contact interactions using the edge simplices from the alpha shape of the protein structure. This representation can eliminate implausible neighbors that are not in physical contact, and can avoid spurious contact between two residues when a third residue is between them. We developed a statistical alpha contact potential using an odds-ratio model. A studentized bootstrap method was then introduced to assess the 95% confidence intervals for each of the 210 propensity parameters. We found, with confidence, that there is significant long-range propensity (>30 residues apart) for hydrophobic interactions. We tested the alpha contact potential for native structure discrimination using several sets of decoy structures, and found that it often performs comparably with atom-based potentials requiring many more parameters. We also show that accurate geometric representation is important, and that the alpha contact potential performs better than a potential defined by a cutoff distance between geometric centers of side chains. Hierarchical clustering of alpha contact potentials reveals natural grouping of residues. To explore the relationship between shape and physicochemical representations, we tested the minimum alphabet size necessary for native structure discrimination. We found that there is no significant difference in performance of discrimination when alphabet size varies from 7 to 20, if geometry is represented accurately by alpha simplicial edges. This result suggests that the geometry of packing plays an important role, but the specific residue types are often interchangeable.
Affiliation(s)
- Xiang Li
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois 60607-7052, USA
10
Eastwood MP, Hardin C, Luthey-Schulten Z, Wolynes PG. Statistical mechanical refinement of protein structure prediction schemes. II. Mayer cluster expansion approach. J Chem Phys 2003. [DOI: 10.1063/1.1565106] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/15/2022] Open
11
McConkey BJ, Sobolev V, Edelman M. Discrimination of native protein structures using atom-atom contact scoring. Proc Natl Acad Sci U S A 2003; 100:3215-20. [PMID: 12631702 PMCID: PMC152272 DOI: 10.1073/pnas.0535768100] [Citation(s) in RCA: 91] [Impact Index Per Article: 4.3] [Indexed: 11/18/2022] Open
Abstract
We introduce a method for discriminating correctly folded proteins from well-designed decoy structures using atom-atom and atom-solvent contact surfaces. The measure used to quantify contact surfaces integrates the solvent accessible surface and interatomic contacts into one quantity, allowing solvent to be treated as an atom contact. A scoring function was derived from statistical contact preferences within known protein structures and validated by using established protein decoy sets, including the "Rosetta" decoys and data from the CASP4 structure predictions. The scoring function effectively distinguished native structures from all corresponding decoys in >90% of the cases, using isolated protein subunits as target structures. If contacts between subunits within quaternary structures are included, the accuracy increases to 97%. Interactions beyond atom-atom contact range were not required to distinguish native structures from the decoys using this method. The contact scoring performed as well as or better than existing statistical and physicochemical potentials and may be applied as an independent means of evaluating putative structural models.
Affiliation(s)
- Brendan J McConkey
- Department of Plant Sciences, Weizmann Institute of Science, Rehovot 76100, Israel.
12
Eastwood MP, Hardin C, Luthey-Schulten Z, Wolynes PG. Statistical mechanical refinement of protein structure prediction schemes: Cumulant expansion approach. J Chem Phys 2002. [DOI: 10.1063/1.1494417] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.2] [Indexed: 11/15/2022] Open
13
Fain B, Xia Y, Levitt M. Design of an optimal Chebyshev-expanded discrimination function for globular proteins. Protein Sci 2002; 11:2010-21. [PMID: 12142455 PMCID: PMC2373672 DOI: 10.1110/ps.0200702] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Indexed: 10/27/2022]
Abstract
We describe the construction of a scoring function designed to model the free energy of protein folding. An optimization technique is used to determine the best functional forms of the hydrophobic, residue-residue and hydrogen-bonding components of the potential. The scoring function is expanded by use of Chebyshev polynomials, the coefficients of which are determined by minimizing the score, in units of standard deviation, of native structures in the ensembles of alternate decoy conformations. The derived effective potential is then tested on decoy sets used conventionally in such studies. Using our scoring function, we achieve a high level of discrimination between correct and incorrect folds. In addition, our method is able to represent functions of arbitrary shape with fewer parameters than the usual histogram potentials of similar resolution. Finally, our representation can be combined easily with many optimization methods, because the total energy is a linear function of the parameters. Our results show that the techniques of Z-score optimization and Chebyshev expansion work well.
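The two ingredients named in this abstract, a Chebyshev expansion and a Z-score objective, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature values are assumed to be pre-scaled to [-1, 1], the population standard deviation is an arbitrary choice, and the names `chebyshev_T`, `expanded_score`, and `z_score` are hypothetical.

```python
from statistics import mean, pstdev


def chebyshev_T(n, x):
    """Chebyshev polynomial of the first kind via T_{k+1} = 2x T_k - T_{k-1}."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def expanded_score(coefficients, features):
    """Score a structure as a sum over features of sum_n c_n * T_n(x).

    features: values assumed to be pre-scaled to [-1, 1] (e.g. scaled distances).
    The score is linear in `coefficients`, as the abstract notes.
    """
    return sum(
        c * chebyshev_T(n, x)
        for x in features
        for n, c in enumerate(coefficients)
    )


def z_score(native_score, decoy_scores):
    """Native score in units of the decoy standard deviation (lower is better)."""
    return (native_score - mean(decoy_scores)) / pstdev(decoy_scores)
```

Fitting would then amount to choosing the coefficients that make `z_score` as negative as possible across the training decoy sets.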
Affiliation(s)
- Boris Fain
- Department of Structural Biology, Stanford University, Stanford University School of Medicine, California 94305, USA.
14
Abstract
Multiple sequence alignments are a routine tool in protein fold recognition, but multiple structure alignments are computationally less cooperative. This work describes a method for protein sequence threading and sequence-to-structure alignments that uses multiple aligned structures, the aim being to improve models from protein threading calculations. Sequences are aligned into a field due to corresponding sites in homologous proteins. On the basis of a test set of more than 570 protein pairs, the procedure does improve alignment quality, although no more than averaging over sequences. For the force field tested, the benefit of structure averaging is smaller than that of adding sequence similarity terms or a contribution from secondary structure predictions. Although there is a significant improvement in the quality of sequence-to-structure alignments, this does not directly translate to an immediate improvement in fold recognition capability.
Affiliation(s)
- Anthony J Russell
- Research School of Chemistry, Australian National University, Canberra, Australia
15
16
Bastolla U, Farwer J, Knapp EW, Vendruscolo M. How to guarantee optimal stability for most representative structures in the Protein Data Bank. Proteins 2001; 44:79-96. [PMID: 11391771 DOI: 10.1002/prot.1075] [Citation(s) in RCA: 101] [Impact Index Per Article: 4.4] [Indexed: 11/10/2022]
Abstract
We recently proposed an optimization method to derive energy parameters for simplified models of protein folding. The method is based on the maximization of the thermodynamic average of the overlap between protein native structures and a Boltzmann ensemble of alternative structures. Such a condition enforces protein models whose ground states are most similar to the corresponding native states. We present here an extensive testing of the method for a simple residue-residue contact energy function and for alternative structures generated by threading. The optimized energy function guarantees high stability and a well-correlated energy landscape to most representative structures in the PDB database. Failures in the recognition of the native structure can be attributed to the neglect of interactions between different chains in oligomeric proteins or with cofactors. When these are taken into account, only very few X-ray structures are not recognized. Most of them are short inhibitors or fragments and one is a structure that presents serious inconsistencies. Finally, we discuss the reasons that make NMR structures more difficult to recognize. Copyright 2001 Wiley-Liss, Inc.
Affiliation(s)
- U Bastolla
- Free University of Berlin, Department of Biology, Chemistry and Pharmacy, Institute of Chemistry, Berlin, Germany.