1
Huang SY, Zou X. ITScorePro: an efficient scoring program for evaluating the energy scores of protein structures for structure prediction. Methods Mol Biol 2014; 1137:71-81. [PMID: 24573475 PMCID: PMC11121506 DOI: 10.1007/978-1-4939-0366-5_6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/25/2023]
Abstract
One important component in protein structure prediction is to evaluate the free energy of a given conformation. Given the enormous number of possible conformations for a sequence, it is extremely challenging to quickly and accurately score the energies of these conformations and predict a reasonable structure within a practical computational time. Here, we describe an efficient program for energy evaluation, referred to as ITScorePro (Copyright © 2012). The energy scoring function in the ITScorePro program is based on the distance-dependent, pairwise atomic potentials for protein structure prediction that we recently derived by using statistical mechanics principles (Huang and Zou, Proteins 79:2648-2661, 2011). ITScorePro is a stand-alone program and can also be easily implemented in other software suites for protein structure prediction.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Dalton Cardiovascular Research Center, Informatics Institute, University of Missouri, Columbia, MO, USA
2
Huang SY, Zou X. Statistical mechanics-based method to extract atomic distance-dependent potentials from protein structures. Proteins 2011; 79:2648-61. [PMID: 21732421 PMCID: PMC11108592 DOI: 10.1002/prot.23086] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 01/08/2011] [Revised: 04/21/2011] [Accepted: 05/09/2011] [Indexed: 12/25/2022]
Abstract
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions, by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, warrants the feasibility of deriving distance-dependent, all-atom statistical potentials to keep the scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
- Xiaoqin Zou
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
3
Huang SY, Grinter SZ, Zou X. Scoring functions and their evaluation methods for protein-ligand docking: recent advances and future directions. Phys Chem Chem Phys 2010; 12:12899-908. [PMID: 20730182 PMCID: PMC11103779 DOI: 10.1039/c0cp00151a] [Citation(s) in RCA: 294] [Impact Index Per Article: 21.0] [Indexed: 11/21/2022]
Abstract
The scoring function is one of the most important components in structure-based drug design. Despite considerable success, accurate and rapid prediction of protein-ligand interactions is still a challenge in molecular docking. In this perspective, we have reviewed three basic types of scoring functions (force-field, empirical, and knowledge-based) and the consensus scoring technique that are used for protein-ligand docking. The commonly-used assessment criteria and publicly available protein-ligand databases for performance evaluation of the scoring functions have also been presented and discussed. We end with a discussion of the challenges faced by existing scoring functions and possible future directions for developing improved scoring functions.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
- Sam Z. Grinter
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
- Xiaoqin Zou
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
4
Abstract
In this article, we explore the information content of molecular force-field calculations. We make use of exhaustive lattice models of molecular conformations and reduced alphabet sequences to determine the relative resolving power of pairwise interaction-based force fields. We find that sequence-specific interactions that operate over longer distances offer greater amounts of information than nearest-neighbor or non-sequence-specific interactions. In a companion article in this issue, we explored the information content of sequence alignment procedures and the calculation of gap penalties. Both articles have implications for protein and nucleic-acid computations.
Affiliation(s)
- Tiba Aynechi
- Graduate Group in Biophysics, and Department of Pharmaceutical Chemistry, University of California-San Francisco, San Francisco, CA 94143, USA
5
Torda AE, Procter JB, Huber T. Wurst: a protein threading server with a structural scoring function, sequence profiles and optimized substitution matrices. Nucleic Acids Res 2004; 32:W532-5. [PMID: 15215443 PMCID: PMC441495 DOI: 10.1093/nar/gkh357] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 11/12/2022] Open
Abstract
Wurst is a protein threading program with an emphasis on high quality sequence to structure alignments (http://www.zbh.uni-hamburg.de/wurst). Submitted sequences are aligned to each of about 3000 templates with a conventional dynamic programming algorithm, but using a score function with sophisticated structure and sequence terms. The structure terms are a log-odds probability of sequence to structure fragment compatibility, obtained from a Bayesian classification procedure. A simplex optimization was used to optimize the sequence-based terms for the goal of alignment and model quality and to balance the sequence and structural contributions against each other. Both sequence and structural terms operate with sequence profiles.
Affiliation(s)
- Andrew E Torda
- University of Hamburg, Zentrum für Bioinformatik, Bundesstrasse 43, D-20146 Hamburg, Germany
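The Wurst abstract above refers to aligning each submitted sequence to templates with "a conventional dynamic programming algorithm". As a rough illustration of that general technique only, the following minimal sketch implements Needleman-Wunsch global alignment with a pluggable score function. The names `global_align` and `toy_score`, the linear gap penalty, and the match/mismatch values are assumptions made for this example; they are not Wurst's score function, which the abstract says combines structure and sequence-profile terms.

```python
from typing import Callable, List, Tuple


def toy_score(a: str, b: str) -> float:
    """Hypothetical stand-in score: +1 for identical symbols, -1 otherwise."""
    return 1.0 if a == b else -1.0


def global_align(
    seq: str,
    template: str,
    score: Callable[[str, str], float] = toy_score,
    gap: float = -2.0,
) -> Tuple[float, List[Tuple[str, str]]]:
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns the optimal score and the aligned pairs, using "-" for a gap.
    """
    n, m = len(seq), len(template)
    # dp[i][j] holds the best score for aligning seq[:i] with template[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(seq[i - 1], template[j - 1]),
                dp[i - 1][j] + gap,  # gap in the template
                dp[i][j - 1] + gap,  # gap in the sequence
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs: List[Tuple[str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(
            seq[i - 1], template[j - 1]
        ):
            pairs.append((seq[i - 1], template[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            pairs.append((seq[i - 1], "-"))
            i -= 1
        else:
            pairs.append(("-", template[j - 1]))
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


best, aligned = global_align("ACGT", "ACT")
print(best, aligned)
```

In a threading setting the `score` argument would be replaced by a term that measures how well a residue fits a structural position, and the alignment would be repeated against every template in the library.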
6
Seok C, Rosen JB, Chodera JD, Dill KA. MOPED: method for optimizing physical energy parameters using decoys. J Comput Chem 2003; 24:89-97. [PMID: 12483678 DOI: 10.1002/jcc.10124] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
Abstract
We present a method called MOPED for optimizing energetic and structural parameters in computational models, including all-atom energy functions, when native structures and decoys are given. The present method goes beyond previous approaches in treating energy functions that are nonlinear in the parameters and continuous in the degrees of freedom. We illustrate the method by improving solvation parameters in the energy function EEF1, which consists of the CHARMM19 polar hydrogen force field augmented by a Gaussian solvation term. Although the published parameters for EEF1 correctly discriminate the native from decoys in the decoy sets of Levitt et al., they fail on several of the more difficult decoy sets of Baker et al. MOPED successfully finds improved parameters that allow EEF1 to discriminate native from decoy structures on all protein structures that do not have metals or prosthetic groups.
Affiliation(s)
- Chaok Seok
- Department of Pharmaceutical Chemistry, University of California in San Francisco, San Francisco, California 94118, USA
7
Ayroldi E, Zollo O, Macchiarulo A, Di Marco B, Marchetti C, Riccardi C. Glucocorticoid-induced leucine zipper inhibits the Raf-extracellular signal-regulated kinase pathway by binding to Raf-1. Mol Cell Biol 2002; 22:7929-41. [PMID: 12391160 PMCID: PMC134721 DOI: 10.1128/mcb.22.22.7929-7941.2002] [Citation(s) in RCA: 137] [Impact Index Per Article: 6.2] [Indexed: 12/27/2022] Open
Abstract
Glucocorticoid-induced leucine zipper (GILZ) is a leucine zipper protein, whose expression is augmented by dexamethasone (DEX) treatment and downregulated by T-cell receptor (TCR) triggering. Stable expression of GILZ in T cells mimics some of the effects of glucocorticoid hormones (GCH) in GCH-mediated immunosuppressive and anti-inflammatory activity. In fact, GILZ overexpression inhibits TCR-activated NF-kappaB nuclear translocation, interleukin-2 production, FasL upregulation, and the consequent activation-induced apoptosis. We have investigated the molecular mechanism underlying GILZ-mediated regulation of T-cell activation by analyzing the effects of GILZ on the activity of mitogen-activated protein kinase (MAPK) family members, including Raf, MAPK/extracellular signal-regulated kinase (ERK) 1/2 (MEK-1/2), ERK-1/2, and c-Jun NH(2)-terminal protein kinase (JNK). Our results indicate that GILZ inhibited Raf-1 phosphorylation, which resulted in the suppression of both MEK/ERK-1/2 phosphorylation and AP-1-dependent transcription. We demonstrate that GILZ interacts in vitro and in vivo with endogenous Raf-1 and that Raf-1 coimmunoprecipitated with GILZ in murine thymocytes treated with DEX. Mapping of the binding domains and experiments with GILZ mutants showed that GILZ binds the region of Raf interacting with Ras through the NH(2)-terminal region. These data suggest that GILZ contributes, through protein-to-protein interaction with Raf-1 and the consequent inhibition of Raf-MEK-ERK activation, to regulating the MAPK pathway and to providing a further mechanism underlying GCH immunosuppression.
Affiliation(s)
- Emira Ayroldi
- Department of Clinical and Experimental Medicine, Section of Pharmacology. Department of Drug Chemistry and Technology, University of Perugia, 06100 Perugia, Italy
8
Abstract
Multiple sequence alignments are a routine tool in protein fold recognition, but multiple structure alignments are computationally less cooperative. This work describes a method for protein sequence threading and sequence-to-structure alignments that uses multiple aligned structures, the aim being to improve models from protein threading calculations. Sequences are aligned into a field due to corresponding sites in homologous proteins. On the basis of a test set of more than 570 protein pairs, the procedure does improve alignment quality, although no more than averaging over sequences. For the force field tested, the benefit of structure averaging is smaller than that of adding sequence similarity terms or a contribution from secondary structure predictions. Although there is a significant improvement in the quality of sequence-to-structure alignments, this does not directly translate to an immediate improvement in fold recognition capability.
Affiliation(s)
- Anthony J Russell
- Research School of Chemistry, Australian National University, Canberra, Australia
9
Abstract
A protein folding potential function ideally has several properties: it favors the native conformations for a number of protein sequences over a variety of nonnative folds; it can guide the search over conformations for the native state; it reflects changes in stability of the native fold due to changes in sequence; and it is relatively insensitive to small changes in conformation. While these are not mutually incompatible goals, attaining one property does not ensure that the others are satisfied. Examples are given of simple potentials having one property but lacking others. A new functional form of a folding potential is described where interactions between all nonhydrogen atoms are used to estimate interresidue interactions and implicit solvation. Its parameters can be adjusted to satisfy the above properties at least for barnase and a few other proteins.
Affiliation(s)
- G M Crippen
- College of Pharmacy, University of Michigan, Ann Arbor, MI, USA.
10
Abstract
Models in computational biology, such as those used in binding, docking, and folding, are often empirical and have adjustable parameters. Because few of these models are yet fully predictive, the problem may be nonoptimal choices of parameters. We describe an algorithm called ENPOP (energy function parameter optimization) that improves, and sometimes optimizes, the parameters for any given model and for any given search strategy that identifies the stable state of that model. ENPOP iteratively adjusts the parameters simultaneously to move the model global minimum energy conformation for each of m different molecules as close as possible to the true native conformations, based on some appropriate measure of structural error. A proof of principle is given for two very different test problems. The first involves three different two-dimensional model protein molecules having 12 to 37 monomers and four parameters in common. The parameters converge to the values used to design the model native structures. The second problem involves nine bumpy landscapes, each having between 4 and 12 degrees of freedom. For the three adjustable parameters, the globally optimal values are known in advance. ENPOP converges quickly to the correct parameter set.
Affiliation(s)
- J B Rosen
- Computer Science and Engineering Department, University of California at San Diego, San Diego, California 92093 USA
11
Dombkowski AA, Crippen GM. Disulfide recognition in an optimized threading potential. Protein Engineering 2000; 13:679-89. [PMID: 11112506 DOI: 10.1093/protein/13.10.679] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
Abstract
An energy potential is constructed and trained to succeed in fold recognition for the general population of proteins as well as an important class which has previously been problematic: small, disulfide-bearing proteins. The potential is modeled on solvation, with the energy a function of side chain burial and the number of disulfide bonds. An accurate disulfide recognition algorithm identifies cysteine pairs which have the appropriate orientation to form a disulfide bridge. The potential has 22 energy parameters which are optimized so the Protein Data Bank (PDB) structure for each sequence in a training set is the lowest in energy out of thousands of alternative structures. One parameter per amino acid type reflects burial preference and a single parameter is used in an overpacking term. Additionally, one optimized parameter provides a favorable contribution for each disulfide identified in a given protein structure. With little training, the potential is >80% accurate in ungapped threading tests using a variety of proteins. The same level of accuracy is observed in a threading test of small proteins which have disulfide bonds. Importantly, the energy potential is also successful with proteins having uncrosslinked cysteines.
Affiliation(s)
- A A Dombkowski
- College of Pharmacy, University of Michigan, Ann Arbor, MI 48109, USA
14
Ayers DJ, Gooley PR, Widmer-Cooper A, Torda AE. Enhanced protein fold recognition using secondary structure information from NMR. Protein Sci 1999; 8:1127-33. [PMID: 10338023 PMCID: PMC2144327 DOI: 10.1110/ps.8.5.1127] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 10/21/2022]
Abstract
NMR offers the possibility of accurate secondary structure for proteins that would be too large for structure determination. In the absence of an X-ray crystal structure, this information should be useful as an adjunct to protein fold recognition methods based on low resolution force fields. The value of this information has been tested by adding varying amounts of artificial secondary structure data and threading a sequence through a library of candidate folds. Using a literature test set, the threading method alone has only a one-third chance of producing a correct answer among the top ten guesses. With realistic secondary structure information, one can expect a 60-80% chance of finding a homologous structure. The method has then been applied to examples with published estimates of secondary structure. This implementation is completely independent of sequence homology, and sequences are optimally aligned to candidate structures with gaps and insertions allowed. Unlike work using predicted secondary structure, we test the effect of differing amounts of relatively reliable data.
Affiliation(s)
- D J Ayers
- Research School of Chemistry, Australian National University, Canberra ACT
15
Orengo C, Bray J, Hubbard T, LoConte L, Sillitoe I. Analysis and assessment of ab initio three-dimensional prediction, secondary structure, and contacts prediction. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(1999)37:3+<149::aid-prot20>3.0.co;2-h] [Citation(s) in RCA: 85] [Impact Index Per Article: 3.4] [Indexed: 11/07/2022]
16
Keskin O, Bahar I, Badretdinov AY, Ptitsyn OB, Jernigan RL. Empirical solvent-mediated potentials hold for both intra-molecular and inter-molecular inter-residue interactions. Protein Sci 1998; 7:2578-86. [PMID: 9865952 PMCID: PMC2143898 DOI: 10.1002/pro.5560071211] [Citation(s) in RCA: 72] [Impact Index Per Article: 2.8] [Indexed: 11/11/2022]
Abstract
Whether knowledge-based intra-molecular inter-residue potentials are valid to represent inter-molecular interactions taking place at protein-protein interfaces has been questioned in several studies. Differences in the chain connectivity effect and in residue packing geometry between interfaces and single chain monomers have been pointed out as possible sources of distinct energetics for the two cases. In the present study, the interfacial regions of protein-protein complexes are examined to extract inter-molecular inter-residue potentials, using the same statistical methods as those previously adopted for intra-molecular residue pairs. Two sets of energy parameters are derived, corresponding to solvent-mediation and "average residue" mediation. The former set is shown to be highly correlated (correlation coefficient 0.89) with that previously obtained for inter-residue interactions within single chain monomers, while the latter exhibits a weaker correlation (0.69) with its intra-molecular counterpart. In addition to the close similarity of intra- and inter-molecular solvent-mediated potentials, they are shown to be significantly more residue-specific and thereby discriminative compared to the residue-mediated ones, indicating that solvent-mediation plays a major role in controlling the effective inter-residue interactions, either at interfaces, or within single monomers. Based on this observation, a reduced set of energy parameters comprising 20 one-body and 3 two-body terms is proposed (as opposed to the 20 x 20 tables of inter-residue potentials), which reproduces the conventional 20 x 20 tables with a correlation coefficient of 0.99.
Affiliation(s)
- O Keskin
- Chemical Engineering Department & Polymer Research Center, Bogazici University, and TUBITAK Advanced Polymeric Materials Research Center, Istanbul, Turkey
17
Abstract
Genome sequencing projects continue to provide a flood of new protein sequences, and prediction methods remain an important means of adding structural information. Recently, there have been advances in secondary structure prediction, which feed, in turn, into improved fold recognition algorithms. Finally, there have been technical improvements in comparative modelling, and studies of the expected accuracy of three-dimensional structural models built by this method.
Affiliation(s)
- D R Westhead
- The European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.