1
Gaillard T, Panel N, Simonson T. Protein side chain conformation predictions with an MMGBSA energy function. Proteins 2016; 84:803-19. [PMID: 26948696 DOI: 10.1002/prot.25030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 10/02/2015] [Revised: 02/22/2016] [Accepted: 02/27/2016] [Indexed: 12/17/2022]
Abstract
The prediction of protein side chain conformations from backbone coordinates is an important task in structural biology, with applications in structure prediction and protein design. It is a difficult problem due to its combinatorial nature. We study the performance of an "MMGBSA" energy function, implemented in our protein design program Proteus, which combines molecular mechanics terms, a Generalized Born and Surface Area (GBSA) solvent model, with approximations that make the model pairwise additive. Proteus is not a competitor to specialized side chain prediction programs due to its cost, but it allows protein design applications, where side chain prediction is an important step and MMGBSA an effective energy model. We predict the side chain conformations for 18 proteins. The side chains are first predicted individually, with the rest of the protein in its crystallographic conformation. Next, all side chains are predicted together. The contributions of individual energy terms are evaluated and various parameterizations are compared. We find that the GB and SA terms, with an appropriate choice of the dielectric constant and surface energy coefficients, are beneficial for single side chain predictions. For the prediction of all side chains, however, errors due to the pairwise additive approximation overcome the improvement brought by these terms. We also show the crucial contribution of side chain minimization to alleviate the rigid rotamer approximation. Even without GB and SA terms, we obtain accuracies comparable to SCWRL4, a specialized side chain prediction program. In particular, we obtain a better RMSD than SCWRL4 for core residues (at a higher cost), despite our simpler rotamer library. Proteins 2016; 84:803-819. © 2016 Wiley Periodicals, Inc.
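The pairwise additive form mentioned in this abstract can be illustrated with a minimal Python sketch. This is not Proteus or its MMGBSA function: the names `total_energy` and `best_assignment` and the toy energy tables are hypothetical, and the brute-force search is only there to show why the problem is combinatorial.

```python
from itertools import product


def total_energy(assignment, self_energy, pair_energy):
    """Pairwise additive energy of one rotamer assignment.

    assignment[i] is the rotamer index chosen at position i.
    self_energy[i][r] is a one-body term (hypothetical values).
    pair_energy[(i, j)][r][s] is a two-body term for i < j (hypothetical values).
    """
    n = len(assignment)
    energy = sum(self_energy[i][assignment[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += pair_energy[(i, j)][assignment[i]][assignment[j]]
    return energy


def best_assignment(self_energy, pair_energy):
    """Exhaustive search; the number of combinations is the product of rotamer counts."""
    choices = [range(len(terms)) for terms in self_energy]
    return min(
        product(*choices),
        key=lambda a: total_energy(a, self_energy, pair_energy),
    )


# Toy system (assumed numbers): two positions with two rotamers each.
self_energy = [[0.0, 1.0], [0.5, 0.0]]
pair_energy = {(0, 1): [[2.0, 0.0], [0.0, 3.0]]}
```

With many positions the enumeration becomes infeasible, which is why practical programs rely on heuristic or pruning-based searches instead.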
Affiliation(s)
- Thomas Gaillard
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Nicolas Panel
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Thomas Simonson
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
2
Abstract
Much of the biochemistry that underlies health, medicine, and numerous biotechnology applications is regulated by proteins, whereby the ability of proteins to effect such processes is dictated by the three-dimensional structural assembly of the proteins. Thus, a detailed understanding of biochemistry requires not only knowledge of the constituent sequence of proteins, but also a detailed understanding of how that sequence folds spatially. Three-dimensional analysis of protein structures is thus proving to be a critical mode of biological and medical discovery in the early twenty-first century, providing fundamental insight into function that produces useful biochemistry and dysfunction that leads to disease. The large number of distinct proteins precludes rigorous laboratory characterization of the complete structural proteome, but fortunately efficient in silico structure prediction is possible for many proteins that have not been experimentally characterized. One technique that continues to provide accurate and efficient protein structure predictions, called comparative modeling, has become a critical tool in many biological disciplines. The discussion herein is an updated version of a previous 2008 treatise focusing on the general philosophy of comparative modeling methods and on specific strategies for successfully achieving reliable and accurate models. The chapter discusses basic aspects of template selection, sequence alignment, spatial alignment, loop and gap modeling, side chain modeling, structural refinement and validation, and provides an important new discussion on automated computational tools for protein structure prediction.
3
4
Identification, mRNA expression and characterization of a novel ANK-like gene from Chinese mitten crab Eriocheir japonica sinensis. Comp Biochem Physiol B Biochem Mol Biol 2009; 153:332-9. [DOI: 10.1016/j.cbpb.2009.04.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 01/08/2009] [Revised: 04/14/2009] [Accepted: 04/15/2009] [Indexed: 12/26/2022]
5
Borrelli KW, Cossins B, Guallar V. Exploring hierarchical refinement techniques for induced fit docking with protein and ligand flexibility. J Comput Chem 2009; 31:1224-35. [DOI: 10.1002/jcc.21409] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 11/11/2022]
6
Vila JA, Scheraga HA. Factors affecting the use of 13C(alpha) chemical shifts to determine, refine, and validate protein structures. Proteins 2008; 71:641-54. [PMID: 17975838 DOI: 10.1002/prot.21726] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Abstract
Interest centers here on the analysis of two different, but related, phenomena that affect side-chain conformations and consequently 13C(alpha) chemical shifts and their applications to determine, refine, and validate protein structures. The first is whether 13C(alpha) chemical shifts computed at the DFT level of approximation with charged residues are a better approximation of observed 13C(alpha) chemical shifts than those computed with neutral residues for proteins in solution. Accurate computation of 13C(alpha) chemical shifts requires a proper representation of the charges, which might not take on integral values. For this analysis, the charges for 139 conformations of the protein ubiquitin were determined by explicit consideration of protein binding equilibria, at a given pH, that is, by exploring the 2^xi possible ionization states of the whole molecule, with xi being the number of ionizable groups. The results of this analysis, as revealed by the shielding/deshielding of the 13C(alpha) nucleus, indicated that: (i) there is a significant difference in the computed 13C(alpha) chemical shifts, between basic and acidic groups, as a function of the degree of charge of the side chain; (ii) this difference is attributed to the distance between the ionizable groups and the 13C(alpha) nucleus, which is shorter for the acidic Asp and Glu groups as compared with that for the basic Lys and Arg groups; and (iii) the use of neutral, rather than charged, basic and acidic groups is a better approximation of the observed 13C(alpha) chemical shifts of a protein in solution. The second is how side-chain flexibility influences computed 13C(alpha) chemical shifts in an additional set of ubiquitin conformations, in which the side chains are generated from an NMR-derived structure with the backbone conformation assumed to be fixed.
The 13C(alpha) chemical shift of a given amino acid residue in a protein is determined, mainly, by its own backbone and side-chain torsional angles, independent of the neighboring residues; the conformation of a given residue itself, however, depends on the environment of this residue and, hence, on the whole protein structure. As a consequence, this analysis reveals the role and impact of an accurate side-chain computation in the determination and refinement of protein conformation. The results of this analysis are: (i) a lower error between computed and observed 13C(alpha) chemical shifts (by up to 3.7 ppm) was found for approximately 68% and approximately 63% of all ionizable residues and all non-Ala/Pro/Gly residues, respectively, in the additional set of conformations, compared with results for the model from which the set was derived; and (ii) all the additional conformations exhibit a lower root-mean-square deviation (1.97 ppm ≤ rmsd ≤ 2.13 ppm), between computed and observed 13C(alpha) chemical shifts, than the rmsd (2.32 ppm) computed for the starting conformation from which this additional set was derived. As a validation test, an analysis of the additional set of ubiquitin conformations, comparing computed and observed values of both 13C(alpha) chemical shifts and chi(1) torsional angles (given by the vicinal coupling constants, 3J(N-Cgamma) and 3J(C'-Cgamma)), is discussed.
Affiliation(s)
- Jorge A Vila
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853-1301, USA
7
Abstract
Three-dimensional analysis of protein structures is proving to be one of the most fruitful modes of biological and medical discovery in the early 21st century, providing fundamental insight into many (perhaps most) biochemical functions of relevance to the cause and treatment of diseases. Fully realizing such insight, however, would require analysis of too many distinct proteins for thorough laboratory analysis of all proteins to be feasible; thus, any method capable of accurate, efficient in silico structure prediction should prove highly expeditious. The technique generally acknowledged to provide the most accurate protein structure predictions, called comparative modeling, has thus attracted substantial attention and is the focus of this chapter. Although other reviews have reported on the method development and research history of comparative modeling, our discussion herein focuses on the general philosophy of the method and specific strategies for successfully achieving reliable and accurate models. The chapter thus relates aspects of template selection, sequence alignment, spatial alignment, loop and gap modeling, side chain modeling, structural refinement, and validation.
Affiliation(s)
- Gerald H Lushington
- Molecular Graphics and Modeling Laboratory, University of Kansas, Lawrence, KS
8
Xia J, Daly RP, Chuang FC, Parker L, Jensen JH, Margulis CJ. Sugar Folding: A Novel Structural Prediction Tool for Oligosaccharides and Polysaccharides 1. J Chem Theory Comput 2007; 3:1620-8. [DOI: 10.1021/ct700033y] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Affiliation(s)
- Junchao Xia
- Department of Chemistry, University of Iowa, Iowa City, Iowa 52242, and Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark
- Ryan P. Daly
- Department of Chemistry, University of Iowa, Iowa City, Iowa 52242, and Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark
- Feng-Chuan Chuang
- Department of Chemistry, University of Iowa, Iowa City, Iowa 52242, and Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark
- Laura Parker
- Department of Chemistry, University of Iowa, Iowa City, Iowa 52242, and Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark
- Jan H. Jensen
- Department of Chemistry, University of Iowa, Iowa City, Iowa 52242, and Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark
- Claudio J. Margulis
- Department of Chemistry, University of Iowa, Iowa City, Iowa 52242, and Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark
9
Heath AP, Kavraki LE, Clementi C. From coarse-grain to all-atom: Toward multiscale analysis of protein landscapes. Proteins 2007; 68:646-61. [PMID: 17523187 DOI: 10.1002/prot.21371] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.9] [Indexed: 11/10/2022]
Abstract
Multiscale methods are becoming increasingly promising as a way to characterize the dynamics of large protein systems on biologically relevant time-scales. The underlying assumption in multiscale simulations is that it is possible to move reliably between different resolutions. We present a method that efficiently generates realistic all-atom protein structures starting from the C(alpha) atom positions, as obtained for instance from extensive coarse-grain simulations. The method, a reconstruction algorithm for coarse-grain structures (RACOGS), is validated by reconstructing ensembles of coarse-grain structures obtained during folding simulations of the proteins src-SH3 and S6. The results show that RACOGS consistently produces low energy, all-atom structures. A comparison of the free energy landscapes calculated using the coarse-grain structures versus the all-atom structures shows good correspondence and little distortion in the protein folding landscape.
Affiliation(s)
- Allison P Heath
- Department of Computer Science, Rice University, Houston, Texas 77005, USA
10
Li X, Jacobson MP, Zhu K, Zhao S, Friesner RA. Assignment of polar states for protein amino acid residues using an interaction cluster decomposition algorithm and its application to high resolution protein structure modeling. Proteins 2007; 66:824-37. [PMID: 17154422 DOI: 10.1002/prot.21125] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Indexed: 11/11/2022]
Abstract
We have developed a new method (Independent Cluster Decomposition Algorithm, ICDA) for creating all-atom models of proteins given the heavy-atom coordinates, provided by X-ray crystallography, and the pH. In our method the ionization states of titratable residues, the crystallographic mis-assignment of amide orientations in Asn/Gln, and the orientations of OH/SH groups are addressed under the unified framework of polar states assignment. To address the large number of combinatorial possibilities for the polar hydrogen states of the protein, we have devised a novel algorithm to decompose the system into independent interacting clusters, based on the observation of the crucial interdependence between the short range hydrogen bonding network and polar residue states, thus significantly reducing the computational complexity of the problem and making our algorithm tractable using relatively modest computational resources. We utilize an all atom protein force field (OPLS) and a Generalized Born continuum solvation model, in contrast to the various empirical force fields adopted in most previous studies. We have compared our prediction results with a few well-documented methods in the literature (WHATIF, REDUCE). In addition, as a preliminary attempt to couple our polar state assignment method with real structure predictions, we further validate our method using single side chain prediction, which has been demonstrated to be an effective way of validating structure prediction methods without incurring sampling problems. Comparisons of single side chain prediction results after the application of our polar state prediction method with previous results with default polar state assignments indicate a significant improvement in the single side chain predictions for polar residues.
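The idea of decomposing a system into independent interacting clusters can be sketched as finding connected components of an interaction graph. The following Python example is only an illustration under that assumption, not the ICDA implementation: the function name `independent_clusters` is hypothetical, and the criterion that decides which residues interact (in the abstract, the short range hydrogen bonding network) is taken as a given input list of pairs.

```python
def independent_clusters(n_residues, interactions):
    """Group residues into clusters that share no interaction with each other.

    n_residues: number of residues, indexed 0..n_residues-1.
    interactions: iterable of (a, b) index pairs that are assumed to interact.
    Returns a sorted list of clusters, each a sorted list of residue indices.
    """
    parent = list(range(n_residues))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in interactions:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for residue in range(n_residues):
        clusters.setdefault(find(residue), []).append(residue)
    return sorted(clusters.values())


# Assumed toy input: residues 0-1-2 form one network, 4-5 another, 3 is isolated.
example = independent_clusters(6, [(0, 1), (1, 2), (4, 5)])
```

Each cluster can then be enumerated on its own, so the cost is governed by the size of the largest cluster rather than by the whole protein.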
Affiliation(s)
- Xin Li
- Department of Chemistry, Columbia University, New York, NY 10027, USA
11
Jain T, Cerutti DS, McCammon JA. Configurational-bias sampling technique for predicting side-chain conformations in proteins. Protein Sci 2006; 15:2029-39. [PMID: 16943441 PMCID: PMC2242598 DOI: 10.1110/ps.062165906] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 10/24/2022]
Abstract
Prediction of side-chain conformations is an important component of several biological modeling applications. In this work, we have developed and tested an advanced Monte Carlo sampling strategy for predicting side-chain conformations. Our method is based on a cooperative rearrangement of atoms that belong to a group of neighboring side-chains. This rearrangement is accomplished by deleting groups of atoms from the side-chains in a particular region, and regrowing them with the generation of trial positions that depends on both a rotamer library and a molecular mechanics potential function. This method allows us to incorporate flexibility about the rotamers in the library and explore phase space in a continuous fashion about the primary rotamers. We have tested our algorithm on a set of 76 proteins using the all-atom AMBER99 force field and electrostatics that are governed by a distance-dependent dielectric function. When the tolerance for correct prediction of the dihedral angles is a <20 degrees deviation from the native state, our prediction accuracies for chi1 are 83.3% and for chi1 and chi2 are 65.4%. The accuracies of our predictions are comparable to the best results in the literature that often used Hamiltonians that have been specifically optimized for side-chain packing. We believe that the continuous exploration of phase space enables our method to overcome limitations inherent with using discrete rotamers as trials.
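The accuracy criterion quoted in this abstract (a dihedral counts as correct when it deviates by less than 20 degrees from the native value) can be sketched in Python as follows. The function names and the toy angles are hypothetical, and one convention is assumed here rather than taken from the paper: for the combined chi1 and chi2 measure, only residues that have both angles are counted.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees, in [0, 180]."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, native, n_chi=1, tol=20.0):
    """Fraction of residues whose first n_chi dihedrals are all within tol degrees.

    predicted, native: lists of tuples of chi angles (degrees), one tuple per residue.
    Residues with fewer than n_chi angles are skipped (assumed convention).
    """
    correct = 0
    total = 0
    for pred, nat in zip(predicted, native):
        if len(pred) < n_chi or len(nat) < n_chi:
            continue
        total += 1
        if all(angle_diff(pred[k], nat[k]) < tol for k in range(n_chi)):
            correct += 1
    return correct / total if total else float("nan")


# Assumed toy data: three residues, the second has only chi1.
predicted = [(-60.0, 180.0), (175.0,), (60.0, 90.0)]
native = [(-65.0, 170.0), (-178.0,), (-60.0, 95.0)]
chi1_accuracy = chi_accuracy(predicted, native, n_chi=1)
chi12_accuracy = chi_accuracy(predicted, native, n_chi=2)
```

The periodic difference matters for angles near ±180 degrees: 175 and -178 differ by 7 degrees, not 353.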
Affiliation(s)
- Tushar Jain
- Howard Hughes Medical Institute, University of California, San Diego, CA 92093-0365, USA.
12
McDonnell AV, Menke M, Palmer N, King J, Cowen L, Berger B. Fold recognition and accurate sequence-structure alignment of sequences directing beta-sheet proteins. Proteins 2006; 63:976-85. [PMID: 16547930 DOI: 10.1002/prot.20942] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022]
Abstract
The ability to predict structure from sequence is particularly important for toxins, virulence factors, allergens, cytokines, and other proteins of public health importance. Many such functions are represented in the parallel beta-helix and beta-trefoil families. A method using pairwise beta-strand interaction probabilities coupled with evolutionary information represented by sequence profiles is developed to tackle these problems for the beta-helix and beta-trefoil folds. The algorithm BetaWrapPro employs a "wrapping" component that may capture folding processes with an initiation stage followed by processive interaction of the sequence with the already-formed motifs. BetaWrapPro outperforms all previous motif recognition programs for these folds, recognizing the beta-helix with 100% sensitivity and 99.7% specificity and the beta-trefoil with 100% sensitivity and 92.5% specificity, in crossvalidation on a database of all nonredundant known positive and negative examples of these fold classes in the PDB. It additionally aligns 88% of residues for the beta-helices and 86% for the beta-trefoils accurately (within four residues of the exact position) to the structural template, which is then used with the side-chain packing program SCWRL to produce 3D structure predictions. One striking result has been the prediction of an unexpected parallel beta-helix structure for a pollen allergen, and its recent confirmation through solution of its structure. A Web server running BetaWrapPro is available and outputs putative PDB-style coordinates for sequences predicted to form the target folds.
Affiliation(s)
- Andrew V McDonnell
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
13
Abstract
Homology modeling plays a central role in determining protein structure in the structural genomics project. The importance of homology modeling has been steadily increasing because of the large gap that exists between the overwhelming number of available protein sequences and experimentally solved protein structures, and also, more importantly, because of the increasing reliability and accuracy of the method. In fact, a protein sequence with over 30% identity to a known structure can often be predicted with an accuracy equivalent to a low-resolution X-ray structure. The recent advances in homology modeling, especially in detecting distant homologues, aligning sequences with template structures, modeling of loops and side chains, as well as detecting errors in a model, have contributed to reliable prediction of protein structure, which was not possible even several years ago. The ongoing efforts in solving protein structures, which can be time-consuming and often difficult, will continue to spur the development of a host of new computational methods that can fill in the gap and further contribute to understanding the relationship between protein structure and function.
Affiliation(s)
- Zhexin Xiang
- Center for Molecular Modeling, Center for Information Technology, National Institutes of Health, Building 12A Room 2051, 12 South Drive, Bethesda, Maryland 20892-5624, USA.
14
Zhang W, Duan Y. Grow to Fit Molecular Dynamics (G2FMD): an ab initio method for protein side-chain assignment and refinement. Protein Eng Des Sel 2006; 19:55-65. [PMID: 16401632 DOI: 10.1093/protein/gzj001] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Abstract
The rough energy landscapes and tight packing of protein interiors are two of the critical factors that have prevented the wide application of physics-based models in protein side-chain assignment and protein structure prediction in general. Complementing the rotamer-based methods, we propose an ab initio method that utilizes molecular mechanics simulations for protein side-chain assignment and refinement. By reducing the side-chain size, a smooth energy landscape was obtained owing to the increased distances between the side chains. The side chains then gradually grow back during molecular dynamics simulations while adjusting to their surroundings, driven by the interaction energies. The method overcomes the barriers due to tight packing that limit conformational sampling of physics-based models. A key feature of this approach is that the resulting structures are free from steric collisions and allow the application of all-atom models in the subsequent refinement. Tests on a small set of proteins showed nearly 100% accuracy on both chi1 and chi2 of buried residues and 94% of them were within 20 degrees from the native conformation, 79% were within 10 degrees and 42% were within 5 degrees. However, the accuracy decreased when exposed side chains were involved. Further improvement and application of the method and the possible reasons that affect the accuracy on the exposed side chains are discussed.
Affiliation(s)
- Wei Zhang
- Department of Chemistry and Biochemistry, University of Delaware, Newark, DE 19716, USA
15
Peterson RW, Dutton PL, Wand AJ. Improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library. Protein Sci 2004; 13:735-51. [PMID: 14978310 PMCID: PMC2286725 DOI: 10.1110/ps.03250104] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.6] [Indexed: 10/26/2022]
Abstract
Accurate prediction of the placement and conformations of protein side chains given only the backbone trace has a wide range of uses in protein design, structure prediction, and functional analysis. Prediction has most often relied on discrete rotamer libraries so that rapid fitness of side-chain rotamers can be assessed against some scoring function. Scoring functions are generally based on experimental parameters from small-molecule studies or empirical parameters based on determined protein structures. Here, we describe the NCN algorithm for predicting the placement of side chains. A predominantly first-principles approach was taken to develop the potential energy function incorporating van der Waals and electrostatics based on the OPLS parameters, and a hydrogen bonding term. The only empirical knowledge used is the frequency of rotameric states from the PDB. The rotamer library includes nearly 50,000 rotamers, and is the most extensive discrete library used to date. Although the computational time tends to be longer than most other algorithms, the overall accuracy exceeds all algorithms in the literature when placing rotamers on an accurate backbone trace. Considering only the most buried residues, 80% of the total residues tested, the placement accuracy reaches 92% for chi(1), and 83% for chi(1 + 2), and an overall RMS deviation of 1 Å. Additionally, we show that if information is available to restrict chi(1) to one rotamer well, then this algorithm can generate structures with an average RMS deviation of 1.0 Å for all heavy side-chain atoms and a corresponding overall chi(1 + 2) accuracy of 85.0%.
Affiliation(s)
- Ronald W Peterson
- The Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA 19104, USA
16
Feig M, Karanicolas J, Brooks CL. MMTSB Tool Set: enhanced sampling and multiscale modeling methods for applications in structural biology. J Mol Graph Model 2004; 22:377-95. [PMID: 15099834 DOI: 10.1016/j.jmgm.2003.12.005] [Citation(s) in RCA: 720] [Impact Index Per Article: 36.0] [Indexed: 11/30/2022]
Abstract
We describe the Multiscale Modeling Tools for Structural Biology (MMTSB) Tool Set (https://mmtsb.scripps.edu/software/mmtsbToolSet.html), which is a novel set of utilities and programming libraries that provide new enhanced sampling and multiscale modeling techniques for the simulation of proteins and nucleic acids. The tool set interfaces with the existing molecular modeling packages CHARMM and Amber for classical all-atom simulations, and with MONSSTER for lattice-based low-resolution conformational sampling. In addition, it adds new functionality for the integration and translation between both levels of detail. The replica exchange method is implemented to allow enhanced sampling of both the all-atom and low-resolution models. The tool set aims at applications in structural biology that involve protein or nucleic acid structure prediction, refinement, and/or extended conformational sampling. With structure prediction applications in mind, the tool set also implements a facility that allows the control and application of modeling tasks on a large set of conformations in what we have termed ensemble computing. Ensemble computing encompasses loosely coupled, parallel computation on high-end parallel computers, clustered computational grids and desktop grid environments. This paper describes the design and implementation of the MMTSB Tool Set and illustrates its utility with three typical examples--scoring of a set of predicted protein conformations in order to identify the most native-like structures, ab initio folding of peptides in implicit solvent with the replica exchange method, and the prediction of a missing fragment in a larger protein structure.
Affiliation(s)
- Michael Feig
- Department of Molecular Biology, TPC6, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
17
Eyal E, Najmanovich R, McConkey BJ, Edelman M, Sobolev V. Importance of solvent accessibility and contact surfaces in modeling side-chain conformations in proteins. J Comput Chem 2004; 25:712-24. [PMID: 14978714 DOI: 10.1002/jcc.10420] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.1] [Indexed: 11/09/2022]
Abstract
Contact surface area and chemical properties of atoms are used to concurrently predict conformations of multiple amino acid side chains on a fixed protein backbone. The combination of surface complementarity and solvent-accessible surface accounts for van der Waals forces and solvation free energy. The scoring function is particularly suitable for modeling partially buried side chains. Both iterative and stochastic searching approaches are used. Our programs (Sccomp-I and Sccomp-S), with relatively fast execution times, correctly predict chi1 angles for 92-93% of buried residues and 82-84% for all residues, with an RMSD of approximately 1.7 Å for side chain heavy atoms. We find that the differential between the atomic solvation parameters and the contact surface parameters (including those between noncomplementary atoms) is positive; i.e., most protein atoms prefer surface contact with other protein atoms rather than with the solvent. This might correspond to the driving force for maximizing packing of the protein. The influence of the crystal packing, completeness of rotamer library and precise positioning of Cbeta atoms on the accuracy of side-chain prediction are examined. The Sccomp-S and Sccomp-I programs can be accessed through the Web (http://sgedg.weizmann.ac.il/sccomp.html) and are available for several platforms.
Affiliation(s)
- Eran Eyal
- Department of Plant Sciences, Weizmann Institute of Science, 76100, Rehovot, Israel
18
Kaźmierkiewicz R, Liwo A, Scheraga HA. Addition of side chains to a known backbone with defined side-chain centroids. Biophys Chem 2003; 100:261-80. [PMID: 12646370 DOI: 10.1016/s0301-4622(02)00285-5] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.5] [Indexed: 11/22/2022]
Abstract
An automatic procedure is proposed for adding side chains to a protein backbone; it is based on optimization of a simplified energy function for peptide side chains, given its backbone and positions of side-chain centroids. The energy is expressed as a sum of the energies of interaction between side chains, and a harmonic penalty function accounting for the preservation of the positions of the C(alpha) atoms and the side-chain centroids. The energy of side-chain interactions is calculated with the soft-sphere ECEPP/3 potential. A Monte Carlo search is carried out to explore all possible side-chain orientations within a fixed backbone and side-chain centroid positions. The initial, usually extended, side-chain conformations are taken directly from the ECEPP/3 database. The procedure was tested on six experimental (X-ray or NMR) structures: immunoglobulin binding protein (PDB code 1IGD, an alpha+beta-protein); transcription factor PML (PDB code 1BOR, a 49-104 fragment of the ring finger domain, predominantly beta-protein); bovine pancreatic trypsin inhibitor (crystal form II) (PDB code 1BPI, an alpha+beta-protein); the monomer of human deoxyhemoglobin (PDB code 1BZ0, an alpha-helical structure); chain A of alcohol dehydrogenase from Drosophila lebanonensis (PDB code 1A4U); as well as on the 10-55 portion of the B domain of staphylococcal protein A (PDB code 1BDD). In all cases except 1BPI, the data for the algorithm (i.e. the backbone or C(alpha) coordinates and the positions of side-chain centroids) were taken from the experimental structures. For protein A, the C(alpha) coordinates and positions of side-chain centroids were also taken from the 1.9-A-resolution model predicted by the UNRES force field. 
In all comparisons with experimental structures, complete side-chain geometry was reconstructed with a root-mean-square (RMS) deviation of approximately 0.6-0.9 A from the heavy atoms when complete backbone and side-chain-centroid coordinates were used in reconstruction, or approximately 1.0 A when the C(alpha) and centroid coordinates were used.
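The heavy-atom RMS deviation quoted in this abstract can be illustrated with a minimal sketch. It assumes that the two coordinate sets are already superimposed and list the same atoms in the same order; the function name `rmsd` is illustrative and does not come from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between matching atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

In practice the lists would hold the heavy-atom coordinates of the reconstructed and experimental structures.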
Affiliation(s)
- Rajmund Kaźmierkiewicz
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
|
19
|
Desmet J, Spriet J, Lasters I. Fast and accurate side-chain topology and energy refinement (FASTER) as a new method for protein structure optimization. Proteins 2002; 48:31-43. [PMID: 12012335 DOI: 10.1002/prot.10131] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.1] [Indexed: 11/06/2022]
Abstract
We have developed an original method for global optimization of protein side-chain conformations, called the Fast and Accurate Side-Chain Topology and Energy Refinement (FASTER) method. The method operates by systematically overcoming local minima of increasing order. Comparison of the FASTER results with those of the dead-end elimination (DEE) algorithm showed that both methods produce nearly identical results, but the FASTER algorithm is 100-1000 times faster than the DEE method and scales in a stable and favorable way as a function of protein size. We also show that low-order local minima may be almost as accurate as the global minimum when evaluated against experimentally determined structures. In addition, the new algorithm provides significant information about the conformational flexibility of individual side-chains. We observed that strictly rigid side-chains are concentrated mainly in the core of the protein, whereas highly flexible side-chains are found almost exclusively among solvent-oriented residues.
|
20
|
Abstract
Modeling side-chain conformations on a fixed protein backbone has a wide application in structure prediction and molecular design. Each effort in this field requires decisions about a rotamer set, scoring function, and search strategy. We have developed a new and simple scoring function, which operates on side-chain rotamers and consists of the following energy terms: contact surface, volume overlap, backbone dependency, electrostatic interactions, and desolvation energy. The weights of these energy terms were optimized to achieve the minimal average root mean square (rms) deviation between the lowest energy rotamer and real side-chain conformation on a training set of high-resolution protein structures. In the course of optimization, for every residue, its side chain was replaced by varying rotamers, whereas conformations for all other residues were kept as they appeared in the crystal structure. We obtained prediction accuracy of 90.4% for chi(1), 78.3% for chi(1 + 2), and 1.18 A overall rms deviation. Furthermore, the derived scoring function combined with a Monte Carlo search algorithm was used to place all side chains onto a protein backbone simultaneously. The average prediction accuracy was 87.9% for chi(1), 73.2% for chi(1 + 2), and 1.34 A rms deviation for 30 protein structures. Our approach was compared with available side-chain construction methods and showed improvement over the best among them: 4.4% for chi(1), 4.7% for chi(1 + 2), and 0.21 A for rms deviation. We hypothesize that the scoring function instead of the search strategy is the main obstacle in side-chain modeling. Additionally, we show that a more detailed rotamer library is expected to increase chi(1 + 2) prediction accuracy but may have little effect on chi(1) prediction accuracy.
Affiliation(s)
- Shide Liang
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
|
21
|
Glick M, Rayan A, Goldblum A. A stochastic algorithm for global optimization and for best populations: a test case of side chains in proteins. Proc Natl Acad Sci U S A 2002; 99:703-8. [PMID: 11792838 PMCID: PMC117369 DOI: 10.1073/pnas.022418199] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.9] [Indexed: 11/18/2022] Open
Abstract
The problem of global optimization is pivotal in a variety of scientific fields. Here, we present a robust stochastic search method that is able to find the global minimum for a given cost function, as well as, in most cases, any number of best solutions for very large combinatorial "explosive" systems. The algorithm iteratively eliminates variable values that contribute consistently to the highest end of a cost function's spectrum of values for the full system. Values that have not been eliminated are retained for a full, exhaustive search, allowing the creation of an ordered population of best solutions, which includes the global minimum. We demonstrate the ability of the algorithm to explore the conformational space of side chains in eight proteins, with 54 to 263 residues, to reproduce a population of their low energy conformations. The 1,000 lowest energy solutions are identical in the stochastic (with two different seed numbers) and full, exhaustive searches for six of eight proteins. The others retain the lowest 141 and 213 (of 1,000) conformations, depending on the seed number, and the maximal difference between stochastic and exhaustive is only about 0.15 kcal/mol. The energy gap between the lowest and highest of the 1,000 low-energy conformers in eight proteins is between 0.55 and 3.64 kcal/mol. This algorithm offers real opportunities for solving problems of high complexity in structural biology and in other fields of science and technology.
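The eliminate-then-enumerate idea described in this abstract can be sketched on a toy problem. This is a loose illustration and not the authors' algorithm: the elimination criterion (remove the value most over-represented among the worst random samples relative to the best ones), the function name `stochastic_elimination`, and all parameter names and defaults are assumptions made for the example.

```python
import heapq
import itertools
import math
import random


def stochastic_elimination(domains, cost, n_samples=400, tail=0.2,
                           max_states=1000, top_k=10, seed=0):
    """Prune variable values by random sampling, then search the remainder exhaustively.

    domains: one list of allowed values per variable (e.g. rotamers per residue).
    cost:    function mapping a tuple of values to a number (lower is better).
    Returns up to top_k (cost, assignment) pairs sorted by increasing cost.
    """
    rng = random.Random(seed)
    domains = [list(d) for d in domains]
    n_tail = max(1, int(tail * n_samples))

    # Elimination phase: shrink the search space until it can be enumerated.
    while math.prod(len(d) for d in domains) > max_states:
        samples = sorted(
            (tuple(rng.choice(d) for d in domains) for _ in range(n_samples)),
            key=cost,
        )
        best, worst = samples[:n_tail], samples[-n_tail:]

        def excess(pair):
            # How much more often a value shows up in the worst tail than in the best.
            i, v = pair
            return sum(s[i] == v for s in worst) - sum(s[i] == v for s in best)

        candidates = [(i, v) for i, d in enumerate(domains) if len(d) > 1 for v in d]
        if not candidates:
            break  # every variable is down to a single value
        i, v = max(candidates, key=excess)
        domains[i].remove(v)

    # Exhaustive phase: rank every remaining combination.
    ranked = heapq.nsmallest(top_k, itertools.product(*domains), key=cost)
    return [(cost(a), a) for a in ranked]
```

Because the pruning is heuristic, this sketch does not guarantee that the global minimum survives elimination; comparing against a brute-force search on small systems, as the abstract reports doing, is the natural check.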
Affiliation(s)
- Meir Glick
- Department of Medicinal Chemistry and the David R. Bloom Center for Pharmacy, School of Pharmacy, Hebrew University of Jerusalem, Jerusalem 91120, Israel
|
22
|
Affiliation(s)
- J G Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
|
23
|
Mendes J, Nagarajaram HA, Soares CM, Blundell TL, Carrondo MA. Incorporating knowledge-based biases into an energy-based side-chain modeling method: application to comparative modeling of protein structure. Biopolymers 2001; 59:72-86. [PMID: 11373721 DOI: 10.1002/1097-0282(200108)59:2<72::aid-bip1007>3.0.co;2-s] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.9] [Indexed: 11/06/2022]
Abstract
The performance of the self-consistent mean field theory (SCMFT) method for side-chain modeling, employing rotamer energies calculated with the flexible rotamer model (FRM), is evaluated in the context of comparative modeling of protein structure. Predictions were carried out on a test set of 56 model backbones of varying accuracy, to allow side-chain prediction accuracy to be analyzed as a function of backbone accuracy. A progressive decrease in the accuracy of prediction was observed as backbone accuracy decreased. However, even for very low backbone accuracy, prediction was substantially higher than random, indicating that the FRM can, in part, compensate for the errors in the modeled tertiary environment. It was also investigated whether the introduction in the FRM-SCMFT method of knowledge-based biases, derived from a backbone-dependent rotamer library, could enhance its performance. A bias derived from the backbone-dependent rotamer conformations alone did not improve prediction accuracy. However, a bias derived from the backbone-dependent rotamer probabilities improved prediction accuracy considerably. This bias was incorporated through two different strategies. In one (the indirect strategy), rotamer probabilities were used to reject unlikely rotamers a priori, thus restricting prediction by FRM-SCMFT to a subset containing only the most probable rotamers in the library. In the other (the direct strategy), rotamer energies were transformed into pseudo-energies that were added to the average potential energies of the respective rotamers, thereby creating hybrid energy-based/knowledge-based average rotamer energies, which were used by the FRM-SCMFT method for prediction. For all degrees of backbone accuracy, an optimal strength of the knowledge-based bias existed for both strategies for which predictions were more accurate than pure energy-based predictions, and also than pure knowledge-based predictions. 
Hybrid knowledge-based/energy-based methods were obtained from both strategies and compared with the SCWRL method, a hybrid method based on the same backbone-dependent rotamer library. The accuracy of the indirect method was approximately the same as that of the SCWRL method, but that of the direct method was significantly higher.
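The "direct strategy" in this abstract adds a knowledge-based pseudo-energy to each rotamer's average energy. The abstract does not give the transformation, so the sketch below assumes a Boltzmann-style inversion, -kT ln p, scaled by a bias strength; the function name `hybrid_energies`, the `weight` and `floor` parameters, and the kT value (about 0.593 kcal/mol near room temperature) are illustrative assumptions.

```python
import math


def hybrid_energies(avg_energies, probabilities, weight=1.0, kT=0.593, floor=1e-6):
    """Combine average rotamer energies with probability-derived pseudo-energies.

    avg_energies:  energy-based average energy of each rotamer (kcal/mol).
    probabilities: library probability of each rotamer, same order.
    weight:        strength of the knowledge-based bias (0 gives pure energy-based values).
    floor:         lower bound on probabilities so that log() stays finite.
    """
    if len(avg_energies) != len(probabilities):
        raise ValueError("need one probability per rotamer energy")
    hybrid = []
    for energy, p in zip(avg_energies, probabilities):
        # Assumed transformation: rarer rotamers receive a larger penalty.
        pseudo = -kT * math.log(max(p, floor))
        hybrid.append(energy + weight * pseudo)
    return hybrid
```

Scanning `weight` from zero upward mirrors the abstract's observation that an optimal bias strength lies between the pure energy-based and pure knowledge-based limits.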
Affiliation(s)
- J Mendes
- Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Apartado 127, Av. da República, 2781-901, Oeiras, Portugal
|
24
|
Abstract
Current techniques for the prediction of side-chain conformations on a fixed backbone have an accuracy limit of about 1.0-1.5 A rmsd for core residues. We have carried out a detailed and systematic analysis of the factors that influence the prediction of side-chain conformation and, on this basis, have succeeded in extending the limits of side-chain prediction for core residues to about 0.7 A rmsd from native, and 94% and 89% of chi(1) and chi(1+2) dihedral angles correctly predicted to within 20 degrees of native, respectively. These results are obtained using a force-field that accounts for only van der Waals interactions and torsional potentials. Prediction accuracy is strongly dependent on the rotamer library used. That is, a complete and detailed rotamer library is essential. The greatest accuracy was obtained with an extensive rotamer library, containing over 7560 members, in which bond lengths and bond angles were taken from the database rather than simply assuming idealized values. Perhaps the most surprising finding is that the combinatorial problem normally associated with the prediction of the side-chain conformation does not appear to be important. This conclusion is based on the fact that the prediction of the conformation of a single side-chain with all others fixed in their native conformations is only slightly more accurate than the simultaneous prediction of all side-chain dihedral angles.
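The chi(1) and chi(1+2) accuracy measure used in this abstract (angles within 20 degrees of native) can be sketched as follows. The function names are illustrative, and the sketch makes simplifying assumptions: angles are in degrees, chi(1+2) is counted only for residues that have a chi(2), and symmetry-equivalent chi(2) values (for example in Phe, Tyr, or Asp) are not treated specially.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, native, tolerance=20.0):
    """Fraction of residues with chi1, and with both chi1 and chi2, within tolerance.

    predicted, native: per-residue tuples of chi angles, e.g. (chi1, chi2);
    residues without side-chain dihedrals are given as empty tuples.
    Returns (chi1_fraction, chi12_fraction); a fraction is None if no residue qualifies.
    """
    chi1_ok = chi1_n = chi12_ok = chi12_n = 0
    for pred, nat in zip(predicted, native):
        if len(pred) != len(nat):
            raise ValueError("predicted and native residues must have the same chi count")
        if not nat:
            continue  # e.g. Gly or Ala: nothing to score
        ok1 = angular_difference(pred[0], nat[0]) <= tolerance
        chi1_n += 1
        chi1_ok += ok1
        if len(nat) >= 2:
            ok2 = angular_difference(pred[1], nat[1]) <= tolerance
            chi12_n += 1
            chi12_ok += ok1 and ok2
    chi1_frac = chi1_ok / chi1_n if chi1_n else None
    chi12_frac = chi12_ok / chi12_n if chi12_n else None
    return chi1_frac, chi12_frac
```

The wrap-around in `angular_difference` matters because dihedral angles such as 175 and -170 degrees are only 15 degrees apart.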
Affiliation(s)
- Z Xiang
- Department of Biochemistry and Molecular Biophysics BB221, Columbia University, New York, NY 10032, USA
|
25
|
Martí-Renom MA, Stuart AC, Fiser A, Sánchez R, Melo F, Sali A. Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 2001; 29:291-325. [PMID: 10940251 DOI: 10.1146/annurev.biophys.29.1.291] [Citation(s) in RCA: 2337] [Impact Index Per Article: 101.6] [Indexed: 11/09/2022]
Abstract
Comparative modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. The number of protein sequences that can be modeled and the accuracy of the predictions are increasing steadily because of the growth in the number of known protein structures and because of the improvements in the modeling software. Further advances are necessary in recognizing weak sequence-structure similarities, aligning sequences with structures, modeling of rigid body shifts, distortions, loops and side chains, as well as detecting errors in a model. Despite these problems, it is currently possible to model with useful accuracy significant parts of approximately one third of all known protein sequences. The use of individual comparative models in biology is already rewarding and increasingly widespread. A major new challenge for comparative modeling is the integration of it with the torrents of data from genome sequencing projects as well as from functional and structural genomics. In particular, there is a need to develop an automated, rapid, robust, sensitive, and accurate comparative modeling pipeline applicable to whole genomes. Such large-scale modeling is likely to encourage new kinds of applications for the many resulting models, based on their large number and completeness at the level of the family, organism, or functional network.
Affiliation(s)
- M A Martí-Renom
- Laboratories of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, Rockefeller University, New York, NY 10021, USA
|
26
|
Kono H, Saven JG. Statistical theory for protein combinatorial libraries. Packing interactions, backbone flexibility, and the sequence variability of a main-chain structure. J Mol Biol 2001; 306:607-28. [PMID: 11178917 DOI: 10.1006/jmbi.2000.4422] [Citation(s) in RCA: 107] [Impact Index Per Article: 4.7] [Indexed: 11/22/2022]
Abstract
Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiments. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.
Affiliation(s)
- H Kono
- Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104, USA
|
27
|
Abstract
The prediction of the three-dimensional structures of the native states of proteins from the sequences of their amino acids is one of the most important challenges in molecular biology. An essential task for solving this problem within coarse-grained models is the deduction of effective interaction potentials between the amino acids. Over the years, several techniques have been developed to extract potentials that are able to discriminate satisfactorily between the native and nonnative folds of a preassigned protein sequence. In general, when these potentials are used in actual dynamical folding simulations, they lead to a drift of the native structure outside the quasinative basin. In this article, we present and validate an approach to overcome this difficulty. By exploiting several numerical and analytical tools, we set up a rigorous iterative scheme to extract potentials satisfying a prerequisite of any viable potential: the stabilization of proteins within their native basin (less than 3-4 A RMSD). The scheme is flexible and is demonstrated to be applicable to a variety of parameterizations of the energy function, and it provides in each case the optimal potentials.
Affiliation(s)
- C Micheletti
- International School for Advanced Studies and INFM, Trieste, Italy.
|
28
|
Abstract
The prediction of protein structure, based primarily on sequence and structure homology, has become an increasingly important activity. Homology models have become more accurate and their range of applicability has increased. Progress has come, in part, from the flood of sequence and structure information that has appeared over the past few years, and also from improvements in analysis tools. These include profile methods for sequence searches, the use of three-dimensional structure information in sequence alignment and new homology modeling tools, specifically in the prediction of loop and side-chain conformations. There have also been important advances in understanding the physical chemical basis of protein stability and the corresponding use of physical chemical potential functions to identify correctly folded from incorrectly folded protein conformations.
Affiliation(s)
- B Al-Lazikani
- Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Columbia University, 630 West 168th Street, New York, NY 10032, USA
|
29
|
Rossi A, Micheletti C, Seno F, Maritan A. A self-consistent knowledge-based approach to protein design. Biophys J 2001; 80:480-90. [PMID: 11159418 PMCID: PMC1301249 DOI: 10.1016/s0006-3495(01)76030-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022] Open
Abstract
A simple and very efficient protein design strategy is proposed by developing some recently introduced theoretical tools which have been successfully applied to exactly solvable protein models. The design approach is implemented by using three amino acid classes and it is based on the minimization of an appropriate energy function. For a given native state the results of the design procedure are compared, through a statistical analysis, with the properties of an ensemble of sequences folding in the same conformation. If the success rate is computed on those sites designed with high confidence, it can be as high as 80%. The method is also able to identify key sites for the folding process: results for 2ci2 and barnase are in very good agreement with experimental results.
Affiliation(s)
- A Rossi
- International School for Advanced Studies and INFM, I-34014 Trieste, Italy.
|
30
|
Micheletti C, Seno F, Maritan A. Recurrent oligomers in proteins: an optimal scheme reconciling accurate and concise backbone representations in automated folding and design studies. Proteins 2000; 40:662-74. [PMID: 10899788 DOI: 10.1002/1097-0134(20000901)40:4<662::aid-prot90>3.0.co;2-f] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.5] [Indexed: 11/10/2022]
Abstract
A novel scheme is introduced to capture the spatial correlations of consecutive amino acids in naturally occurring proteins. This knowledge-based strategy is able to carry out optimally automated subdivisions of protein fragments into classes of similarity. The goal is to provide the minimal set of protein oligomers (termed "oligons" for brevity) that is able to represent any other fragment. At variance with previous studies in which recurrent local motifs were classified, our concern is to provide simplified protein representations that have been optimised for use in automated folding and/or design attempts. In such contexts, it is paramount to limit the number of degrees of freedom per amino acid without incurring loss of accuracy of structural representations. The suggested method finds, by construction, the optimal compromise between these needs. Several possible oligon lengths are considered. It is shown that meaningful classifications cannot be done for lengths greater than six or smaller than four. Different contexts are considered for which oligons of length five or six are recommendable. With only a few dozen oligons of such length, virtually any protein can be reproduced within typical experimental uncertainties. Structural data for the oligons are made publicly available.
Affiliation(s)
- C Micheletti
- International School for Advanced Studies and INFM, and the Abdus Salam International Centre for Theoretical Physics, Trieste, Italy.
|
31
|
Abstract
Ligand binding may involve a wide range of structural changes in the receptor protein, from hinge movement of entire domains to small side-chain rearrangements in the binding pocket residues. The analysis of side chain flexibility gives insights valuable to improve docking algorithms and can provide an index of amino-acid side-chain flexibility potentially useful in molecular biology and protein engineering studies. In this study we analyzed side-chain rearrangements upon ligand binding. We constructed two non-redundant databases (980 and 353 entries) of "paired" protein structures in complexed (holo-protein) and uncomplexed (apo-protein) forms from the PDB macromolecular structural database. The number and identity of binding pocket residues that undergo side-chain conformational changes were determined. We show that, in general, only a small number of residues in the pocket undergo such changes (e.g., approximately 85% of cases show changes in three residues or fewer). The flexibility scale has the following order: Lys > Arg, Gln, Met > Glu, Ile, Leu > Asn, Thr, Val, Tyr, Ser, His, Asp > Cys, Trp, Phe; thus, Lys side chains in binding pockets flex 25 times more often than do the Phe side chains. Normalizing for the number of flexible dihedral bonds in each amino acid attenuates the scale somewhat; however, the clear trend of large, polar amino acids being more flexible in the pocket than aromatic ones remains. We found no correlation between backbone movement of a residue upon ligand binding and the flexibility of its side chain. These results are relevant to (1) reduction of search space in docking algorithms by inclusion of side-chain flexibility for a limited number of binding pocket residues; and (2) utilization of the amino acid flexibility scale in protein engineering studies to alter the flexibility of binding pockets.
Affiliation(s)
- R Najmanovich
- Plant Sciences Department, Weizmann Institute of Science, Rehovot, Israel.
|
32
|
Lemak AS, Gunn JR. Rotamer-Specific Potentials of Mean Force for Residue Pair Interactions. J Phys Chem B 2000. [DOI: 10.1021/jp9919157] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Affiliation(s)
- Alexandre S. Lemak
- Département de Chimie, Centre de Recherche en Calcul Appliqué, and Protein Engineering Network of Centers of Excellence, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec H3C 3J7, Canada
- John R. Gunn
- Département de Chimie, Centre de Recherche en Calcul Appliqué, and Protein Engineering Network of Centers of Excellence, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec H3C 3J7, Canada
|
33
|
Huang ES, Samudrala R, Ponder JW. Distance geometry generates native-like folds for small helical proteins using the consensus distances of predicted protein structures. Protein Sci 1998; 7:1998-2003. [PMID: 9761481 PMCID: PMC2144160 DOI: 10.1002/pro.5560070916] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]
Abstract
For successful ab initio protein structure prediction, a method is needed to identify native-like structures from a set containing both native and non-native protein-like conformations. In this regard, the use of distance geometry has shown promise when accurate inter-residue distances are available. We describe a method by which distance geometry restraints are culled from sets of 500 protein-like conformations for four small helical proteins generated by the method of Simons et al. (1997). A consensus-based approach was applied in which every inter-Calpha distance was measured, and the most frequently occurring distances were used as input restraints for distance geometry. For each protein, a structure with lower coordinate root-mean-square (RMS) error than the mean of the original set was constructed; in three cases the topology of the fold resembled that of the native protein. When the fold sets were filtered for the best scoring conformations with respect to an all-atom knowledge-based scoring function, the remaining subset of 50 structures yielded restraints of higher accuracy. A second round of distance geometry using these restraints resulted in an average coordinate RMS error of 4.38 A.
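The consensus step in this abstract (take the most frequently occurring distance for every inter-Calpha pair across a set of conformations) can be sketched as a histogram mode. The abstract does not say how distances were binned, so the 1.0 A bin width, the use of the bin centre as the consensus value, and the function name `consensus_distances` are assumptions made for illustration.

```python
import math
from collections import Counter


def consensus_distances(conformations, bin_width=1.0):
    """Most frequent inter-atom distance for every pair across a set of conformations.

    conformations: list of conformations, each a list of (x, y, z) Calpha
                   coordinates with the same residue order and length.
    Returns {(i, j): distance} with i < j, where the distance is the centre of
    the most populated histogram bin for that pair.
    """
    n_atoms = len(conformations[0])
    if any(len(c) != n_atoms for c in conformations):
        raise ValueError("all conformations must have the same number of atoms")
    result = {}
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            # Histogram the pair distance over all conformations.
            bins = Counter(
                int(math.dist(c[i], c[j]) // bin_width) for c in conformations
            )
            top_bin, _count = bins.most_common(1)[0]
            result[(i, j)] = (top_bin + 0.5) * bin_width
    return result
```

The resulting pair distances would then serve as input restraints for a distance geometry program, which is outside the scope of this sketch.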
Affiliation(s)
- E S Huang
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, Saint Louis, Missouri 63110, USA
|