1
Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J 2019; 18:162-176. [PMID: 31969975 PMCID: PMC6961067 DOI: 10.1016/j.csbj.2019.12.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 01/02/2023] Open
Abstract
Three-dimensional protein structures, whether determined experimentally or theoretically, are often of too low a resolution. In this mini-review, we outline the computational methods for reconstructing all-atom protein models from incomplete coarse-grained ones. Typical reconstruction schemes can be divided into four major steps. Usually, the first step is reconstruction of the protein backbone chain starting from the C-alpha trace. This is followed by side-chain rebuilding based on the protein backbone geometry. Subsequently, hydrogen atoms can be reconstructed. Finally, the resulting all-atom models may require structure optimization. Many methods are available to perform each of these tasks. We discuss the available tools and their potential applications in integrative modeling pipelines that can transfer coarse-grained information from computational predictions, or experiment, to all-atom structures.
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
2
Xu G, Ma T, Du J, Wang Q, Ma J. OPUS-Rota2: An Improved Fast and Accurate Side-Chain Modeling Method. J Chem Theory Comput 2019; 15:5154-5160. [PMID: 31412199 DOI: 10.1021/acs.jctc.9b00309] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/20/2022]
Abstract
Side-chain modeling plays a critical role in protein structure prediction. However, in many current methods, balancing the speed and accuracy is still challenging. In this paper, on the basis of our previous work OPUS-Rota (Protein Sci. 2008, 17, 1576-1585), we introduce a new side-chain modeling method, OPUS-Rota2, which is tested on both a 65-protein test set (DB65) in the OPUS-Rota paper and a 379-protein test set (DB379) in the SCWRL4 paper. If the main chain is native, OPUS-Rota2 is more accurate than OPUS-Rota, SCWRL4, and OSCAR-star but slightly less accurate than OSCAR-o. Also, if the main chain is non-native, OPUS-Rota2 is more accurate than any other method. Moreover, OPUS-Rota2 is significantly faster than any other method, in particular, 2 orders of magnitude faster than OSCAR-o. Thus, the combination of higher accuracy and speed of OPUS-Rota2 in modeling side chains on both the native and non-native main chains makes OPUS-Rota2 a very useful tool in protein structure modeling.
Affiliation(s)
- Gang Xu
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200433, China; School of Life Sciences, Tsinghua University, Beijing 100084, China
- Junqing Du
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States
- Qinghua Wang
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States
- Jianpeng Ma
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200433, China; School of Life Sciences, Tsinghua University, Beijing 100084, China; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States; School of Life Sciences, Fudan University, Shanghai 200433, China
3
Colbes J, Corona RI, Lezcano C, Rodríguez D, Brizuela CA. Protein side-chain packing problem: is there still room for improvement? Brief Bioinform 2018; 18:1033-1043. [PMID: 27567382 DOI: 10.1093/bib/bbw079] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/23/2016] [Indexed: 11/12/2022] Open
Abstract
The protein side-chain packing problem (PSCPP) is an important subproblem of both protein structure prediction and protein design. During the past two decades, a large number of methods have been proposed to tackle this problem. These methods consist of three main components: a rotamer library, a scoring function and a search strategy. The average overall accuracy level obtained by these methods is approximately 87%. Whether a better accuracy level could be achieved remains to be answered. To address this question, we calculated the maximum accuracy level attainable using a simple rotamer library, independently of the energy function or the search method. Using 2883 different structures from the Protein Data Bank, we compared this accuracy level with the accuracy level of five state-of-the-art methods. These comparisons indicated that, for buried residues in the protein, we are already close to the best possible accuracy results. In addition, for exposed residues, we found that a significant gap exists between the possible improvement and the maximum accuracy level achievable with current methods. After determining that an improvement is possible, the next step is to understand what limitations are preventing us from obtaining such an improvement. Previous works on protein structure prediction and protein design have shown that scoring function inaccuracies may represent the main obstacle to achieving better results for these problems. To show that the same is true for the PSCPP, we evaluated the quality of two scoring functions used by some state-of-the-art algorithms. Our results indicate that neither of these scoring functions can guide the search method correctly, thereby reinforcing the idea that efforts to solve the PSCPP must also focus on developing better scoring functions.
4
Colbes J, Aguila SA, Brizuela CA. Scoring of Side-Chain Packings: An Analysis of Weight Factors and Molecular Dynamics Structures. J Chem Inf Model 2018; 58:443-452. [PMID: 29368924 DOI: 10.1021/acs.jcim.7b00679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The protein side-chain packing problem (PSCPP) is a central task in computational protein design. The problem is usually modeled as a combinatorial optimization problem, which consists of searching for a set of rotamers, from a given rotamer library, that minimizes a scoring function (SF). The SF is a weighted sum of terms that can be decomposed into physics-based and knowledge-based terms. Although there are many methods to obtain approximate solutions for this problem, all of them perform similarly and there has not been a significant improvement in recent years. Studies on protein structure prediction and protein design revealed the limitations of current SFs to achieve further improvements for these two problems. Along the same lines, a recent work reported a similar result for the PSCPP. In this work, we ask whether this negative result regarding further improvements in performance is due to (i) an incorrect weighting of the SF terms or (ii) the constrained conformation resulting from the protein crystallization process. To analyze these questions, we (i) modeled the PSCPP as a bi-objective combinatorial optimization problem, optimizing, at the same time, the two most important terms of two SFs of state-of-the-art algorithms and (ii) performed a preprocessing relaxation of the crystal structure through molecular dynamics to simulate the protein in the solvent and evaluated the performance of these two state-of-the-art SFs under these conditions. Our results indicate that (i) no matter what combination of weight factors we use, the current SFs will not lead to better performance and (ii) the evaluated SFs will not be able to improve performance on relaxed structures. Furthermore, the experiments revealed that the SFs and the methods are biased toward crystallized structures.
Affiliation(s)
- Jose Colbes
- Computer Science Department, CICESE Research Center, 22860 Ensenada, Mexico
- Sergio A Aguila
- Centro de Nanociencias y Nanotecnologia, Universidad Nacional Autonoma de Mexico, Km. 107 Carretera Tijuana-Ensenada, Ensenada, Baja California, Mexico, C.P. 22860
- Carlos A Brizuela
- Computer Science Department, CICESE Research Center, 22860 Ensenada, Mexico
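The combinatorial formulation described in the abstract of entry 4 (choose one rotamer per residue so that a weighted sum of score terms is minimal) can be illustrated with a small exhaustive search. This is only a sketch under stated assumptions: the function name `pack_side_chains`, the two weights, and the toy energy tables are hypothetical and are not taken from the paper, and real packers replace the brute-force loop with a far more efficient search strategy.

```python
from itertools import product


def pack_side_chains(self_energy, pair_energy, w_self=1.0, w_pair=1.0):
    """Exhaustively search rotamer assignments for the lowest weighted score.

    self_energy[i][r]          : score of rotamer r at residue i (hypothetical values)
    pair_energy[(i, j)][r][s]  : score between rotamer r at i and rotamer s at j
    w_self, w_pair             : weights of the two score terms (assumed names)
    """
    n = len(self_energy)
    best_score, best_assignment = float("inf"), None
    # Enumerate every combination of one rotamer index per residue.
    for assignment in product(*(range(len(e)) for e in self_energy)):
        score = w_self * sum(self_energy[i][assignment[i]] for i in range(n))
        score += w_pair * sum(
            table[assignment[i]][assignment[j]]
            for (i, j), table in pair_energy.items()
        )
        if score < best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score


# Toy problem: three residues with two rotamers each.
self_energy = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
pair_energy = {
    (0, 1): [[2.0, 0.0], [0.0, 0.0]],  # rotamers 0/0 clash
    (1, 2): [[0.0, 0.0], [0.0, 3.0]],  # rotamers 1/1 clash
}
assignment, score = pack_side_chains(self_energy, pair_energy)
```

The search cost grows exponentially with the number of residues, which is why the abstract speaks of approximate solutions; changing `w_self` and `w_pair` is a crude stand-in for the weight-factor analysis the authors describe.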
5
Shimizu M, Takada S. Reconstruction of Atomistic Structures from Coarse-Grained Models for Protein-DNA Complexes. J Chem Theory Comput 2018; 14:1682-1694. [PMID: 29397721 DOI: 10.1021/acs.jctc.7b00954] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
While coarse-grained (CG) simulations have widely been used to accelerate structure sampling of large biomolecular complexes, they are unavoidably less accurate and thus the reconstruction of all-atom (AA) structures and the subsequent refinement is desirable. In this study we developed an efficient method to reconstruct AA structures from sampled CG protein-DNA complex models, which attempts to model the protein-DNA interface accurately. First we developed a method to reconstruct atomic details of DNA structures from a three-site per nucleotide CG model, which uses a DNA fragment library. Next, for the protein-DNA interface, we referred to the side chain orientations in the known structure of the target interface when available. The other parts are modeled by existing tools. We confirmed the accuracy of the protocol in various aspects including the structure deviation in the self-reproduction, the base pair reproducibility, atomic contacts at the protein-DNA interface, and feasibility of the posterior AA simulations.
Affiliation(s)
- Masahiro Shimizu
- Department of Biophysics, Graduate School of Science, Kyoto University, Sakyo, Kyoto 606-8502 Japan
- Shoji Takada
- Department of Biophysics, Graduate School of Science, Kyoto University, Sakyo, Kyoto 606-8502 Japan
6
Incorporation of side chain flexibility into protein binding pockets using MTflex. Bioorg Med Chem 2016; 24:4978-4987. [DOI: 10.1016/j.bmc.2016.08.030] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/16/2016] [Revised: 08/16/2016] [Accepted: 08/18/2016] [Indexed: 01/15/2023]
7
Purvine E, Monson K, Jurrus E, Star K, Baker NA. Energy Minimization of Discrete Protein Titration State Models Using Graph Theory. J Phys Chem B 2016; 120:8354-60. [PMID: 27089174 DOI: 10.1021/acs.jpcb.6b02059] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
There are several applications in computational biophysics that require the optimization of discrete interacting states, for example, amino acid titration states, ligand oxidation states, or discrete rotamer angles. Such optimization can be very time-consuming as it scales exponentially in the number of sites to be optimized. In this paper, we describe a new polynomial time algorithm for optimization of discrete states in macromolecular systems. This algorithm was adapted from image processing and uses techniques from discrete mathematics and graph theory to restate the optimization problem in terms of "maximum flow-minimum cut" graph analysis. The interaction energy graph, a graph in which vertices (amino acids) and edges (interactions) are weighted with their respective energies, is transformed into a flow network in which the value of the minimum cut in the network equals the minimum free energy of the protein and the cut itself encodes the state that achieves the minimum free energy. Because of its deterministic nature and polynomial time performance, this algorithm has the potential to allow for the ionization state of larger proteins to be discovered.
Affiliation(s)
- Nathan A Baker
- Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912, United States
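The "maximum flow-minimum cut" idea in the abstract of entry 7 can be sketched for the simplest case: two-state sites with non-negative pairwise couplings, where the minimum cut of a flow network is known to give the exact energy minimum. This is an illustrative toy and not the authors' construction; the energy form, the function names `min_cut_states` and `energy`, and the example numbers are all assumptions. A plain Edmonds-Karp search is used for the maximum flow.

```python
from collections import deque


def energy(unary, coupling, x):
    """E(x) = sum_i unary[i][x_i] + sum_(i,j) coupling[(i, j)] * [x_i != x_j]."""
    e = sum(unary[i][x[i]] for i in range(len(x)))
    return e + sum(w for (i, j), w in coupling.items() if x[i] != x[j])


def min_cut_states(unary, coupling):
    """Minimize E(x) over binary states; couplings must be non-negative."""
    n = len(unary)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    offset = 0.0
    for i, (e0, e1) in enumerate(unary):
        base = min(e0, e1)      # shift so both capacities are non-negative
        offset += base
        cap[s][i] = e1 - base   # paid when site i ends on the sink side (state 1)
        cap[i][t] = e0 - base   # paid when site i stays on the source side (state 0)
    for (i, j), w in coupling.items():
        cap[i][j] += w          # paid when i and j end on different sides
        cap[j][i] += w
    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break               # no path left: the flow is maximal
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck
    # Sites still reachable from the source take state 0, the rest state 1.
    states = [0 if parent[i] != -1 else 1 for i in range(n)]
    return states, flow + offset


unary = [(0.0, 1.0), (2.0, 0.5), (1.0, 1.2)]   # hypothetical per-site energies
coupling = {(0, 1): 0.8, (1, 2): 0.3}          # hypothetical couplings
states, e_min = min_cut_states(unary, coupling)
```

The value of the cut (plus the constant shift) equals the minimum energy, and the partition of the graph encodes the state assignment, mirroring the statement in the abstract; handling general interaction energies requires the additional machinery described in the paper.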
8
Tak Kam VW, Goddard WA. Flat-Bottom Strategy for Improved Accuracy in Protein Side-Chain Placements. J Chem Theory Comput 2015; 4:2160-9. [PMID: 26620487 DOI: 10.1021/ct800196k] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 11/29/2022]
Abstract
We present a new strategy for protein side-chain placement that uses flat-bottom potentials for rotamer scoring. The extent of the flat bottom depends on the coarseness of the rotamer library and is optimized for libraries ranging from diversities of 0.2 Å to 5.0 Å. The parameters reported here were optimized for forcefields using Lennard-Jones 12-6 van der Waals potential with DREIDING parameters but are expected to be similar for AMBER, CHARMM, and other forcefields. This Side-Chain Rotamer Excitation Analysis Method is implemented in the SCREAM software package. Similar scoring function strategies should be useful for ligand docking, virtual ligand screening, and protein folding applications.
Affiliation(s)
- Victor Wai Tak Kam
- Materials and Process Simulation Center (MC-139-74), California Institute of Technology, Pasadena, California 91125
- William A Goddard
- Materials and Process Simulation Center (MC-139-74), California Institute of Technology, Pasadena, California 91125
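To make the flat-bottom idea in the abstract of entry 8 concrete, the sketch below modifies a Lennard-Jones 12-6 term so that contacts up to a distance `delta` shorter than the optimum are not penalized. The exact functional form is an assumption made for illustration, not the SCREAM implementation; `r0`, `eps`, and `delta` are hypothetical parameter names, with `delta` standing in for the library-dependent extent of the flat bottom.

```python
def lj_12_6(r, r0, eps):
    """Lennard-Jones 12-6 energy with its minimum of -eps at separation r0."""
    x = (r0 / r) ** 6
    return eps * (x * x - 2.0 * x)


def flat_bottom_lj(r, r0, eps, delta):
    """Illustrative flat-bottom variant of the 12-6 potential (assumed form)."""
    if r >= r0:
        return lj_12_6(r, r0, eps)      # attractive branch left unchanged
    if r >= r0 - delta:
        return -eps                     # flat bottom: small overlaps cost nothing
    return lj_12_6(r + delta, r0, eps)  # repulsive wall shifted inward by delta


# A contact 0.3 Å too short is strongly penalized by the plain potential
# but sits on the flat bottom when delta = 0.5 Å.
plain = lj_12_6(3.7, r0=4.0, eps=0.1)
softened = flat_bottom_lj(3.7, r0=4.0, eps=0.1, delta=0.5)
```

The motivation is that a discrete rotamer rarely matches the true side-chain geometry exactly, so a coarser library calls for a wider flat region; setting `delta` to zero recovers the ordinary 12-6 potential.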
9
Longo LM, Blaber M. Symmetric protein architecture in protein design: top-down symmetric deconstruction. Methods Mol Biol 2014; 1216:161-182. [PMID: 25213415 DOI: 10.1007/978-1-4939-1486-9_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
Top-down symmetric deconstruction (TDSD) is a joint experimental and computational approach to generate a highly stable, functionally benign protein scaffold for intended application in subsequent functional design studies. By focusing on symmetric protein folds, TDSD can leverage the dramatic reduction in sequence space achieved by applying a primary structure symmetric constraint to the design process. Fundamentally, TDSD is an iterative symmetrization process, in which the goal is to maintain or improve properties of thermodynamic stability and folding cooperativity inherent to a starting sequence (the "proxy"). As such, TDSD does not attempt to solve the inverse protein folding problem directly, which is computationally intractable. The present chapter will take the reader through all of the primary steps of TDSD (selecting a proxy, identifying potential mutations, establishing a stability/folding cooperativity screen), relying heavily on a successful TDSD solution for the common β-trefoil fold.
Affiliation(s)
- Liam M Longo
- Department of Biomedical Sciences, College of Medicine, Florida State University, 1115 West Call Street, Tallahassee, FL, 32306-4300, USA
10
Incorporation of noncanonical amino acids into Rosetta and use in computational protein-peptide interface design. PLoS One 2012; 7:e32637. [PMID: 22431978 PMCID: PMC3303795 DOI: 10.1371/journal.pone.0032637] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.8] [Received: 11/08/2011] [Accepted: 01/28/2012] [Indexed: 11/19/2022] Open
Abstract
Noncanonical amino acids (NCAAs) can be used in a variety of protein design contexts. For example, they can be used in place of the canonical amino acids (CAAs) to improve the biophysical properties of peptides that target protein interfaces. We describe the incorporation of 114 NCAAs into the protein-modeling suite Rosetta. We describe our methods for building backbone-dependent rotamer libraries and the parameterization and construction of a scoring function that can be used to score NCAA-containing peptides and proteins. We validate these additions to Rosetta and our NCAA rotamer libraries by showing that we can improve the binding of a calpastatin-derived peptide to calpain-1 by substituting NCAAs for native amino acids using Rosetta. Rosetta (executables and source), auxiliary scripts and code, and documentation can be found at http://www.rosettacommons.org/.
11
Abstract
MOTIVATION: Modeling of side-chain conformations is indispensable in protein structure modeling, protein-protein docking and protein design. Thanks to intensive attention to this field, many of the existing programs achieve reasonably good and comparable prediction accuracy. Moreover, in our previous work on CIS-RR, we argued that predictions with few atomic clashes can complement the existing methods for subsequent analysis and refinement of protein structures. However, these recent efforts to enhance the quality of predicted side chains have been accompanied by a significant increase in computational cost.
RESULTS: In this study, focusing mainly on improving the speed of side-chain conformation prediction, we present a RApid Side-chain Predictor, called RASP. To achieve a much faster speed with accuracy comparable to the best existing methods, we not only employ the clash elimination strategy of CIS-RR, but also carefully optimize energy terms and integrate different search algorithms. In comprehensive benchmark tests, RASP is over one order of magnitude faster than the recently developed methods (about 40 times faster than CIS-RR), while achieving comparable or even better accuracy.
Affiliation(s)
- Zhichao Miao
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
12
Samish I, MacDermaid CM, Perez-Aguilar JM, Saven JG. Theoretical and Computational Protein Design. Annu Rev Phys Chem 2011; 62:129-49. [DOI: 10.1146/annurev-physchem-032210-103509] [Citation(s) in RCA: 119] [Impact Index Per Article: 9.2] [Indexed: 11/09/2022]
Affiliation(s)
- Jeffery G. Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104
13
Cao Y, Song L, Miao Z, Hu Y, Tian L, Jiang T. Improved side-chain modeling by coupling clash-detection guided iterative search with rotamer relaxation. Bioinformatics 2011; 27:785-90. [PMID: 21216772 DOI: 10.1093/bioinformatics/btr009] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Side-chain modeling has seen wide applications in computational structure biology. Most of the popular side-chain modeling programs explore the conformation space using discrete rigid rotamers for speed and efficiency. However, in the tightly packed environments of protein interiors, these methods will inherently lead to atomic clashes and hinder the prediction accuracy.
RESULTS: We present a side-chain modeling method (CIS-RR), which couples a novel clash-detection guided iterative search (CIS) algorithm with continuous torsion space optimization of rotamers (RR). Benchmark testing shows that compared with the existing popular side-chain modeling methods, CIS-RR removes atomic clashes much more effectively and achieves comparable or even better prediction accuracy while having comparable computational cost. We believe that CIS-RR could be a useful method for accurate side-chain modeling.
AVAILABILITY: CIS-RR is available to non-commercial users at our website: http://jianglab.ibp.ac.cn/lims/cisrr/cisrr.html.
Affiliation(s)
- Yang Cao
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
14
Harder T, Boomsma W, Paluszewski M, Frellsen J, Johansson KE, Hamelryck T. Beyond rotamers: a generative, probabilistic model of side chains in proteins. BMC Bioinformatics 2010; 11:306. [PMID: 20525384 PMCID: PMC2902450 DOI: 10.1186/1471-2105-11-306] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 03/03/2010] [Accepted: 06/05/2010] [Indexed: 11/21/2022] Open
Abstract
Background: Accurately covering the conformational space of amino acid side chains is essential for important applications such as protein design, docking and high resolution structure prediction. Today, the most common way to capture this conformational space is through rotamer libraries - discrete collections of side chain conformations derived from experimentally determined protein structures. The discretization can be exploited to efficiently search the conformational space. However, discretizing this naturally continuous space comes at the cost of losing detailed information that is crucial for certain applications. For example, rigorously combining rotamers with physical force fields is associated with numerous problems.
Results: In this work we present BASILISK: a generative, probabilistic model of the conformational space of side chains that makes it possible to sample in continuous space. In addition, sampling can be conditional upon the protein's detailed backbone conformation, again in continuous space - without involving discretization.
Conclusions: A careful analysis of the model and a comparison with various rotamer libraries indicates that the model forms an excellent, fully continuous model of side chain conformational space. We also illustrate how the model can be used for rigorous, unbiased sampling with a physical force field, and how it improves side chain prediction when used as a pseudo-energy term. In conclusion, BASILISK is an important step forward on the way to a rigorous probabilistic description of protein structure in continuous space and in atomic detail.
Affiliation(s)
- Tim Harder
- The Bioinformatics Section, Department of Biology, University of Copenhagen, Copenhagen, Denmark
15
Ma J. Explicit orientation dependence in empirical potentials and its significance to side-chain modeling. Acc Chem Res 2009; 42:1087-96. [PMID: 19445451 DOI: 10.1021/ar900009e] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Protein structure modeling and prediction have important applications throughout the biological sciences, from the design of pharmaceuticals to the elucidation of enzyme mechanisms. At the core of most protein modeling is an energy function, the minimum of which represents the free energy "cost" for forming a correct protein structure. The most commonly used energy functions are knowledge-based statistical potential functions; that is, they are empirically derived from statistical analysis of a set of high-resolution protein structures. When that kind of potential function is constructed, the anisotropic orientation dependence between the interacting groups is a critical component for accurately representing key molecular interactions, such as those involved in protein side-chain packing. In the literature, however, many potential functions are limited in their ability to describe orientation dependence. In all-atom potentials, they typically ignore heterogeneous chemical-bond connectivity. In coarse-grained potentials, such as (semi)-residue-based potentials, the simplified representation of residues often reduces the sensitivity of the potential to side-chain orientation. Recently, in an effort to maximally capture the orientation dependence in side-chain interactions, a new type of all-atom statistical potential was developed: OPUS-PSP (potential derived from side-chain packing). The key feature of this potential is its explicit description of orientation dependence in molecular interactions, which is achieved with a basis set of 19 rigid-body blocks extracted from the chemical structures of 20 amino acid residues. This basis set is specifically designed to maximally capture the essential elements of orientation dependence in molecular packing interactions. The potential is constructed from the orientation-specific packing statistics of pairs of those blocks in a nonredundant structural database. 
On decoy set tests, OPUS-PSP significantly outperforms most of the existing knowledge-based potentials in terms of both its ability to recognize native structures and its consistency in achieving high Z scores across decoy sets. The application of OPUS-PSP to conformational modeling of side chains has led to another method, called OPUS-Rota. In terms of combined speed and accuracy, OPUS-Rota outperforms all of the other methods in modeling side-chain conformation. In this Account, we briefly outline the basic scheme of the OPUS-PSP potential and its application to side-chain modeling via OPUS-Rota. Future perspectives on the modeling of orientation dependence are also discussed. The computer programs for OPUS-PSP and OPUS-Rota can be downloaded at http://sigler.bioch.bcm.tmc.edu/MaLab . They are free for academic users.
Affiliation(s)
- Jianpeng Ma
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, and Department of Bioengineering, Rice University, Houston, Texas 77005
16
Pupo A, Moreno E. Do rotamer libraries reproduce the side-chain conformations of peptidic ligands from the PDB? J Mol Graph Model 2008; 27:611-9. [PMID: 19028123 DOI: 10.1016/j.jmgm.2008.10.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/20/2008] [Revised: 10/08/2008] [Accepted: 10/09/2008] [Indexed: 10/21/2022]
Abstract
Rotamer libraries are collections of side-chain conformations compiled for each type of amino acid. All the libraries developed so far were constructed based on different sets of proteins, none of which includes small peptidic molecules. Are the existing rotamer libraries suitable for modeling the side chains of peptidic ligands in complex with their receptors? To answer this question, we have tested 10 different, publicly available and commonly used rotamer libraries for their capability of reproducing the side-chain conformations (and therefore the receptor-ligand interactions) of hundreds of short peptidic ligands found in the Protein Data Bank, including large sets of class I and class II T-cell epitopes. Only the libraries developed by Xiang and Honig were able to correctly reproduce the experimental geometries for most of the analyzed residues, and the atomic interactions between the peptidic ligands and their receptors. Surprisingly, all the libraries showed lower performance in reproducing the side-chain conformations from structures solved at very high resolution (R < 1.25 Å).
Affiliation(s)
- Amaury Pupo
- Department of Systems Biology, Center of Molecular Immunology, Havana 11600, Cuba.
17
Lu M, Dousis AD, Ma J. OPUS-Rota: a fast and accurate method for side-chain modeling. Protein Sci 2008; 17:1576-85. [PMID: 18556476 DOI: 10.1110/ps.035022.108] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Indexed: 10/22/2022]
Abstract
In this paper, we introduce a fast and accurate side-chain modeling method, named OPUS-Rota. In a benchmark comparison with the methods SCWRL, NCN, LGA, SPRUCE, Rosetta, and SCAP, OPUS-Rota is shown to be much faster than all the methods except SCWRL, which is comparably fast. In terms of overall χ1 and χ1+2 accuracies, however, OPUS-Rota is 5.4 and 8.8 percentage points better, respectively, than SCWRL. Compared with NCN, which has the best accuracy in the literature, OPUS-Rota is 1.6 percentage points better for overall χ1+2 but 0.3 percentage points weaker for overall χ1. Hence, our algorithm is much more accurate than SCWRL with similar execution speed, and it has accuracy comparable to or better than the most accurate methods in the literature, but with a runtime that is one or two orders of magnitude shorter. In addition, OPUS-Rota consistently outperforms SCWRL on the Wallner and Elofsson homology-modeling benchmark set when the sequence identity is greater than 40%. We hope that OPUS-Rota will contribute to high-accuracy structure refinement, and the computer program is freely available for academic users.
Affiliation(s)
- Mingyang Lu
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, USA
18
Vila JA, Scheraga HA. Factors affecting the use of 13C(alpha) chemical shifts to determine, refine, and validate protein structures. Proteins 2008; 71:641-54. [PMID: 17975838 DOI: 10.1002/prot.21726] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Abstract
Interest centers here on the analysis of two different, but related, phenomena that affect side-chain conformations and consequently ¹³Cα chemical shifts and their applications to determine, refine, and validate protein structures. The first is whether ¹³Cα chemical shifts computed at the DFT level of approximation with charged residues are a better approximation of observed ¹³Cα chemical shifts than those computed with neutral residues for proteins in solution. Accurate computation of ¹³Cα chemical shifts requires a proper representation of the charges, which might not take on integral values. For this analysis, the charges for 139 conformations of the protein ubiquitin were determined by explicit consideration of protein binding equilibria, at a given pH, that is, by exploring the 2^ξ possible ionization states of the whole molecule, with ξ being the number of ionizable groups. The results of this analysis, as revealed by the shielding/deshielding of the ¹³Cα nucleus, indicated that: (i) there is a significant difference in the computed ¹³Cα chemical shifts, between basic and acidic groups, as a function of the degree of charge of the side chain; (ii) this difference is attributed to the distance between the ionizable groups and the ¹³Cα nucleus, which is shorter for the acidic Asp and Glu groups as compared with that for the basic Lys and Arg groups; and (iii) the use of neutral, rather than charged, basic and acidic groups is a better approximation of the observed ¹³Cα chemical shifts of a protein in solution. The second is how side-chain flexibility influences computed ¹³Cα chemical shifts in an additional set of ubiquitin conformations, in which the side chains are generated from an NMR-derived structure with the backbone conformation assumed to be fixed.
The ¹³Cα chemical shift of a given amino acid residue in a protein is determined, mainly, by its own backbone and side-chain torsional angles, independent of the neighboring residues; the conformation of a given residue itself, however, depends on the environment of this residue and, hence, on the whole protein structure. As a consequence, this analysis reveals the role and impact of an accurate side-chain computation in the determination and refinement of protein conformation. The results of this analysis are: (i) a lower error between computed and observed ¹³Cα chemical shifts (by up to 3.7 ppm) was found for approximately 68% and approximately 63% of all ionizable residues and all non-Ala/Pro/Gly residues, respectively, in the additional set of conformations, compared with results for the model from which the set was derived; and (ii) all the additional conformations exhibit a lower root-mean-square deviation (1.97 ppm ≤ rmsd ≤ 2.13 ppm), between computed and observed ¹³Cα chemical shifts, than the rmsd (2.32 ppm) computed for the starting conformation from which this additional set was derived. As a validation test, an analysis of the additional set of ubiquitin conformations, comparing computed and observed values of both ¹³Cα chemical shifts and χ1 torsional angles (given by the vicinal coupling constants, ³J(N-Cγ) and ³J(C'-Cγ)), is discussed.
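The abstract above mentions exploring all 2^ξ ionization states at a given pH to obtain charges that need not be integral. The following is a minimal textbook-style sketch of that kind of enumeration, not the authors' procedure: the function name `mean_protonation`, the optional `coupling` matrix (pairwise interaction energies in kT units), and the example pKa values are all assumptions made for illustration.

```python
import itertools
import math

LN10 = math.log(10.0)

def mean_protonation(pkas, ph, coupling=None):
    """Boltzmann-averaged protonation of each site over all 2**n states.

    pkas     : intrinsic pKa of each ionizable group (illustrative values)
    ph       : solution pH
    coupling : optional n x n matrix; coupling[i][j] is a hypothetical
               interaction energy (in kT) when sites i and j are both protonated
    """
    n = len(pkas)
    partition = 0.0
    occupancy = [0.0] * n
    for state in itertools.product((0, 1), repeat=n):
        # Free energy (kT units) relative to the fully deprotonated state.
        g = sum(x * LN10 * (ph - pka) for x, pka in zip(state, pkas))
        if coupling is not None:
            for i in range(n):
                for j in range(i + 1, n):
                    g += state[i] * state[j] * coupling[i][j]
        weight = math.exp(-g)
        partition += weight
        for i, x in enumerate(state):
            occupancy[i] += x * weight
    return [o / partition for o in occupancy]

# Two non-interacting sites: an acidic group (pKa 4.0) and a basic one (pKa 10.5).
fractions = mean_protonation([4.0, 10.5], ph=4.0)
```

Without coupling, each site reduces to the Henderson-Hasselbalch fraction 1 / (1 + 10^(pH - pKa)), so a site at pH equal to its pKa is half protonated. The cost grows as 2^ξ, which is why exhaustive enumeration of this kind is only practical for a modest number of ionizable groups.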
Affiliation(s)
- Jorge A Vila
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853-1301, USA
|
19
|
Nilmeier J, Jacobson M. Multiscale Monte Carlo Sampling of Protein Sidechains: Application to Binding Pocket Flexibility. J Chem Theory Comput 2008; 4:835-846. [PMID: 19119325 DOI: 10.1021/ct700334a] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
We present a Monte Carlo sidechain sampling procedure and apply it to assessing the flexibility of protein binding pockets. We implemented a multiple "time step" Monte Carlo algorithm to optimize sidechain sampling with a surface generalized Born implicit solvent model. In this approach, certain forces (those due to long-range electrostatics and the implicit solvent model) are updated infrequently, in "outer steps", while short-range forces (covalent, local nonbonded interactions) are updated at every "inner step". Two multistep protocols were studied. The first protocol rigorously obeys detailed balance, and the second protocol introduces an approximation to the solvation term that increases the acceptance ratio. The first protocol gives a 10-fold improvement over a protocol that does not use multiple time steps, while the second protocol generates comparable ensembles and gives a 15-fold improvement. A range of 50-200 inner steps per outer step was found to give optimal performance for both protocols. The resultant method is a practical means to assess sidechain flexibility in ligand binding pockets, as we illustrate with proof-of-principle calculations on six proteins: DB3 antibody, thermolysin, estrogen receptor, PPAR-γ, PI3 kinase, and CDK2. The resulting sidechain ensembles of the apo binding sites correlate well with known induced fit conformational changes and provide insights into binding pocket flexibility.
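The inner/outer step idea described above can be illustrated with a generic multiple-"time step" Monte Carlo sketch on a one-dimensional toy coordinate. This is not the authors' implementation: the function `mts_monte_carlo`, the toy energy functions, and all parameter values are assumptions. Inner steps use only the cheap short-range energy; the expensive long-range term is evaluated once per outer step to accept or reject the whole composite move.

```python
import math
import random

def mts_monte_carlo(x0, e_short, e_long, n_outer, n_inner,
                    step=0.5, beta=1.0, seed=0):
    """Toy multiple-time-step Metropolis sampler (illustrative only)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    accepted_outer = 0
    for _ in range(n_outer):
        x_start = x
        y = x
        # Inner steps: Metropolis moves judged on the short-range energy only.
        for _ in range(n_inner):
            trial = y + rng.uniform(-step, step)
            d_short = e_short(trial) - e_short(y)
            if d_short <= 0 or rng.random() < math.exp(-beta * d_short):
                y = trial
        # Outer step: accept or reject the composite move using only the
        # change in the expensive long-range term.
        d_long = e_long(y) - e_long(x_start)
        if d_long <= 0 or rng.random() < math.exp(-beta * d_long):
            x = y
            accepted_outer += 1
        samples.append(x)
    return samples, accepted_outer / n_outer

# Toy energies: a stiff harmonic "short-range" term and a weak "long-range" tilt.
samples, acc = mts_monte_carlo(
    x0=0.0,
    e_short=lambda x: 0.5 * x * x,
    e_long=lambda x: 0.1 * x,
    n_outer=200,
    n_inner=50,
)
```

Here `n_inner=50` sits at the low end of the 50-200 range the abstract reports as optimal; the best value for any real system would depend on the relative cost of the two energy terms.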
Affiliation(s)
- Jerome Nilmeier
- Graduate Group in Biophysics, University of California at San Francisco, San Francisco, California 94158-2517
|
20
|
Abstract
We describe an automated method for the modeling of point mutations in protein structures. The protein is represented by all non-hydrogen atoms. The scoring function consists of several types of physical potential energy terms and homology-derived restraints. The optimization method implements a combination of conjugate gradient minimization and molecular dynamics with simulated annealing. The testing set consists of 717 pairs of known protein structures differing by a single mutation. Twelve variations of the scoring function were tested in three different environments of the mutated residue. The best-performing protocol optimizes all the atoms of the mutated residue, with respect to a scoring function that includes molecular mechanics energy terms for bond distances, angles, dihedral angles, peptide bond planarity, and non-bonded atomic contacts represented by Lennard-Jones potential, dihedral angle restraints derived from the aligned homologous structure, and a statistical potential for non-bonded atomic interactions extracted from a large set of known protein structures. The current method compares favorably with other tested approaches, especially when predicting long and flexible side-chains. In addition to the thoroughness of the conformational search, sampled degrees of freedom, and the scoring function type, the accuracy of the method was also evaluated as a function of the flexibility of the mutated side-chain, the relative volume change of the mutated residue, and its residue type. The results suggest that further improvement is likely to be achieved by concentrating on the improvement of the scoring function, in addition to or instead of increasing the variety of sampled conformations.
Affiliation(s)
- Eric Feyfant
- Wyeth Research, Chemical and Screening Sciences, Cambridge, Massachusetts 02421, USA
|
21
|
Kargatov AM, Efimov AV. Side-chain rotamers in protein α-α hairpins and a mechanism of their selection. Mol Biol 2007. [DOI: 10.1134/s0026893307050135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|