1
Kolesnikov ES, Xiong Y, Onufriev AV. Implicit Solvent with Explicit Ions Generalized Born Model in Molecular Dynamics: Application to DNA. J Chem Theory Comput 2024; 20:8724-8739. [PMID: 39283928 DOI: 10.1021/acs.jctc.4c00833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2024]
Abstract
The ion atmosphere surrounding highly charged biomolecules, such as nucleic acids, is crucial for their dynamics, structure, and interactions. Here, we develop an approach for the explicit treatment of ions within an implicit solvent framework suitable for atomistic simulations of biomolecules. The proposed implicit solvent/explicit ions model, GBION, is based on a modified generalized Born (GB) model; it includes separate, modified GB terms for solute-ion and ion-ion interactions. The model is implemented in the AMBER package (version 24), and its performance is thoroughly investigated in atomistic molecular dynamics (MD) simulations of double-stranded DNA on a microsecond time scale. The aggregate characteristics of monovalent (Na+ and K+) and trivalent (Cobalt Hexammine, CoHex3+) counterion distributions around double-stranded DNA predicted by the model are in reasonable agreement with the experiment (where available), all-atom explicit water MD simulations, and the expectation from the Manning condensation theory. The radial distributions of monovalent cations around DNA are reasonably close to the ones obtained using the explicit water model: expressed in units of energy, the maximum deviations of local ion concentrations from the explicit solvent reference are within 1 kBT, comparable to the corresponding deviations expected between different established explicit water models. The proposed GBION model is able to simulate DNA fragments in a large volume of solvent with explicit ions with little additional computational overhead compared with the fully implicit GB treatment of ions. Ions simulated using the developed model explore conformational space at least 2 orders of magnitude faster than in the explicit solvent. These advantages allowed us to observe and explore an unexpected "stacking" mode of DNA condensation in the presence of trivalent counterions (CoHex3+) that was revealed by recent experiments.
Affiliation(s)
- Egor S Kolesnikov
- Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Yeyue Xiong
- Department of Biomedical Engineering and Mechanics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V Onufriev
- Departments of Computer Science and Physics, Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
2
Opuu V, Nigro G, Lazennec‐Schurdevin C, Mechulam Y, Schmitt E, Simonson T. Redesigning methionyl-tRNA synthetase for β-methionine activity with adaptive landscape flattening and experiments. Protein Sci 2023; 32:e4738. [PMID: 37518893 PMCID: PMC10451022 DOI: 10.1002/pro.4738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 07/21/2023] [Accepted: 07/23/2023] [Indexed: 08/01/2023]
Abstract
Amino acids (AAs) with a noncanonical backbone would be a valuable tool for protein engineering, enabling new structural motifs and building blocks. To incorporate them into an expanded genetic code, the first, key step is to obtain an appropriate aminoacyl-tRNA synthetase. Currently, directed evolution is not available to optimize AAs with noncanonical backbones, since an appropriate selective pressure has not been discovered. Computational protein design (CPD) is an alternative. We used a new CPD method to redesign MetRS and increase its activity towards β-Met, which has an extra backbone methylene. The new method considered a few active site positions for design and used a Monte Carlo exploration of the corresponding sequence space. During the exploration, a bias energy was adaptively learned, such that the free energy landscape of the apo enzyme was flattened. Enzyme variants could then be sampled, in the presence of the ligand and the bias energy, according to their β-Met binding affinities. Eighteen predicted variants were chosen for experimental testing; 10 exhibited detectable activity for β-Met adenylation. Top predicted hits were characterized experimentally in detail. Dissociation constants, catalytic rates, and Michaelis constants for both α-Met and β-Met were measured. The best mutant retained a preference for α-Met over β-Met; however, the preference was reduced, compared to the wildtype, by a factor of 29. For this mutant, high resolution crystal structures were obtained in complex with both α-Met and β-Met, indicating that the predicted, active conformation of β-Met in the active site was retained.
Affiliation(s)
- Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Giuliano Nigro
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Christine Lazennec‐Schurdevin
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Yves Mechulam
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Emmanuelle Schmitt
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
3
Wang KW, Lee J, Zhang H, Suh D, Im W. CHARMM-GUI Implicit Solvent Modeler for Various Generalized Born Models in Different Simulation Programs. J Phys Chem B 2022; 126:7354-7364. [PMID: 36117287 DOI: 10.1021/acs.jpcb.2c05294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Implicit solvent models are widely used because they are advantageous to speed up simulations by drastically decreasing the number of solvent degrees of freedom, which allows one to achieve long simulation time scales for large system sizes. CHARMM-GUI, a web-based platform, has been developed to support the setup of complex multicomponent molecular systems and prepare input files. This study describes an Implicit Solvent Modeler (ISM) in CHARMM-GUI for various generalized Born (GB) implicit solvent simulations in different molecular dynamics programs such as AMBER, CHARMM, GENESIS, NAMD, OpenMM, and Tinker. The GB models available in ISM include GB-HCT, GB-OBC, GB-neck, GBMV, and GBSW with the CHARMM and Amber force fields for protein, DNA, RNA, glycan, and ligand systems. Using the system and input files generated by ISM, implicit solvent simulations of protein, DNA, and RNA systems produce similar results for different simulation packages with the same input information. Protein-ligand systems are also considered to further validate the systems and input files generated by ISM. Simple ligand root-mean-square deviation (RMSD) and molecular mechanics generalized Born surface area (MM/GBSA) calculations show that the performance of implicit simulations is better than docking and can be used for early stage ligand screening. These reasonable results indicate that ISM is a useful and reliable tool to provide various implicit solvent simulation applications.
Affiliation(s)
- Kye Won Wang
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Jumin Lee
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Han Zhang
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Donghyuk Suh
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Wonpil Im
- Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
4
Opuu V, Mignon D, Simonson T. Knowledge-Based Unfolded State Model for Protein Design. Methods Mol Biol 2022; 2405:403-424. [PMID: 35298824 DOI: 10.1007/978-1-0716-1855-4_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The design of proteins and miniproteins is an important challenge. Designed variants should be stable, meaning the folded/unfolded free energy difference should be large enough. Thus, the unfolded state plays a central role. An extended peptide model is often used, where side chains interact with solvent and nearby backbone, but not each other. The unfolded energy is then a function of sequence composition only and can be empirically parametrized. If the space of sequences is explored with a Monte Carlo procedure, protein variants will be sampled according to a well-defined Boltzmann probability distribution. We can then choose unfolded model parameters to maximize the probability of sampling native-like sequences. This leads to a well-defined maximum likelihood framework. We present an iterative algorithm that follows the likelihood gradient. The method is presented in the context of our Proteus software, as a detailed downloadable tutorial. The unfolded model is combined with a folded model that uses molecular mechanics and a Generalized Born solvent. It was optimized for three PDZ domains and then used to redesign them. The sequences sampled are native-like and similar to a recent PDZ design study that was experimentally validated.
Affiliation(s)
- Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- David Mignon
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
5
Michael E, Polydorides S, Simonson T, Archontis G. Hybrid MC/MD for protein design. J Chem Phys 2021; 153:054113. [PMID: 32770896 DOI: 10.1063/5.0013320] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/22/2023]
Abstract
Computational protein design relies on simulations of a protein structure, where selected amino acids can mutate randomly, and mutations are selected to enhance a target property, such as stability. Often, the protein backbone is held fixed and its degrees of freedom are modeled implicitly to reduce the complexity of the conformational space. We present a hybrid method where short molecular dynamics (MD) segments are used to explore conformations and alternate with Monte Carlo (MC) moves that apply mutations to side chains. The backbone is fully flexible during MD. As a test, we computed side chain acid/base constants or pKa's in five proteins. This problem can be considered a special case of protein design, with protonation/deprotonation playing the role of mutations. The solvent was modeled as a dielectric continuum. Due to cost, in each protein we allowed just one side chain position to change its protonation state and the other position to change its type or mutate. The pKa's were computed with a standard method that scans a range of pH values and with a new method that uses adaptive landscape flattening (ALF) to sample all protonation states in a single simulation. The hybrid method gave notably better accuracy than standard, fixed-backbone MC. ALF decreased the computational cost a factor of 13.
Affiliation(s)
- Eleni Michael
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
- Savvas Polydorides
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Georgios Archontis
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
6
Mignon D, Druart K, Michael E, Opuu V, Polydorides S, Villa F, Gaillard T, Panel N, Archontis G, Simonson T. Physics-Based Computational Protein Design: An Update. J Phys Chem A 2020; 124:10637-10648. [DOI: 10.1021/acs.jpca.0c07605] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023]
Affiliation(s)
- David Mignon
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Karen Druart
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Eleni Michael
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
- Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Savvas Polydorides
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
- Francesco Villa
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Thomas Gaillard
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Nicolas Panel
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Georgios Archontis
- Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
- Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
7
Opuu V, Sun YJ, Hou T, Panel N, Fuentes EJ, Simonson T. A physics-based energy function allows the computational redesign of a PDZ domain. Sci Rep 2020; 10:11150. [PMID: 32636412 PMCID: PMC7341745 DOI: 10.1038/s41598-020-67972-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/18/2020] [Accepted: 06/08/2020] [Indexed: 11/30/2022]
Abstract
Computational protein design (CPD) can address the inverse folding problem, exploring a large space of sequences and selecting ones predicted to fold. CPD was used previously to redesign several proteins, employing a knowledge-based energy function for both the folded and unfolded states. We show that a PDZ domain can be entirely redesigned using a "physics-based" energy for the folded state and a knowledge-based energy for the unfolded state. Thousands of sequences were generated by Monte Carlo simulation. Three were chosen for experimental testing, based on their low energies and several empirical criteria. All three could be overexpressed and had native-like circular dichroism spectra and 1D-NMR spectra typical of folded structures. Two had upshifted thermal denaturation curves when a peptide ligand was present, indicating binding and suggesting folding to a correct, PDZ structure. Evidently, the physical principles that govern folded proteins, with a dash of empirical post-filtering, can allow successful whole-protein redesign.
Affiliation(s)
- Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Young Joo Sun
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Titus Hou
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Nicolas Panel
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Ernesto J Fuentes
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
8
Adaptive landscape flattening allows the design of both enzyme:substrate binding and catalytic power. PLoS Comput Biol 2020; 16:e1007600. [PMID: 31917825 PMCID: PMC7041857 DOI: 10.1371/journal.pcbi.1007600] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/20/2019] [Revised: 02/25/2020] [Accepted: 12/11/2019] [Indexed: 01/30/2023]
Abstract
Designed enzymes are of fundamental and technological interest. Experimental directed evolution still has significant limitations, and computational approaches are a complementary route. A designed enzyme should satisfy multiple criteria: stability, substrate binding, transition state binding. Such multi-objective design is computationally challenging. Two recent studies used adaptive importance sampling Monte Carlo to redesign proteins for ligand binding. By first flattening the energy landscape of the apo protein, they obtained positive design for the bound state and negative design for the unbound. We have now extended the method to design an enzyme for specific transition state binding, i.e., for its catalytic power. We considered methionyl-tRNA synthetase (MetRS), which attaches methionine (Met) to its cognate tRNA, establishing codon identity. Previously, MetRS and other synthetases have been redesigned by experimental directed evolution to accept noncanonical amino acids as substrates, leading to genetic code expansion. Here, we have redesigned MetRS computationally to bind several ligands: the Met analog azidonorleucine, methionyl-adenylate (MetAMP), and the activated ligands that form the transition state for MetAMP production. Enzyme mutants known to have azidonorleucine activity were recovered by the design calculations, and 17 mutants predicted to bind MetAMP were characterized experimentally and all found to be active. Mutants predicted to have low activation free energies for MetAMP production were found to be active and the predicted reaction rates agreed well with the experimental values. We suggest the present method should become the paradigm for computational enzyme design.
Designed enzymes are of major interest. Experimental directed evolution still has significant limitations, and computational approaches are another route. Enzymes must be stable, bind substrates, and be powerful catalysts. It is challenging to design for all these properties.
A method to design substrate binding was proposed recently. It used an adaptive Monte Carlo method to explore mutations of a few amino acids near the substrate. A bias energy was gradually “learned” such that, in the absence of the ligand, the simulation visited most of the possible protein mutations with comparable probabilities. Remarkably, a simulation of the protein:ligand complex, including the bias, will then preferentially sample tight-binding sequences. We generalized the method to design binding specificity. We tested it for the methionyl-tRNA synthetase enzyme, which has been engineered in order to expand the genetic code. We redesigned the enzyme to obtain variants with low activation free energies for the catalytic step. The variants proposed by the simulations were shown experimentally to be active, and the predicted activation free energies were in reasonable agreement with the experimental values. We expect the new method will become the paradigm for computational enzyme design.
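The two-stage idea described in this abstract (learn a bias that flattens the apo landscape, then sample the complex with that bias) can be illustrated with a toy sketch. This is not the authors' implementation: the single mutable position, the energy tables, the decaying-increment update rule, and the names `flatten_landscape` and `sample_with_bias` are all assumptions made for illustration.

```python
import math
import random


def flatten_landscape(apo_energy, n_steps=200_000, increment=0.05, kT=0.6, seed=0):
    """Learn a bias so that Metropolis sampling of apo_energy + bias
    visits every residue type with roughly equal probability (toy model)."""
    rng = random.Random(seed)
    n_types = len(apo_energy)
    bias = [0.0] * n_types
    state = 0
    for step in range(n_steps):
        trial = rng.randrange(n_types)  # propose a random "mutation"
        delta = (apo_energy[trial] + bias[trial]) - (apo_energy[state] + bias[state])
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            state = trial
        # Penalize the currently occupied type; the penalty shrinks over time
        # so that the bias settles instead of fluctuating forever.
        bias[state] += increment / (1.0 + step / 1000.0)
    shift = min(bias)
    return [b - shift for b in bias]  # only bias differences matter


def sample_with_bias(energy, bias, n_steps=50_000, kT=0.6, seed=1):
    """Metropolis sampling with a fixed bias; returns visit counts per type."""
    rng = random.Random(seed)
    counts = [0] * len(energy)
    state = 0
    for _ in range(n_steps):
        trial = rng.randrange(len(energy))
        delta = (energy[trial] + bias[trial]) - (energy[state] + bias[state])
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            state = trial
        counts[state] += 1
    return counts
```

In this sketch, if the holo energy of each type is its apo energy plus a binding term, the learned bias approximately cancels the apo part, so the biased holo simulation visits types in proportion to the Boltzmann factor of the binding term alone, which is the behaviour the abstract describes.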
9
Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J 2019; 18:162-176. [PMID: 31969975 PMCID: PMC6961067 DOI: 10.1016/j.csbj.2019.12.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 01/02/2023]
Abstract
Three-dimensional protein structures, whether determined experimentally or theoretically, are often too low resolution. In this mini-review, we outline the computational methods for protein structure reconstruction from incomplete coarse-grained to all atomistic models. Typical reconstruction schemes can be divided into four major steps. Usually, the first step is reconstruction of the protein backbone chain starting from the C-alpha trace. This is followed by side-chains rebuilding based on protein backbone geometry. Subsequently, hydrogen atoms can be reconstructed. Finally, the resulting all-atom models may require structure optimization. Many methods are available to perform each of these tasks. We discuss the available tools and their potential applications in integrative modeling pipelines that can transfer coarse-grained information from computational predictions, or experiment, to all atomistic structures.
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
10
Xu G, Ma T, Du J, Wang Q, Ma J. OPUS-Rota2: An Improved Fast and Accurate Side-Chain Modeling Method. J Chem Theory Comput 2019; 15:5154-5160. [PMID: 31412199 DOI: 10.1021/acs.jctc.9b00309] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/20/2022]
Abstract
Side-chain modeling plays a critical role in protein structure prediction. However, in many current methods, balancing the speed and accuracy is still challenging. In this paper, on the basis of our previous work OPUS-Rota (Protein Sci. 2008, 17, 1576-1585), we introduce a new side-chain modeling method, OPUS-Rota2, which is tested on both a 65-protein test set (DB65) in the OPUS-Rota paper and a 379-protein test set (DB379) in the SCWRL4 paper. If the main chain is native, OPUS-Rota2 is more accurate than OPUS-Rota, SCWRL4, and OSCAR-star but slightly less accurate than OSCAR-o. Also, if the main chain is non-native, OPUS-Rota2 is more accurate than any other method. Moreover, OPUS-Rota2 is significantly faster than any other method, in particular, 2 orders of magnitude faster than OSCAR-o. Thus, the combination of higher accuracy and speed of OPUS-Rota2 in modeling side chains on both the native and non-native main chains makes OPUS-Rota2 a very useful tool in protein structure modeling.
Affiliation(s)
- Gang Xu
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200433, China; School of Life Sciences, Tsinghua University, Beijing 100084, China
- Junqing Du
- Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States
- Qinghua Wang
- Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States
- Jianpeng Ma
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200433, China; School of Life Sciences, Tsinghua University, Beijing 100084, China; Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States; School of Life Sciences, Fudan University, Shanghai 200433, China
11
Abstract
It would often be useful in computer simulations to use an implicit description of solvation effects, instead of explicitly representing the individual solvent molecules. Continuum dielectric models often work well in describing the thermodynamic aspects of aqueous solvation and can be very efficient compared to the explicit treatment of the solvent. Here, we review a particular class of so-called fast implicit solvent models, generalized Born (GB) models, which are widely used for molecular dynamics (MD) simulations of proteins and nucleic acids. These approaches model hydration effects and provide solvent-dependent forces with efficiencies comparable to molecular-mechanics calculations on the solute alone; as such, they can be incorporated into MD or other conformational searching strategies in a straightforward manner. The foundations of the GB model are reviewed, followed by examples of newer, emerging models and examples of important applications. We discuss their strengths and weaknesses, both for fidelity to the underlying continuum model and for the ability to replace explicit consideration of solvent molecules in macromolecular simulations.
Affiliation(s)
- Alexey V Onufriev
- Departments of Computer Science and Physics, Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24060, USA
- David A Case
- Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
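As a concrete illustration of the kind of model this review covers, the widely used generalized Born functional form introduced by Still and co-workers can be sketched in a few lines. The sketch assumes that effective Born radii are already known and ignores salt screening and nonpolar terms; the function name `gb_energy`, the default dielectric constants, and the unit convention (kcal/mol, Å, elementary charges) are choices made here, not taken from the abstract.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), a commonly used value.
COULOMB = 332.0637


def gb_energy(charges, coords, born_radii, eps_in=1.0, eps_out=78.5):
    """Polar solvation free energy (kcal/mol) from the canonical GB formula:

        dG = -1/2 * (1/eps_in - 1/eps_out) * sum_ij q_i q_j / f_ij
        f_ij = sqrt(r_ij^2 + R_i R_j exp(-r_ij^2 / (4 R_i R_j)))

    The double sum runs over all pairs, including i == j (self terms).
    """
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            rr = born_radii[i] * born_radii[j]
            f_ij = math.sqrt(r2 + rr * math.exp(-r2 / (4.0 * rr)))
            total += charges[i] * charges[j] / f_ij
    return prefactor * total
```

For a single ion the formula reduces to the Born expression, and for well-separated charges the cross terms approach a screened Coulomb interaction; that interpolation is what makes the form attractive for fast implicit-solvent MD.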
12
Villa F, Simonson T. Protein pKa’s from Adaptive Landscape Flattening Instead of Constant-pH Simulations. J Chem Theory Comput 2018; 14:6714-6721. [DOI: 10.1021/acs.jctc.8b00970] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Affiliation(s)
- Francesco Villa
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
13
Charpentier A, Mignon D, Barbe S, Cortes J, Schiex T, Simonson T, Allouche D. Variable Neighborhood Search with Cost Function Networks To Solve Large Computational Protein Design Problems. J Chem Inf Model 2018; 59:127-136. [DOI: 10.1021/acs.jcim.8b00510] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Affiliation(s)
- David Mignon
- Laboratoire de Biochimie (CNRS UMR 7654), École Polytechnique, 91128 Palaiseau, France
- Sophie Barbe
- Laboratoire d’Ingénierie des Systèmes Biologiques et Procédés, LISBP, Université de Toulouse, CNRS, INRA, INSA, 31077 Toulouse, France
- Juan Cortes
- LAAS-CNRS, Université de Toulouse, CNRS, 31400 Toulouse, France
- Thomas Schiex
- MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR 7654), École Polytechnique, 91128 Palaiseau, France
- David Allouche
- MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France
14
Villa F, Panel N, Chen X, Simonson T. Adaptive landscape flattening in amino acid sequence space for the computational design of protein:peptide binding. J Chem Phys 2018; 149:072302. [PMID: 30134674 DOI: 10.1063/1.5022249] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/25/2022]
Abstract
For the high throughput design of protein:peptide binding, one must explore a vast space of amino acid sequences in search of low binding free energies. This complex problem is usually addressed with either simple heuristic scoring or expensive sequence enumeration schemes. Far more efficient than enumeration is a recent Monte Carlo approach that adaptively flattens the energy landscape in sequence space of the unbound peptide and provides formally exact binding free energy differences. The method allows the binding free energy to be used directly as the design criterion. We propose several improvements that allow still more efficient sampling and can address larger design problems. They include the use of Replica Exchange Monte Carlo and landscape flattening for both the unbound and bound peptides. We used the method to design peptides that bind to the PDZ domain of the Tiam1 signaling protein and could serve as inhibitors of its activity. Four peptide positions were allowed to mutate freely. Almost 75 000 peptide variants were processed in two simulations of 10^9 steps each that used 1 CPU hour on a desktop machine. 96% of the theoretical sequence space was sampled. The relative binding free energies agreed qualitatively with values from experiment. The sampled sequences agreed qualitatively with an experimental library of Tiam1-binding peptides. The main assumption limiting accuracy is the fixed backbone approximation, which could be alleviated in future work by using increased computational resources and multi-backbone designs.
Affiliation(s)
- Francesco Villa
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Nicolas Panel
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Xingyu Chen
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
15
Onufriev AV, Izadi S. Water models for biomolecular simulations. Wiley Interdiscip Rev Comput Mol Sci 2017. [DOI: 10.1002/wcms.1347] [Citation(s) in RCA: 94] [Impact Index Per Article: 13.4] [Indexed: 12/22/2022]
Affiliation(s)
- Alexey V. Onufriev
- Department of Physics; Virginia Tech; Blacksburg VA USA
- Department of Computer Science; Virginia Tech; Blacksburg VA USA
- Center for Soft Matter and Biological Physics; Virginia Tech; Blacksburg VA USA
- Saeed Izadi
- Early Stage Pharmaceutical Development; Genentech Inc.; South San Francisco, CA USA
16
Gaillard T, Simonson T. Full Protein Sequence Redesign with an MMGBSA Energy Function. J Chem Theory Comput 2017; 13:4932-4943. [DOI: 10.1021/acs.jctc.7b00202] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Affiliation(s)
- Thomas Gaillard
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
17
Panel N, Sun YJ, Fuentes EJ, Simonson T. A Simple PB/LIE Free Energy Function Accurately Predicts the Peptide Binding Specificity of the Tiam1 PDZ Domain. Front Mol Biosci 2017; 4:65. [PMID: 29018806 PMCID: PMC5623046 DOI: 10.3389/fmolb.2017.00065] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/05/2017] [Accepted: 09/14/2017] [Indexed: 11/13/2022]
Abstract
PDZ domains generally bind short amino acid sequences at the C-terminus of target proteins, and short peptides can be used as inhibitors or model ligands. Here, we used experimental binding assays and molecular dynamics simulations to characterize 51 complexes involving the Tiam1 PDZ domain and to test the performance of a semi-empirical free energy function. The free energy function combined a Poisson-Boltzmann (PB) continuum electrostatic term, a van der Waals interaction energy, and a surface area term. Each term was empirically weighted, giving a Linear Interaction Energy or “PB/LIE” free energy. The model yielded a mean unsigned deviation of 0.43 kcal/mol and a Pearson correlation of 0.64 between experimental and computed free energies, which was superior to a Null model that assumes all complexes have the same affinity. Analyses of the models support several experimental observations that indicate the orientation of the α2 helix is a critical determinant for peptide specificity. The models were also used to predict binding free energies for nine new variants, corresponding to point mutants of the Syndecan1 and Caspr4 peptides. The predictions did not reveal improved binding; however, they suggest that an unnatural amino acid could be used to increase protease resistance and peptide lifetimes in vivo. The overall performance of the model should allow its use in the design of new PDZ ligands in the future.
Affiliation(s)
- Nicolas Panel
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Young Joo Sun
- Department of Biochemistry, Roy J. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, IA, United States
| | - Ernesto J Fuentes
- Department of Biochemistry, Roy J. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, IA, United States; Holden Comprehensive Cancer Center, Iowa City, IA, United States
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| |
|
18
|
Michael E, Polydorides S, Simonson T, Archontis G. Simple models for nonpolar solvation: Parameterization and testing. J Comput Chem 2017; 38:2509-2519. [PMID: 28786118 DOI: 10.1002/jcc.24910] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 06/10/2017] [Revised: 07/19/2017] [Accepted: 07/20/2017] [Indexed: 12/13/2022]
Abstract
Implicit solvent models are important for many biomolecular simulations. The polarity of aqueous solvent is essential and qualitatively captured by continuum electrostatics methods like Generalized Born (GB). However, GB does not account for the solvent-induced interactions between exposed hydrophobic sidechains or solute-solvent dispersion interactions. These "nonpolar" effects are often modeled through surface area (SA) energy terms, which lack realism, create mathematical singularities, and have a many-body character. We have explored an alternate, Lazaridis-Karplus (LK) gaussian energy density for nonpolar effects and a dispersion (DI) energy term proposed earlier, associated with GB electrostatics. We parameterized several combinations of GB, SA, LK, and DI energy terms, to reproduce 62 small molecule solvation free energies, 387 protein stability changes due to point mutations, and the structures of 8 protein loops. With optimized parameters, the models all gave similar results, with GBLK and GBDILK giving no performance loss compared to GBSA, and mean errors of 1.7 kcal/mol for the stability changes and 2 Å deviations for the loop conformations. The optimized GBLK model gave poor results in MD of the Trpcage mini-protein, but parameters optimized specifically for MD performed well for Trpcage and three other small proteins. Overall, the LK and DI nonpolar terms are valid alternatives to SA treatments for a range of applications. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Eleni Michael
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus
| | - Savvas Polydorides
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus; Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Georgios Archontis
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus
| |
|
19
|
Villa F, Mignon D, Polydorides S, Simonson T. Comparing pairwise-additive and many-body generalized Born models for acid/base calculations and protein design. J Comput Chem 2017; 38:2396-2410. [PMID: 28749575 DOI: 10.1002/jcc.24898] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/07/2017] [Revised: 06/30/2017] [Accepted: 07/06/2017] [Indexed: 12/13/2022]
Abstract
Generalized Born (GB) solvent models are common in acid/base calculations and protein design. With GB, the interaction between a pair of solute atoms depends on the shape of the protein/solvent boundary and, therefore, the positions of all solute atoms, so that GB is a many-body potential. For compute-intensive applications, the model is often simplified further, by introducing a mean, native-like protein/solvent boundary, which removes the many-body property. We investigate a method for both acid/base calculations and protein design that uses Monte Carlo simulations in which side chains can explore rotamers, bind/release protons, or mutate. The fluctuating protein/solvent dielectric boundary is treated in a way that is numerically exact (within the GB framework), in contrast to a mean boundary. Its originality is that it captures the many-body character while retaining the residue-pairwise complexity given by a fixed boundary. The method is implemented in the Proteus protein design software. It yields a slight but systematic improvement for acid/base constants in nine proteins and a significant improvement for the computational design of three PDZ domains. It eliminates a source of model uncertainty, which will facilitate the analysis of other model limitations. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Francesco Villa
- Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
| | - David Mignon
- Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
| | - Savvas Polydorides
- Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
| | - Thomas Simonson
- Ecole Polytechnique, Laboratoire de Biochimie (CNRS UMR7654), Palaiseau, 91128, France
| |
|
20
|
Mignon D, Panel N, Chen X, Fuentes EJ, Simonson T. Computational Design of the Tiam1 PDZ Domain and Its Ligand Binding. J Chem Theory Comput 2017; 13:2271-2289. [PMID: 28394603 DOI: 10.1021/acs.jctc.6b01255] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
PDZ domains direct protein-protein interactions and serve as models for protein design. Here, we optimized a protein design energy function for the Tiam1 and Cask PDZ domains that combines a molecular mechanics energy, Generalized Born solvent, and an empirical unfolded state model. Designed sequences were recognized as PDZ domains by the Superfamily fold recognition tool and had similarity scores comparable to natural PDZ sequences. The optimized model was used to redesign the two PDZ domains, by gradually varying the chemical potential of hydrophobic amino acids; the tendency of each position to lose or gain a hydrophobic character represents a novel hydrophobicity index. We also redesigned four positions in the Tiam1 PDZ domain involved in peptide binding specificity. The calculated affinity differences between designed variants reproduced experimental data and suggest substitutions with altered specificities.
Affiliation(s)
- David Mignon
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Nicolas Panel
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Xingyu Chen
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Ernesto J Fuentes
- Department of Biochemistry, Roy J. & Lucille A. Carver College of Medicine and Holden Comprehensive Cancer Center, University of Iowa, Iowa City, Iowa 52242-1109, United States
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| |
|
21
|
Izadi S, Anandakrishnan R, Onufriev AV. Implicit Solvent Model for Million-Atom Atomistic Simulations: Insights into the Organization of 30-nm Chromatin Fiber. J Chem Theory Comput 2016; 12:5946-5959. [PMID: 27748599 PMCID: PMC5649046 DOI: 10.1021/acs.jctc.6b00712] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 12/19/2022]
Abstract
Molecular dynamics (MD) simulations based on the implicit solvent generalized Born (GB) models can provide significant computational advantages over the traditional explicit solvent simulations. However, the standard GB becomes prohibitively expensive for all-atom simulations of large structures; the model scales poorly, ∼n², with the number of solute atoms. Here we combine our recently developed optimal point charge approximation (OPCA) with the hierarchical charge partitioning (HCP) approximation to present an ∼n log n multiscale, yet fully atomistic, GB model (GB-HCPO). The HCP approximation exploits the natural organization of biomolecules (atoms, groups, chains, and complexes) to partition the structure into multiple hierarchical levels of components. OPCA approximates the charge distribution for each of these components by a small number of point charges so that the low order multipole moments of these components are optimally reproduced. The approximate charges are then used for computing electrostatic interactions with distant components, while the full set of atomic charges are used for nearby components. We show that GB-HCPO can deliver up to 2 orders of magnitude speedup compared to the standard GB, with minimal impact on its accuracy. For large structures, GB-HCPO can approach the same nominal speed, as in nanoseconds per day, as the highly optimized explicit-solvent simulation based on particle mesh Ewald (PME). The increase in the nominal simulation speed, relative to the standard GB, coupled with substantially faster sampling of conformational space, relative to the explicit solvent, makes GB-HCPO a suitable candidate for MD simulation of large atomistic systems in implicit solvent. As a practical demonstration, we use GB-HCPO simulation to refine a ∼1.16 million atom structure of 30 nm chromatin fiber (40 nucleosomes).
The refined structure suggests important details about spatial organization of the linker DNA and the histone tails in the fiber: (1) the linker DNA fills the core region, allowing the H3 histone tails to interact with the linker DNA, which is consistent with experiment; (2) H3 and H4 tails are found mostly in the core of the structure, closer to the helical axis of the fiber, while H2A and H2B are mostly solvent exposed. Potential functional consequences of these findings are discussed. GB-HCPO is implemented in the open source MD software NAB in Amber 2016.
Affiliation(s)
- Saeed Izadi
- Department of Biomedical Engineering and Mechanics, Biomedical Division, Edward Via College of Osteopathic Medicine, Department of Computer Science and Physics, and Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
| | - Ramu Anandakrishnan
- Department of Biomedical Engineering and Mechanics, Biomedical Division, Edward Via College of Osteopathic Medicine, Department of Computer Science and Physics, and Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
| | - Alexey V Onufriev
- Department of Biomedical Engineering and Mechanics, Biomedical Division, Edward Via College of Osteopathic Medicine, Department of Computer Science and Physics, and Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
| |
|
22
|
Druart K, Bigot J, Audit E, Simonson T. A Hybrid Monte Carlo Scheme for Multibackbone Protein Design. J Chem Theory Comput 2016; 12:6035-6048. [DOI: 10.1021/acs.jctc.6b00421] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
Affiliation(s)
- Karen Druart
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Maison de la Simulation, CEA, CNRS, Univ. Paris-Sud, UVSQ, Université Paris-Saclay, 91191 Gif-sur-Yvette, France
| | - Julien Bigot
- Maison de la Simulation, CEA, CNRS, Univ. Paris-Sud, UVSQ, Université Paris-Saclay, 91191 Gif-sur-Yvette, France
| | - Edouard Audit
- Maison de la Simulation, CEA, CNRS, Univ. Paris-Sud, UVSQ, Université Paris-Saclay, 91191 Gif-sur-Yvette, France
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| |
|
23
|
Topham CM, Barbe S, André I. An Atomistic Statistically Effective Energy Function for Computational Protein Design. J Chem Theory Comput 2016; 12:4146-68. [PMID: 27341125 DOI: 10.1021/acs.jctc.6b00090] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
Shortcomings in the definition of effective free-energy surfaces of proteins are recognized to be a major contributory factor responsible for the low success rates of existing automated methods for computational protein design (CPD). The formulation of an atomistic statistically effective energy function (SEEF) suitable for a wide range of CPD applications and its derivation from structural data extracted from protein domains and protein-ligand complexes are described here. The proposed energy function comprises nonlocal atom-based and local residue-based SEEFs, which are coupled using a novel atom connectivity number factor to scale short-range, pairwise, nonbonded atomic interaction energies and a surface-area-dependent cavity energy term. This energy function was used to derive additional SEEFs describing the unfolded-state ensemble of any given residue sequence based on computed average energies for partially or fully solvent-exposed fragments in regions of irregular structure in native proteins. Relative thermal stabilities of 97 T4 bacteriophage lysozyme mutants were predicted from calculated energy differences for folded and unfolded states with an average unsigned error (AUE) of 0.84 kcal mol⁻¹ when compared to experiment. To demonstrate the utility of the energy function for CPD, further validation was carried out in tests of its capacity to recover cognate protein sequences and to discriminate native and near-native protein folds, loop conformers, and small-molecule ligand binding poses from non-native benchmark decoys. Experimental ligand binding free energies for a diverse set of 80 protein complexes could be predicted with an AUE of 2.4 kcal mol⁻¹ using an additional energy term to account for the loss in ligand configurational entropy upon binding.
The atomistic SEEF is expected to improve the accuracy of residue-based coarse-grained SEEFs currently used in CPD and to extend the range of applications of extant atom-based protein statistical potentials.
Affiliation(s)
- Christopher M Topham
- Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France; CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
| | - Sophie Barbe
- Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France; CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
| | - Isabelle André
- Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France; CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
| |
|
24
|
Mignon D, Simonson T. Comparing three stochastic search algorithms for computational protein design: Monte Carlo, replica exchange Monte Carlo, and a multistart, steepest-descent heuristic. J Comput Chem 2016; 37:1781-93. [PMID: 27197555 DOI: 10.1002/jcc.24393] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 12/03/2015] [Revised: 02/26/2016] [Accepted: 03/27/2016] [Indexed: 01/11/2023]
Abstract
Computational protein design depends on an energy function and an algorithm to search the sequence/conformation space. We compare three stochastic search algorithms: a heuristic, Monte Carlo (MC), and a Replica Exchange Monte Carlo method (REMC). The heuristic performs a steepest-descent minimization starting from thousands of random starting points. The methods are applied to nine test proteins from three structural families, with a fixed backbone structure, a molecular mechanics energy function, and with 1, 5, 10, 20, 30, or all amino acids allowed to mutate. Results are compared to an exact, "Cost Function Network" method that identifies the global minimum energy conformation (GMEC) in favorable cases. The designed sequences accurately reproduce experimental sequences in the hydrophobic core. The heuristic and REMC agree closely and reproduce the GMEC when it is known, with a few exceptions. Plain MC performs well for most cases, occasionally departing from the GMEC by 3-4 kcal/mol. With REMC, the diversity of the sequences sampled agrees with exact enumeration where the latter is possible: up to 2 kcal/mol above the GMEC. Beyond, room temperature replicas sample sequences up to 10 kcal/mol above the GMEC, providing thermal averages and a solution to the inverse protein folding problem. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- David Mignon
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| |
|
25
|
Gaillard T, Panel N, Simonson T. Protein side chain conformation predictions with an MMGBSA energy function. Proteins 2016; 84:803-19. [PMID: 26948696 DOI: 10.1002/prot.25030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 10/02/2015] [Revised: 02/22/2016] [Accepted: 02/27/2016] [Indexed: 12/17/2022]
Abstract
The prediction of protein side chain conformations from backbone coordinates is an important task in structural biology, with applications in structure prediction and protein design. It is a difficult problem due to its combinatorial nature. We study the performance of an "MMGBSA" energy function, implemented in our protein design program Proteus, which combines molecular mechanics terms, a Generalized Born and Surface Area (GBSA) solvent model, with approximations that make the model pairwise additive. Proteus is not a competitor to specialized side chain prediction programs due to its cost, but it allows protein design applications, where side chain prediction is an important step and MMGBSA an effective energy model. We predict the side chain conformations for 18 proteins. The side chains are first predicted individually, with the rest of the protein in its crystallographic conformation. Next, all side chains are predicted together. The contributions of individual energy terms are evaluated and various parameterizations are compared. We find that the GB and SA terms, with an appropriate choice of the dielectric constant and surface energy coefficients, are beneficial for single side chain predictions. For the prediction of all side chains, however, errors due to the pairwise additive approximation overcome the improvement brought by these terms. We also show the crucial contribution of side chain minimization to alleviate the rigid rotamer approximation. Even without GB and SA terms, we obtain accuracies comparable to SCWRL4, a specialized side chain prediction program. In particular, we obtain a better RMSD than SCWRL4 for core residues (at a higher cost), despite our simpler rotamer library. Proteins 2016; 84:803-819. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Thomas Gaillard
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Nicolas Panel
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Thomas Simonson
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| |
|
26
|
Simonson T, Ye-Lehmann S, Palmai Z, Amara N, Wydau-Dematteis S, Bigan E, Druart K, Moch C, Plateau P. Redesigning the stereospecificity of tyrosyl-tRNA synthetase. Proteins 2016; 84:240-53. [PMID: 26676967 DOI: 10.1002/prot.24972] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 07/28/2015] [Revised: 09/30/2015] [Accepted: 11/26/2015] [Indexed: 12/14/2022]
Abstract
D-Amino acids are largely excluded from protein synthesis, yet they are of great interest in biotechnology. Unnatural amino acids have been introduced into proteins using engineered aminoacyl-tRNA synthetases (aaRSs), and this strategy might be applicable to D-amino acids. Several aaRSs can aminoacylate their tRNA with a D-amino acid; of these, tyrosyl-tRNA synthetase (TyrRS) has the weakest stereospecificity. We use computational protein design to suggest active site mutations in Escherichia coli TyrRS that could increase its D-Tyr binding further, relative to L-Tyr. The mutations selected all modify one or more sidechain charges in the Tyr binding pocket. We test their effect by probing the aminoacyl-adenylation reaction through pyrophosphate exchange experiments. We also perform extensive alchemical free energy simulations to obtain L-Tyr/D-Tyr binding free energy differences. Agreement with experiment is good, validating the structural models and detailed thermodynamic predictions the simulations provide. The TyrRS stereospecificity proves hard to engineer through charge-altering mutations in the first and second coordination shells of the Tyr ammonium group. Of six mutants tested, two are active towards D-Tyr; one of these has an inverted stereospecificity, with a large preference for D-Tyr. However, its activity is low. Evidently, the TyrRS stereospecificity is robust towards charge rearrangements near the ligand. Future design may have to consider more distant and/or electrically neutral target mutations, and possibly design for binding of the transition state, whose structure however can only be modeled.
Affiliation(s)
- Thomas Simonson
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | | | - Zoltan Palmai
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Najette Amara
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Sandra Wydau-Dematteis
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Erwan Bigan
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Karen Druart
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Clara Moch
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| | - Pierre Plateau
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
| |
|
27
|
Ghobadi AF, Letteri R, Parelkar SS, Zhao Y, Chan-Seng D, Emrick T, Jayaraman A. Dispersing Zwitterions into Comb Polymers for Nonviral Transfection: Experiments and Molecular Simulation. Biomacromolecules 2016; 17:546-57. [DOI: 10.1021/acs.biomac.5b01462] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023]
Affiliation(s)
- Ahmadreza F. Ghobadi
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
| | - Rachel Letteri
- Department of Polymer Science and Engineering, University of Massachusetts, 120 Governors Drive, Amherst, Massachusetts 01003, United States
| | - Sangram S. Parelkar
- Department of Polymer Science and Engineering, University of Massachusetts, 120 Governors Drive, Amherst, Massachusetts 01003, United States
| | - Yue Zhao
- Quantum Beam Science Center, Japan Atomic Energy Agency, Tokai, Ibaraki 319-1195, Japan
| | - Delphine Chan-Seng
- Institut Charles Sadron UPR22-CNRS, 23 rue du Loess, 67034 Strasbourg, France
| | - Todd Emrick
- Department of Polymer Science and Engineering, University of Massachusetts, 120 Governors Drive, Amherst, Massachusetts 01003, United States
| | - Arthi Jayaraman
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
- Department of Materials Science and Engineering, University of Delaware, 201 DuPont Hall, Newark, Delaware 19716, United States
| |
|
28
|
Druart K, Palmai Z, Omarjee E, Simonson T. Protein:Ligand binding free energies: A stringent test for computational protein design. J Comput Chem 2015; 37:404-15. [PMID: 26503829 DOI: 10.1002/jcc.24230] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/01/2015] [Revised: 10/01/2015] [Accepted: 10/02/2015] [Indexed: 01/29/2023]
Abstract
A computational protein design method is extended to allow Monte Carlo simulations where two ligands are titrated into a protein binding pocket, yielding binding free energy differences. These provide a stringent test of the physical model, including the energy surface and sidechain rotamer definition. As a test, we consider tyrosyl-tRNA synthetase (TyrRS), which has been extensively redesigned experimentally. We consider its specificity for its substrate L-tyrosine (L-Tyr), compared to the analogs D-Tyr, p-acetyl-, and p-azido-phenylalanine (ac-Phe, az-Phe). We simulate L- and D-Tyr binding to TyrRS and six mutants, and compare the structures and binding free energies to a more rigorous "MD/GBSA" procedure: molecular dynamics with explicit solvent for structures and a Generalized Born + Surface Area model for binding free energies. Next, we consider L-Tyr, ac- and az-Phe binding to six other TyrRS variants. The titration results are sensitive to the precise rotamer definition, which involves a short energy minimization for each sidechain pair to help relax bad contacts induced by the discrete rotamer set. However, when designed mutant structures are rescored with a standard GBSA energy model, results agree well with the more rigorous MD/GBSA. As a third test, we redesign three amino acid positions in the substrate coordination sphere, with either L-Tyr or D-Tyr as the ligand. For two, we obtain good agreement with experiment, recovering the wildtype residue when L-Tyr is the ligand and a D-Tyr specific mutant when D-Tyr is the ligand. For the third, we recover His with either ligand, instead of wildtype Gln.
Affiliation(s)
- Karen Druart
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| | - Zoltan Palmai
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| | - Eyaz Omarjee
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
| |
|
29
|
Gaillard T, Simonson T. Pairwise decomposition of an MMGBSA energy function for computational protein design. J Comput Chem 2014; 35:1371-87. [PMID: 24854675 DOI: 10.1002/jcc.23637] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 01/16/2014] [Revised: 04/14/2014] [Accepted: 05/01/2014] [Indexed: 02/02/2023]
Abstract
Computational protein design (CPD) aims at predicting new proteins or modifying existing ones. The computational challenge is huge as it requires exploring an enormous sequence and conformation space. The difficulty can be reduced by considering a fixed backbone and a discrete set of sidechain conformations. Another common strategy consists in precalculating a pairwise energy matrix, from which the energy of any sequence/conformation can be quickly obtained. In this work, we examine the pairwise decomposition of protein MMGBSA energy functions from a general theoretical perspective, and an implementation proposed earlier for CPD. It includes a Generalized Born term, whose many-body character is overcome using an effective dielectric environment, and a Surface Area term, for which we present an improved pairwise decomposition. A detailed evaluation of the error introduced by the decomposition on the different energy components is performed. We show that the error remains reasonable, compared to other uncertainties.
Affiliation(s)
- Thomas Gaillard
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
| | | |
|
30
|
Kleinjung J, Fraternali F. Design and application of implicit solvent models in biomolecular simulations. Curr Opin Struct Biol 2014; 25:126-34. [PMID: 24841242 PMCID: PMC4045398 DOI: 10.1016/j.sbi.2014.04.003] [Citation(s) in RCA: 106] [Impact Index Per Article: 10.6] [Received: 02/12/2014] [Revised: 04/07/2014] [Accepted: 04/09/2014] [Indexed: 11/23/2022]
Abstract
Implicit solvent replaces explicit water by a potential of mean force. Popular models are SASA, VOL and Generalized Born. Implicit solvent is used in MD, protein modelling, folding, design, prediction and drug screening. Large-scale simulations allow for parametrisation via force matching. Application to nucleic acids and membranes is challenging.
We review implicit solvent models and their parametrisation by introducing the concepts and recent developments of the most popular models with a focus on parametrisation via force matching. An overview of recent applications of the solvation energy term in protein dynamics, modelling, design and prediction is given to illustrate the usability and versatility of implicit solvation in reproducing the physical behaviour of biomolecular systems. Limitations of implicit models are discussed through the example of more challenging systems like nucleic acids and membranes.
Collapse
Affiliation(s)
- Jens Kleinjung
- Division of Mathematical Biology, MRC National Institute for Medical Research, The Ridgeway, London NW7 1AA, United Kingdom
- Franca Fraternali
- Randall Division of Cell and Molecular Biophysics, King's College London, New Hunt's House, London SE1 1UL, United Kingdom.
|
31
|
Mukhopadhyay A, Aguilar BH, Tolokh IS, Onufriev AV. Introducing Charge Hydration Asymmetry into the Generalized Born Model. J Chem Theory Comput 2014; 10:1788-1794. [PMID: 24803871 PMCID: PMC3985468 DOI: 10.1021/ct4010917] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 12/18/2013] [Indexed: 12/15/2022]
Abstract
The effect of charge hydration asymmetry (CHA), the non-invariance of solvation free energy upon solute charge inversion, is missing from the standard linear response continuum electrostatics. The proposed charge hydration asymmetric-generalized Born (CHA-GB) approximation introduces this effect into the popular generalized Born (GB) model. The CHA is added to the GB equation via an analytical correction that quantifies the specific propensity of CHA of a given water model; the latter is determined by the charge distribution within the water model. Significant variations in CHA seen in explicit water (TIP3P, TIP4P-Ew, and TIP5P-E) free energy calculations on charge-inverted "molecular bracelets" are closely reproduced by CHA-GB, with the accuracy similar to models such as SEA and 3D-RISM that go beyond the linear response. Compared against reference explicit (TIP3P) electrostatic solvation free energies, CHA-GB shows about a 40% improvement in accuracy over the canonical GB, tested on a diverse set of 248 rigid small neutral molecules (root mean square error, rmse = 0.88 kcal/mol for CHA-GB vs 1.24 kcal/mol for GB) and 48 conformations of amino acid analogs (rmse = 0.81 kcal/mol vs 1.26 kcal/mol). CHA-GB employs a novel definition of the dielectric boundary that does not subsume the CHA effects into the intrinsic atomic radii. The strategy leads to finding a new set of intrinsic atomic radii optimized for CHA-GB; these radii show physically meaningful variation with the atom type, in contrast to the radii set optimized for GB. Compared to several popular radii sets used with the original GB model, the new radii set shows better transferability between different classes of molecules.
Affiliation(s)
- Boris H. Aguilar
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Igor S. Tolokh
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V. Onufriev
- Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
|
32
|
Polydorides S, Simonson T. Monte Carlo simulations of proteins at constant pH with generalized Born solvent, flexible sidechains, and an effective dielectric boundary. J Comput Chem 2013; 34:2742-56. [PMID: 24122878 DOI: 10.1002/jcc.23450] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 05/14/2013] [Revised: 09/04/2013] [Accepted: 09/08/2013] [Indexed: 12/11/2022]
Abstract
Titratable residues determine the acid/base behavior of proteins, strongly influencing their function; in addition, proton binding is a valuable reporter on electrostatic interactions. We describe a method for pKa calculations, using constant-pH Monte Carlo (MC) simulations to explore the space of sidechain conformations and protonation states, with an efficient and accurate generalized Born model (GB) for the solvent effects. To overcome the many-body dependency of the GB model, we use a "Native Environment" approximation, whose accuracy is shown to be good. It allows the precalculation and storage of interactions between all sidechain pairs, a strategy borrowed from computational protein design, which makes the MC simulations themselves very fast. The method is tested for 12 proteins and 167 titratable sidechains. It gives an rms error of 1.1 pH units, similar to the trivial "Null" model. The only adjustable parameter is the protein dielectric constant. The best accuracy is achieved for values between 4 and 8, a range that is physically plausible for a protein interior. For sidechains with large pKa shifts, ≥2, the rms error is 1.6, compared to 2.5 with the Null model and 1.5 with the empirical PROPKA method.
Affiliation(s)
- Savvas Polydorides
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
|
33
|
Simonson T. What Is the Dielectric Constant of a Protein When Its Backbone Is Fixed? J Chem Theory Comput 2013; 9:4603-8. [DOI: 10.1021/ct400398e] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
|
34
|
Simonson T, Gaillard T, Mignon D, Schmidt am Busch M, Lopes A, Amara N, Polydorides S, Sedano A, Druart K, Archontis G. Computational protein design: the Proteus software and selected applications. J Comput Chem 2013; 34:2472-84. [PMID: 24037756 DOI: 10.1002/jcc.23418] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 05/13/2013] [Revised: 07/08/2013] [Accepted: 07/28/2013] [Indexed: 12/13/2022]
Abstract
We describe an automated procedure for protein design, implemented in a flexible software package, called Proteus. System setup and calculation of an energy matrix are done with the XPLOR modeling program and its sophisticated command language, supporting several force fields and solvent models. A second program provides algorithms to search sequence space. It allows a decomposition of the system into groups, which can be combined in different ways in the energy function, for both positive and negative design. The whole procedure can be controlled by editing 2-4 scripts. Two applications consider the tyrosyl-tRNA synthetase enzyme and its successful redesign to bind both O-methyl-tyrosine and D-tyrosine. For the latter, we present Monte Carlo simulations where the D-tyrosine concentration is gradually increased, displacing L-tyrosine from the binding pocket and yielding the binding free energy difference, in good agreement with experiment. Complete redesign of the Crk SH3 domain is presented. The top 10000 sequences are all assigned to the correct fold by the SUPERFAMILY library of Hidden Markov Models. Finally, we report the acid/base behavior of the SNase protein. Sidechain protonation is treated as a form of mutation; it is then straightforward to perform constant-pH Monte Carlo simulations, which yield good agreement with experiment. Overall, the software can be used for a wide range of applications, producing not only native-like sequences but also thermodynamic properties with errors that appear comparable to other current software packages.
Affiliation(s)
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, 91128, France
|
35
|
Equilibrium and folding simulations of NS4B H2 in pure water and water/2,2,2-trifluoroethanol mixed solvent: examination of solvation models. J Mol Model 2013; 19:3931-9. [DOI: 10.1007/s00894-013-1933-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 04/08/2013] [Accepted: 06/23/2013] [Indexed: 10/26/2022]
|
36
|
Designing electrostatic interactions in biological systems via charge optimization or combinatorial approaches: insights and challenges with a continuum electrostatic framework. Theor Chem Acc 2012. [DOI: 10.1007/s00214-012-1252-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
|
37
|
Ceres N, Pasi M, Lavery R. A Protein Solvation Model Based on Residue Burial. J Chem Theory Comput 2012; 8:2141-4. [DOI: 10.1021/ct3001552] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Affiliation(s)
- Nicoletta Ceres
- Bases Moléculaires et Structurales des Systèmes Infectieux, Université Lyon I/CNRS UMR 5086, IBCP, 7 Passage du Vercors, 69367 Lyon, France
- Marco Pasi
- Bases Moléculaires et Structurales des Systèmes Infectieux, Université Lyon I/CNRS UMR 5086, IBCP, 7 Passage du Vercors, 69367 Lyon, France
- Richard Lavery
- Bases Moléculaires et Structurales des Systèmes Infectieux, Université Lyon I/CNRS UMR 5086, IBCP, 7 Passage du Vercors, 69367 Lyon, France
|
38
|
Protein-water interactions in MD simulations: POPS/POPSCOMP solvent accessibility analysis, solvation forces and hydration sites. Methods Mol Biol 2012; 819:375-92. [PMID: 22183548 DOI: 10.1007/978-1-61779-465-0_23] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
The effects of solvation on molecular recognition are investigated from different perspectives, ranging from methods to analyse explicit solvent dynamical behaviour at the protein surface to methods for the implicit treatment of solvent effects associated with the conformational behaviour of biomolecules. The implicit solvation method presented here is based on an analytical approximation of the Solvent Accessible Surface Area (SASA) of solute molecules, which is computationally efficient and easy to parametrise. The parametrised SASA solvation method is discussed in the light of protein design and ligand binding studies. The POPS program for the SASA computation on single molecules and complex interfaces is described in detail. Explicit solvent behaviour is described here in the form of solvent density maps at the protein surface. We highlight the usefulness of that approach in defining the organisation of specific water molecules at functional sites and in determining hydrophobicity scores for the identification of potential interaction patches.
|
39
|
Polydorides S, Amara N, Aubard C, Plateau P, Simonson T, Archontis G. Computational protein design with a generalized Born solvent model: application to Asparaginyl-tRNA synthetase. Proteins 2011; 79:3448-68. [PMID: 21563215 DOI: 10.1002/prot.23042] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 11/29/2010] [Revised: 02/25/2011] [Accepted: 03/03/2011] [Indexed: 12/13/2022]
Abstract
Computational Protein Design (CPD) is a promising method for high throughput protein and ligand mutagenesis. Recently, we developed a CPD method that used a polar-hydrogen energy function for protein interactions and a Coulomb/Accessible Surface Area (CASA) model for solvent effects. We applied this method to engineer aspartyl-adenylate (AspAMP) specificity into Asparaginyl-tRNA synthetase (AsnRS), whose substrate is asparaginyl-adenylate (AsnAMP). Here, we implement a more accurate function, with an all-atom energy for protein interactions and a residue-pairwise generalized Born model for solvent effects. As a first test, we compute aminoacid affinities for several point mutants of Aspartyl-tRNA synthetase (AspRS) and Tyrosyl-tRNA synthetase and stability changes for three helical peptides and compare with experiment. As a second test, we readdress the problem of AsnRS aminoacid engineering. We compare three design criteria, which optimize the folding free-energy, the absolute AspAMP affinity, and the relative (AspAMP-AsnAMP) affinity. The sequences and conformations are improved with respect to our previous, polar-hydrogen/CASA study: For several designed complexes, the AspAMP carboxylate forms three interactions with a conserved arginine and a designed lysine, as in the active site of the AspRS:AspAMP complex. The conformations and interactions are well maintained in molecular dynamics simulations and the sequences have an inverted specificity, favoring AspAMP over AsnAMP. The method is not fully successful, since experimental measurements with the seven most promising sequences show that they do not catalyze at a detectable level the adenylation of Asp (or Asn) with ATP. This may be due to weak AspAMP binding and/or disruption of transition-state stabilization.
|
40
|
Allison JR, Boguslawski K, Fraternali F, van Gunsteren WF. A Refined, Efficient Mean Solvation Force Model that Includes the Interior Volume Contribution. J Phys Chem B 2011; 115:4547-57. [DOI: 10.1021/jp2017117] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Affiliation(s)
- Jane R. Allison
- Laboratory of Physical Chemistry, Swiss Federal Institute of Technology ETH, 8093 Zürich, Switzerland
- Katharina Boguslawski
- Laboratory of Physical Chemistry, Swiss Federal Institute of Technology ETH, 8093 Zürich, Switzerland
- Franca Fraternali
- Randall Division of Cell and Molecular Biophysics, King’s College London, London SE1 1UL, United Kingdom
- Wilfred F. van Gunsteren
- Laboratory of Physical Chemistry, Swiss Federal Institute of Technology ETH, 8093 Zürich, Switzerland
|
41
|
Larsson P, Lindahl E. A high-performance parallel-generalized Born implementation enabled by tabulated interaction rescaling. J Comput Chem 2010; 31:2593-600. [PMID: 20740558 DOI: 10.1002/jcc.21552] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 12/27/2022]
Abstract
Implicit solvent representations, in general, and generalized Born models, in particular, provide an attractive way to reduce the number of interactions and degrees of freedom in a system. The instantaneous relaxation of the dielectric shielding provided by an implicit solvent model can be extremely efficient for high-throughput and Monte Carlo studies, and a reduced system size can also remove a lot of statistical noise. Despite these advantages, it has been difficult for generalized Born implementations to significantly outperform optimized explicit-water simulations due to more complex functional forms and the two extra interaction stages necessary to calculate Born radii and the derivative chain rule terms contributing to the force. Here, we present a method that uses a rescaling transformation to make the standard generalized Born expression a function of a single variable, which enables an efficient tabulated implementation on any modern CPU hardware. The total performance is within a factor 2 of simulations in vacuo. The algorithm has been implemented in Gromacs, including single-instruction multiple-data acceleration, for three different Born radius models and corresponding chain rule terms. We have also adapted the model to work with the virtual interaction sites commonly used for hydrogens to enable long-time steps, which makes it possible to achieve a simulation performance of 0.86 μs/day for BBA5 with 1-nm cutoff on a single quad-core desktop processor. Finally, we have also implemented a set of streaming kernels without neighborlists to accelerate the non-cutoff setup occasionally used for implicit solvent simulations of small systems.
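The single-variable rescaling described in this abstract can be sketched as follows. The sketch assumes the widely used Still form of the GB effective distance, f_GB = sqrt(r^2 + R_i R_j exp(-r^2 / (4 R_i R_j))), and the substitution x = r / sqrt(R_i R_j); the abstract does not spell out the transformation, so this is one natural reading rather than the Gromacs implementation. The function names, grid spacing, and table range are illustrative assumptions.

```python
import math

def f_gb(r, Ri, Rj):
    """Still-type GB effective distance for separation r and Born radii Ri, Rj."""
    RiRj = Ri * Rj
    return math.sqrt(r * r + RiRj * math.exp(-r * r / (4.0 * RiRj)))

def g(x):
    """Reduced function: 1/f_GB = g(x) / sqrt(Ri*Rj), with x = r / sqrt(Ri*Rj)."""
    return 1.0 / math.sqrt(x * x + math.exp(-x * x / 4.0))

# Tabulate g once on a uniform grid (spacing and range are illustrative).
DX = 0.001
N = 50000
TABLE = [g(k * DX) for k in range(N + 1)]

def inv_f_gb_tabulated(r, Ri, Rj):
    """1/f_GB from the one-dimensional table, using linear interpolation."""
    s = math.sqrt(Ri * Rj)
    u = (r / s) / DX
    if u >= N:                      # beyond the table: evaluate directly
        return g(r / s) / s
    k = int(u)
    t = u - k
    return ((1.0 - t) * TABLE[k] + t * TABLE[k + 1]) / s
```

Because g depends on x alone, one table serves every atom pair regardless of the individual Born radii, which is what makes a tabulated kernel practical; a production code would typically use a smoother interpolation scheme and tabulate the derivative as well.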
Affiliation(s)
- Per Larsson
- Center for Biomembrane Research, Department of Biochemistry & Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden
|
42
|
Launay G, Simonson T. A large decoy set of protein-protein complexes produced by flexible docking. J Comput Chem 2010; 32:106-20. [DOI: 10.1002/jcc.21604] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
|
43
|
Aleksandrov A, Polydorides S, Archontis G, Simonson T. Predicting the Acid/Base Behavior of Proteins: A Constant-pH Monte Carlo Approach with Generalized Born Solvent. J Phys Chem B 2010; 114:10634-48. [DOI: 10.1021/jp104406x] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Affiliation(s)
- Alexey Aleksandrov
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France, and Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
- Savvas Polydorides
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France, and Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
- Georgios Archontis
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France, and Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France, and Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
|
44
|
Lopes A, Schmidt Am Busch M, Simonson T. Computational design of protein-ligand binding: modifying the specificity of asparaginyl-tRNA synthetase. J Comput Chem 2010; 31:1273-86. [PMID: 19862811 DOI: 10.1002/jcc.21414] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/13/2022]
Abstract
A method for computational design of protein-ligand interactions is implemented and tested on the asparaginyl- and aspartyl-tRNA synthetase enzymes (AsnRS, AspRS). The substrate specificity of these enzymes is crucial for the accurate translation of the genetic code. The method relies on a molecular mechanics energy function and a simple, continuum electrostatic, implicit solvent model. As test calculations, we first compute AspRS-substrate binding free energy changes due to nine point mutations, for which experimental data are available; we also perform large-scale redesign of the entire active site of each enzyme (40 amino acids) and compare to experimental sequences. We then apply the method to engineer an increased binding of aspartyl-adenylate (AspAMP) into AsnRS. Mutants are obtained using several directed evolution protocols, where four or five amino acid positions in the active site are randomized. Promising mutants are subjected to molecular dynamics simulations; Poisson-Boltzmann calculations provide an estimate of the corresponding, AspAMP, binding free energy changes, relative to the native AsnRS. Several of the mutants are predicted to have an inverted binding specificity, preferring to bind AspAMP rather than the natural substrate, AsnAMP. The computed binding affinities are significantly weaker than the native, AsnRS:AsnAMP affinity, and in most cases, the active site structure is significantly changed, compared to the native complex. This almost certainly precludes catalytic activity. One of the designed sequences has a higher affinity and more native-like structure and may represent a valid candidate for Asp activity.
Affiliation(s)
- Anne Lopes
- Laboratoire de Biochimie, Department of Biology, UMR CNRS 7654, Ecole Polytechnique, 91128 Palaiseau, France
|
45
|
Juneja A, Numata J, Nilsson L, Knapp EW. Merging Implicit with Explicit Solvent Simulations: Polyethylene Glycol. J Chem Theory Comput 2010; 6:1871-83. [DOI: 10.1021/ct100075m] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Affiliation(s)
- Alok Juneja
- Freie Universität Berlin, Institute of Chemistry & Biochemistry, Fabeckstr. 36a, D-14195 Berlin, Germany and Centre for Biosciences, Department of Biosciences and Nutrition, Karolinska Institutet, SE-141 83 Huddinge, Sweden
- Jorge Numata
- Freie Universität Berlin, Institute of Chemistry & Biochemistry, Fabeckstr. 36a, D-14195 Berlin, Germany and Centre for Biosciences, Department of Biosciences and Nutrition, Karolinska Institutet, SE-141 83 Huddinge, Sweden
- Lennart Nilsson
- Freie Universität Berlin, Institute of Chemistry & Biochemistry, Fabeckstr. 36a, D-14195 Berlin, Germany and Centre for Biosciences, Department of Biosciences and Nutrition, Karolinska Institutet, SE-141 83 Huddinge, Sweden
- Ernst Walter Knapp
- Freie Universität Berlin, Institute of Chemistry & Biochemistry, Fabeckstr. 36a, D-14195 Berlin, Germany and Centre for Biosciences, Department of Biosciences and Nutrition, Karolinska Institutet, SE-141 83 Huddinge, Sweden
|
46
|
Schmidt am Busch M, Sedano A, Simonson T. Computational protein design: validation and possible relevance as a tool for homology searching and fold recognition. PLoS One 2010; 5:e10410. [PMID: 20463972 PMCID: PMC2864755 DOI: 10.1371/journal.pone.0010410] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 10/15/2009] [Accepted: 03/31/2010] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Protein fold recognition usually relies on a statistical model of each fold; each model is constructed from an ensemble of natural sequences belonging to that fold. A complementary strategy may be to employ sequence ensembles produced by computational protein design. Designed sequences can be more diverse than natural sequences, possibly avoiding some limitations of experimental databases. METHODOLOGY/PRINCIPAL FINDINGS We explore this strategy for four SCOP families: small Kunitz-type inhibitors (SKIs), Interleukin-8 chemokines, PDZ domains, and large Caspase catalytic subunits, represented by 43 structures. An automated procedure is used to redesign the 43 proteins. We use the experimental backbones as fixed templates in the folded state and a molecular mechanics model to compute the interaction energies between sidechain and backbone groups. Calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is used to scan the sequence and conformational space, yielding 200,000-300,000 sequences per backbone template. The results confirm and generalize our earlier study of SH2 and SH3 domains. The designed sequences resemble moderately-distant, natural homologues of the initial templates; e.g., the SUPERFAMILY, profile Hidden-Markov Model library recognizes 85% of the low-energy sequences as native-like. Conversely, Position Specific Scoring Matrices derived from the sequences can be used to detect natural homologues within the SwissProt database: 60% of known PDZ domains are detected and around 90% of known SKIs and chemokines. Energy components and inter-residue correlations are analyzed and ways to improve the method are discussed. CONCLUSIONS/SIGNIFICANCE For some families, designed sequences can be a useful complement to experimental ones for homologue searching. However, improved tools are needed to extract more information from the designed profiles before the method can be of general use.
Affiliation(s)
- Marcel Schmidt am Busch
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
- Audrey Sedano
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
|
47
|
am Busch MS, Mignon D, Simonson T. Computational protein design as a tool for fold recognition. Proteins 2009; 77:139-58. [PMID: 19408297 DOI: 10.1002/prot.22426] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Abstract
Computationally designed protein sequences have been proposed as a basis to perform fold recognition and homology searching. To investigate this possibility, an automated procedure is used to completely redesign 24 SH3 proteins and 22 SH2 proteins. We use the experimental backbone coordinates as fixed templates in the folded state and a molecular mechanics model to compute the pairwise interaction energies between all sidechain types and conformations. Energy calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is then used to scan the sequence and conformational space for optimal solutions. We produced 200,000-450,000 sequences for each backbone template. The designed sequences resemble moderately-distant, natural homologues of the initial templates, according to their identity scores and their similarity with respect to the Pfam sets of SH2 and SH3 domains. Standard homology detection tools document their native-like character: the Conserved Domain Database recognizes 61% (52%) of our low-energy sequences as SH3 (SH2) domains; the SUPERFAMILY, Hidden-Markov Model library recognizes 81% (84%). Conversely, position specific scoring matrices (PSSMs) derived from our designed sequences can be used to detect natural homologues in sequence databases. Within SwissProt, a set of natural SH3 PSSMs detects 772 SH3 domains, for example; our designed PSSMs detect 67% of these, plus one additional sequence and two false positives. If six amino acids involved in substrate binding (a selective pressure not accounted for in our design) are reset to their experimental types, then 77% of the experimental SH3 domains are detected. Results for the SH2 domains are similar. Several directions to improve the method further are discussed.
Affiliation(s)
- Marcel Schmidt am Busch
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
|
48
|
Suárez M, Tortosa P, Jaramillo A. PROTDES: CHARMM toolbox for computational protein design. Systems and Synthetic Biology 2009; 2:105-13. [PMID: 19572216 PMCID: PMC2735645 DOI: 10.1007/s11693-009-9026-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/22/2008] [Revised: 05/17/2009] [Accepted: 05/30/2009] [Indexed: 12/13/2022]
Abstract
We present an open-source software package able to automatically mutate any residue positions and find the best amino acids in an arbitrary protein structure without requiring pairwise approximations. Our software, PROTDES, is based on CHARMM and it searches automatically for mutations optimizing a protein folding free energy. PROTDES allows the integration of molecular dynamics within the protein design. We have implemented a heuristic optimization algorithm that iteratively searches for the best amino acids and their conformations for an arbitrary set of positions within a structure. Our software allows CHARMM users to perform protein design calculations and to create their own procedures for protein design using their own energy functions. We show this by implementing three different energy functions based on different solvent treatments: surface area accessibility, generalized Born using molecular volume and an effective energy function. PROTDES, a tutorial, parameter sets, configuration tools and examples are freely available at http://soft.synth-bio.org/protdes.html.
Affiliation(s)
- María Suárez
- Biochemistry Laboratory, CNRS—UMR 7654, Ecole Polytechnique, 91128 Palaiseau, France
- SYNTH-BIO group Epigenomics Project, Genopole Tour Evry2, etage 10, 523, Terrasses de l’Agora, 91034 Evry Cedex, France
- Pablo Tortosa
- Biochemistry Laboratory, CNRS-UMR 7654, Ecole Polytechnique, 91128 Palaiseau, France
- Alfonso Jaramillo
- Biochemistry Laboratory, CNRS-UMR 7654, Ecole Polytechnique, 91128 Palaiseau, France
- SYNTH-BIO group Epigenomics Project, Genopole Tour Evry2, etage 10, 523, Terrasses de l’Agora, 91034 Evry Cedex, France
|
49
|
Nkari WK, Prestegard JH. NMR resonance assignments of sparsely labeled proteins: amide proton exchange correlations in native and denatured states. J Am Chem Soc 2009; 131:5344-9. [PMID: 19317468 DOI: 10.1021/ja8100775] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 02/06/2023]
Abstract
Protein NMR assignments of large proteins using traditional triple resonance techniques depend on double or triple labeling of samples with (15)N, (13)C, and (2)H. This is not always practical with proteins that require expression in nonbacterial hosts. Labeling with isotopically labeled versions of single amino acids (sparse labeling) often is possible; however, resonance assignment then requires a new strategy. Here a procedure for the assignment of cross-peaks in (15)N-(1)H correlation spectra of sparsely labeled proteins is presented. It relies on the correlation of proton-deuterium amide exchange rates in native and denatured spectra of the intact protein, followed by correlation of chemical shifts in the spectra of the denatured protein with chemical shifts of sequenced peptides derived from the protein. The procedure is successfully demonstrated on a sample of a protein, Galectin-3, selectively labeled with (15)N at all alanine residues.
Affiliation(s)
- Wendy K Nkari
- Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Road, Athens, Georgia 30602, USA
|
50
|
Lu M, Dousis AD, Ma J. OPUS-Rota: a fast and accurate method for side-chain modeling. Protein Sci 2008; 17:1576-85. [PMID: 18556476 DOI: 10.1110/ps.035022.108] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Indexed: 10/22/2022]
Abstract
In this paper, we introduce a fast and accurate side-chain modeling method, named OPUS-Rota. In a benchmark comparison with the methods SCWRL, NCN, LGA, SPRUCE, Rosetta, and SCAP, OPUS-Rota is shown to be much faster than all the methods except SCWRL, which is comparably fast. In terms of overall chi (1) and chi (1+2) accuracies, however, OPUS-Rota is 5.4 and 8.8 percentage points better, respectively, than SCWRL. Compared with NCN, which has the best accuracy in the literature, OPUS-Rota is 1.6 percentage points better for overall chi (1+2) but 0.3 percentage points weaker for overall chi (1). Hence, our algorithm is much more accurate than SCWRL with similar execution speed, and it has accuracy comparable to or better than the most accurate methods in the literature, but with a runtime that is one or two orders of magnitude shorter. In addition, OPUS-Rota consistently outperforms SCWRL on the Wallner and Elofsson homology-modeling benchmark set when the sequence identity is greater than 40%. We hope that OPUS-Rota will contribute to high-accuracy structure refinement, and the computer program is freely available for academic users.
Affiliation(s)
- Mingyang Lu
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, USA
|