1
|
Tolokh IS, Folescu DE, Onufriev AV. Inclusion of Water Multipoles into the Implicit Solvation Framework Leads to Accuracy Gains. J Phys Chem B 2024; 128:5855-5873. [PMID: 38860842 PMCID: PMC11194828 DOI: 10.1021/acs.jpcb.4c00254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2024] [Revised: 05/28/2024] [Accepted: 05/29/2024] [Indexed: 06/12/2024]
Abstract
The current practical "workhorses" of the atomistic implicit solvation─the Poisson-Boltzmann (PB) and generalized Born (GB) models─face fundamental accuracy limitations. Here, we propose a computationally efficient implicit solvation framework, the Implicit Water Multipole GB (IWM-GB) model, that systematically incorporates the effects of multipole moments of water molecules in the first hydration shell of a solute, beyond the dipole water polarization already present at the PB/GB level. The framework explicitly accounts for coupling between polar and nonpolar contributions to the total solvation energy, which is missing from many implicit solvation models. An implementation of the framework, utilizing the GAFF force field and AM1-BCC atomic partial charges model, is parametrized and tested against the experimental hydration free energies of small molecules from the FreeSolv database. The resulting accuracy on the test set (RMSE ∼ 0.9 kcal/mol) is 12% better than that of the explicit solvation (TIP3P) treatment, which is orders of magnitude slower. We also find that the coupling between polar and nonpolar parts of the solvation free energy is essential to ensuring that several features of the IWM-GB model are physically meaningful, including the sign of the nonpolar contributions.
Collapse
Affiliation(s)
- Igor S. Tolokh
- Department
of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Dan E. Folescu
- Department
of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Department
of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Alexey V. Onufriev
- Department
of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Department
of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Center
for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
| |
Collapse
|
2
|
Bass L, Elder LH, Folescu DE, Forouzesh N, Tolokh IS, Karpatne A, Onufriev AV. Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment. J Chem Theory Comput 2024; 20:396-410. [PMID: 38149593 PMCID: PMC10950260 DOI: 10.1021/acs.jctc.3c00981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2023]
Abstract
The accuracy of computational models of water is key to atomistic simulations of biomolecules. We propose a computationally efficient way to improve the accuracy of the prediction of hydration-free energies (HFEs) of small molecules: the remaining errors of the physics-based models relative to the experiment are predicted and mitigated by machine learning (ML) as a postprocessing step. Specifically, the trained graph convolutional neural network attempts to identify the "blind spots" in the physics-based model predictions, where the complex physics of aqueous solvation is poorly accounted for, and partially corrects for them. The strategy is explored for five classical solvent models representing various accuracy/speed trade-offs, from the fast analytical generalized Born (GB) to the popular TIP3P explicit solvent model; experimental HFEs of small neutral molecules from the FreeSolv set are used for the training and testing. For all of the models, the ML correction reduces the resulting root-mean-square error relative to the experiment for HFEs of small molecules, without significant overfitting and with negligible computational overhead. For example, on the test set, the relative accuracy improvement is 47% for the fast analytical GB, making it, after the ML correction, almost as accurate as uncorrected TIP3P. For the TIP3P model, the accuracy improvement is about 39%, bringing the ML-corrected model's accuracy below the 1 kcal/mol threshold. In general, the relative benefit of the ML corrections is smaller for more accurate physics-based models, reaching the lower limit of about 20% relative accuracy gain compared with that of the physics-based treatment alone. The proposed strategy of using ML to learn the remaining error of physics-based models offers a distinct advantage over training ML alone directly on reference HFEs: it preserves the correct overall trend, even well outside of the training set.
Collapse
Affiliation(s)
- Lewis Bass
- Department of Computer Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Luke H Elder
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Dan E Folescu
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Negin Forouzesh
- Department of Computer Science, California State University, Los Angeles, California 90032, United States
| | - Igor S Tolokh
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Anuj Karpatne
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Alexey V Onufriev
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
| |
Collapse
|
3
|
Setiadi J, Boothroyd S, Slochower DR, Dotson DL, Thompson MW, Wagner JR, Wang LP, Gilson MK. Tuning Potential Functions to Host-Guest Binding Data. J Chem Theory Comput 2024; 20:239-252. [PMID: 38147689 PMCID: PMC10838530 DOI: 10.1021/acs.jctc.3c01050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2023]
Abstract
Software to more rapidly and accurately predict protein-ligand binding affinities is of high interest for early-stage drug discovery, and physics-based methods are among the most widely used technologies for this purpose. The accuracy of these methods depends critically on the accuracy of the potential functions that they use. Potential functions are typically trained against a combination of quantum chemical and experimental data. However, although binding affinities are among the most important quantities to predict, experimental binding affinities have not to date been integrated into the experimental data set used to train potential functions. In recent years, the use of host-guest complexes as simple and tractable models of binding thermodynamics has gained popularity due to their small size and simplicity, relative to protein-ligand systems. Host-guest complexes can also avoid ambiguities that arise in protein-ligand systems such as uncertain protonation states. Thus, experimental host-guest binding data are an appealing additional data type to integrate into the experimental data set used to optimize potential functions. Here, we report the extension of the Open Force Field Evaluator framework to enable the systematic calculation of host-guest binding free energies and their gradients with respect to force field parameters, coupled with the curation of 126 host-guest complexes with available experimental binding free energies. As an initial application of this novel infrastructure, we optimized generalized Born (GB) cavity radii for the OBC2 GB implicit solvent model against experimental data for 36 host-guest systems. This refitting led to a dramatic improvement in accuracy for both the training set and a separate test set with 90 additional host-guest systems. The optimized radii also showed encouraging transferability from host-guest systems to 59 protein-ligand systems. However, the new radii are significantly smaller than the baseline radii and lead to excessively favorable hydration free energies (HFEs). Thus, users of the OBC2 GB model currently may choose between GB cavity radii that yield more accurate binding affinities and GB cavity radii that yield more accurate HFEs. We suspect that achieving good accuracy on both will require more far-reaching adjustments to the GB model. We note that binding free-energy calculations using the OBC2 model in OpenMM gain about a 10× speedup relative to corresponding explicit solvent calculations, suggesting a future role for implicit solvent absolute binding free-energy (ABFE) calculations in virtual compound screening. This study proves the principle of using host-guest systems to train potential functions that are transferrable to protein-ligand systems and provides an infrastructure that enables a range of applications.
Collapse
Affiliation(s)
- Jeffry Setiadi
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9255 Pharmacy Lane, La Jolla, California 92093, United States
| | - Simon Boothroyd
- Boothroyd Scientific Consulting Ltd., London WC2H 9JQ, U.K
- Psivant Therapeutics, Boston, Massachusetts 02210, United States
| | | | - David L Dotson
- Datryllic LLC, Phoenix, Arizona 85003, United States
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Matthew W Thompson
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Jeffrey R Wagner
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Lee-Ping Wang
- Chemistry Department, University of California Davis, Davis, California 95616, United States
| | - Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9255 Pharmacy Lane, La Jolla, California 92093, United States
| |
Collapse
|
4
|
Parkin D, Takano M. The Intrinsic Radius as a Key Parameter in the Generalized Born Model to Adjust Protein-Protein Electrostatic Interaction. Int J Mol Sci 2023; 24:ijms24054700. [PMID: 36902130 PMCID: PMC10003186 DOI: 10.3390/ijms24054700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2023] [Revised: 01/30/2023] [Accepted: 02/19/2023] [Indexed: 03/05/2023] Open
Abstract
The generalized Born (GB) model is an extension of the continuum dielectric theory of Born solvation energy and is a powerful method for accelerating the molecular dynamic (MD) simulations of charged biological molecules in water. While the effective dielectric constant of water that varies as a function of the separation distance between solute molecules is incorporated into the GB model, adjustment of the parameters is indispensable for accurate calculation of the Coulomb (electrostatic) energy. One of the key parameters is the lower limit of the spatial integral of the energy density of the electric field around a charged atom, known as the intrinsic radius ρ. Although ad hoc adjustment of ρ has been conducted to improve the Coulombic (ionic) bond stability, the physical mechanism by which ρ affects the Coulomb energy remains unclear. Via energetic analysis of three differently sized systems, here, we clarify that the Coulomb bond stability increases with increasing ρ and that the increased stability is caused by the interaction energy term, not by the self-energy (desolvation energy) term, as was supposed previously. Our results suggest that the use of larger values for the intrinsic radii of hydrogen and oxygen atoms, together with the use of a relatively small value for the spatial integration cutoff in the GB model, can better reproduce the Coulombic attraction between protein molecules.
Collapse
Affiliation(s)
- Dan Parkin
- Research Institute for Science and Engineering, Waseda University, Okubo 3-4-1, Sinjuku-ku, Tokyo 169-8555, Japan
| | - Mitsunori Takano
- Research Institute for Science and Engineering, Waseda University, Okubo 3-4-1, Sinjuku-ku, Tokyo 169-8555, Japan
- Department of Pure and Applied Physics, Waseda University, Okubo 3-4-1, Sinjuku-ku, Tokyo 169-8555, Japan
- Correspondence: ; Tel.: +81-3-5286-3512
| |
Collapse
|
5
|
Kunche L, Natarajan U. Conformations and Solvation of Synthetic Polymers in Water by Generalized Born Implicit-Solvent Molecular Dynamics Simulations: Stereoisomers of Poly(acrylic acid) and Poly(methacrylic acid). J Phys Chem B 2023; 127:1244-1253. [PMID: 36705523 DOI: 10.1021/acs.jpcb.2c06658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]
Abstract
We present the GB-OBC model as an approach for implicit-solvent MD simulations of a synthetic macromolecule in water. The model is tested and found to be successful in reproducing the chain dimensions and predicting the free energy of solvation of carboxylic acid vinyl polymers. The influence of stereochemistry and the hydrophobic nature of the polymer was investigated as a function of chain length (20 < N < 600) for poly(acrylic acid) (PAA) and poly(methacrylic acid) (PMA). The dimensionless parameters of the GB-OBC model were parameterized to be applicable to PAA and PMA. Scaling relations for chain dimensions obtained using implicit-solvent MD simulations in this study are in good agreement with those from experiments, theory of solvated chains in good solvents, and all-atom MD simulations in explicit water. Results show that ⟨Rg2⟩/NL2 is greater for the atactic chain as compared to the isotactic chain, for PAA as well as PMA. ⟨Rg2⟩/NL2 values of chains attain constancy in water for N = 200, with the values being greater for PMA. The PMA chain is conformationally more perturbed than the PAA chain, for both isotactic and atactic stereochemistry. The solvation free energy ΔGhyd of PAA and PMA in water is negative for all chain lengths (N = 20-600) and becomes more favorable with an increase in molecular weight. The ΔGhyd values for isotactic and atactic chains are identical at lower values of N but differ slightly for N > 300. Irrespective of the hydrophobic nature of the polymer, the atactic chain is thermodynamically more soluble in water as compared to the isotactic chain. The isotactic chain is less hydrophilic as compared to the atactic chain due to the closer proximity of the COOH groups along the backbone. This implicit solvent method is an effective way to accurately simulate the configurational properties and solvation of synthetic polymers in water.
Collapse
Affiliation(s)
- Lakshmikumar Kunche
- Macromolecular Modeling and Simulation Lab, Department of Chemical Engineering, Indian Institute of Technology (IIT) Madras, Chennai600036, India
| | - Upendra Natarajan
- Macromolecular Modeling and Simulation Lab, Department of Chemical Engineering, Indian Institute of Technology (IIT) Madras, Chennai600036, India
| |
Collapse
|
6
|
Michael E, Polydorides S, Archontis G. Computational Design of Peptides with Improved Recognition of the Focal Adhesion Kinase FAT Domain. Methods Mol Biol 2022; 2405:383-402. [PMID: 35298823 DOI: 10.1007/978-1-0716-1855-4_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
We describe a two-stage computational protein design (CPD) methodology for the design of peptides binding to the FAT domain of the protein focal adhesion kinase. The first stage involves high-throughput CPD calculations with the Proteus software. The energies of the folded state are described by a physics-based energy function and of the unfolded peptides by a knowledge-based model that reproduces aminoacid compositions consistent with a helicity scale. The obtained sequences are filtered in terms of the affinity and the stability of the complex. In the second stage, design sequences are further evaluated by all-atom molecular dynamics simulations and binding free energy calculations with a molecular mechanics/implicit solvent free energy function.
Collapse
Affiliation(s)
- Eleni Michael
- Department of Physics, University of Cyprus, Nicosia, Cyprus
| | | | | |
Collapse
|
7
|
Polydorides S, Archontis G. Computational optimization of the SARS-CoV-2 receptor-binding-motif affinity for human ACE2. Biophys J 2021; 120:2859-2871. [PMID: 33984310 PMCID: PMC8110322 DOI: 10.1016/j.bpj.2021.02.049] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2020] [Revised: 01/19/2021] [Accepted: 02/15/2021] [Indexed: 01/15/2023] Open
Abstract
The coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is responsible for the coronavirus disease 2019 pandemic, and the closely related SARS-CoV coronavirus enter cells by binding at the human angiotensin converting enzyme 2 (hACE2). The stronger hACE2 affinity of SARS-CoV-2 has been connected with its higher infectivity. In this work, we study hACE2 complexes with the receptor-binding domains (RBDs) of the human SARS-CoV-2 and human SARS-CoV viruses, using all-atom molecular dynamics simulations and computational protein design with a physics-based energy function. The molecular dynamics simulations identify charge-modifying substitutions between the CoV-2 and CoV RBDs, which either increase or decrease the hACE2 affinity of the SARS-CoV-2 RBD. The combined effect of these mutations is small, and the relative affinity is mainly determined by substitutions at residues in contact with hACE2. Many of these findings are in line and interpret recent experiments. Our computational protein design calculations redesign positions 455, 493, 494, and 501 of the SARS-CoV-2 receptor binding motif, which contact hACE2 in the complex and are important for ACE2 recognition. Sampling is enhanced by an adaptive importance sampling Monte Carlo method. Sequences with increased affinity replace CoV-2 glutamine by a negative residue at position 493; serine by a nonpolar or aromatic residue or an asparagine at position 494; and asparagine by valine or threonine at position 501. Substitutions at positions 455 and 501 have a smaller effect on affinity. Substitutions suggested by our design are seen in viral sequences encountered in other species, including bat and pangolin. Our results might be used to identify potential virus strains with higher human infectivity and assist in the design of peptide-based or peptidomimetic compounds with the potential to inhibit SARS-CoV-2 binding at hACE2.
Collapse
|
8
|
Forouzesh N, Mishra N. An Effective MM/GBSA Protocol for Absolute Binding Free Energy Calculations: A Case Study on SARS-CoV-2 Spike Protein and the Human ACE2 Receptor. Molecules 2021; 26:2383. [PMID: 33923909 PMCID: PMC8074138 DOI: 10.3390/molecules26082383] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2021] [Revised: 04/09/2021] [Accepted: 04/13/2021] [Indexed: 12/23/2022] Open
Abstract
The binding free energy calculation of protein-ligand complexes is necessary for research into virus-host interactions and the relevant applications in drug discovery. However, many current computational methods of such calculations are either inefficient or inaccurate in practice. Utilizing implicit solvent models in the molecular mechanics generalized Born surface area (MM/GBSA) framework allows for efficient calculations without significant loss of accuracy. Here, GBNSR6, a new flavor of the generalized Born model, is employed in the MM/GBSA framework for measuring the binding affinity between SARS-CoV-2 spike protein and the human ACE2 receptor. A computational protocol is developed based on the widely studied Ras-Raf complex, which has similar binding free energy to SARS-CoV-2/ACE2. Two options for representing the dielectric boundary of the complexes are evaluated: one based on the standard Bondi radii and the other based on a newly developed set of atomic radii (OPT1), optimized specifically for protein-ligand binding. Predictions based on the two radii sets provide upper and lower bounds on the experimental references: -14.7(ΔGbindBondi)<-10.6(ΔGbindExp.)<-4.1(ΔGbindOPT1) kcal/mol. The consensus estimates of the two bounds show quantitative agreement with the experiment values. This work also presents a novel truncation method and computational strategies for efficient entropy calculations with normal mode analysis. Interestingly, it is observed that a significant decrease in the number of snapshots does not affect the accuracy of entropy calculation, while it does lower computation time appreciably. The proposed MM/GBSA protocol can be used to study the binding mechanism of new variants of SARS-CoV-2, as well as other relevant structures.
Collapse
Affiliation(s)
- Negin Forouzesh
- Department of Computer Science, California State University, Los Angeles, CA 90032, USA
| | - Nikita Mishra
- Department of Chemistry and Biochemistry, California State University, Los Angeles, CA 90032, USA;
| |
Collapse
|
9
|
Corrigan RA, Qi G, Thiel AC, Lynn JR, Walker BD, Casavant TL, Lagardere L, Piquemal JP, Ponder JW, Ren P, Schnieders MJ. Implicit Solvents for the Polarizable Atomic Multipole AMOEBA Force Field. J Chem Theory Comput 2021; 17:2323-2341. [PMID: 33769814 DOI: 10.1021/acs.jctc.0c01286] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
Computational protein design, ab initio protein/RNA folding, and protein-ligand screening can be too computationally demanding for explicit treatment of solvent. For these applications, implicit solvent offers a compelling alternative, which we describe here for the polarizable atomic multipole AMOEBA force field based on three treatments of continuum electrostatics: numerical solutions to the nonlinear and linearized versions of the Poisson-Boltzmann equation (PBE), the domain-decomposition conductor-like screening model (ddCOSMO) approximation to the PBE, and the analytic generalized Kirkwood (GK) approximation. The continuum electrostatics models are combined with a nonpolar estimator based on novel cavitation and dispersion terms. Electrostatic model parameters are numerically optimized using a least-squares style target function based on a library of 103 small-molecule solvation free energy differences. Mean signed errors for the adaptive Poisson-Boltzmann solver (APBS), ddCOSMO, and GK models are 0.05, 0.00, and 0.00 kcal/mol, respectively, while the mean unsigned errors are 0.70, 0.63, and 0.58 kcal/mol, respectively. Validation of the electrostatic response of the resulting implicit solvents, which are available in the Tinker (or Tinker-HP), OpenMM, and Force Field X software packages, is based on comparisons to explicit solvent simulations for a series of proteins and nucleic acids. Overall, the emergence of performative implicit solvent models for polarizable force fields opens the door to their use for folding and design applications.
Collapse
Affiliation(s)
- Rae A Corrigan
- Roy J Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Guowei Qi
- Department of Biochemistry, University of Iowa, Iowa City, Iowa 52242, United States
| | - Andrew C Thiel
- Roy J Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Jack R Lynn
- Roy J Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Brandon D Walker
- Department of Biomedical Engineering, University of Texas in Austin, Austin, Texas 78712, United States
| | - Thomas L Casavant
- Roy J Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Louis Lagardere
- Department of Chemistry, Sorbonne Université, F-75005 Paris, France
| | | | - Jay W Ponder
- Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
| | - Pengyu Ren
- Department of Biomedical Engineering, University of Texas in Austin, Austin, Texas 78712, United States
| | - Michael J Schnieders
- Roy J Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, United States.,Department of Biochemistry, University of Iowa, Iowa City, Iowa 52242, United States
| |
Collapse
|
10
|
Mahmoud SSM, Esposito G, Serra G, Fogolari F. Generalized Born radii computation using linear models and neural networks. Bioinformatics 2020; 36:1757-1764. [PMID: 31693089 DOI: 10.1093/bioinformatics/btz818] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2019] [Revised: 10/08/2019] [Accepted: 10/30/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Implicit solvent models play an important role in describing the thermodynamics and the dynamics of biomolecular systems. Key to an efficient use of these models is the computation of generalized Born (GB) radii, which is accomplished by algorithms based on the electrostatics of inhomogeneous dielectric media. The speed and accuracy of such computations are still an issue especially for their intensive use in classical molecular dynamics. Here, we propose an alternative approach that encodes the physics of the phenomena and the chemical structure of the molecules in model parameters which are learned from examples. RESULTS GB radii have been computed using (i) a linear model and (ii) a neural network. The input is the element, the histogram of counts of neighbouring atoms, divided by atom element, within 16 Å. Linear models are ca. 8 times faster than the most widely used reference method and the accuracy is higher with correlation coefficient with the inverse of 'perfect' GB radii of 0.94 versus 0.80 of the reference method. Neural networks further improve the accuracy of the predictions with correlation coefficient with 'perfect' GB radii of 0.97 and ca. 20% smaller root mean square error. AVAILABILITY AND IMPLEMENTATION We provide a C program implementing the computation using the linear model, including the coefficients appropriate for the set of Bondi radii, as Supplementary Material. We also provide a Python implementation of the neural network model with parameter and example files in the Supplementary Material as well. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Saida Saad Mohamed Mahmoud
- Department of Mathematics, Informatics and Physics, University of Udine, Udine 33100, Italy.,Faculty of Science, Cairo University, Giza 12613, Egypt
| | - Gennaro Esposito
- Science and Math Division, New York University at Abu Dhabi, PO Box 129188, Abu Dhabi, United Arab Emirates
| | - Giuseppe Serra
- Department of Mathematics, Informatics and Physics, University of Udine, Udine 33100, Italy
| | - Federico Fogolari
- Department of Mathematics, Informatics and Physics, University of Udine, Udine 33100, Italy
| |
Collapse
|
11
|
Forouzesh N, Onufriev AV. MMGB/SA Consensus Estimate of the Binding Free Energy Between the Novel Coronavirus Spike Protein to the Human ACE2 Receptor. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2020:2020.08.25.267625. [PMID: 32869029 PMCID: PMC7457614 DOI: 10.1101/2020.08.25.267625] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]
Abstract
The ability to estimate protein-protein binding free energy in a computationally efficient via a physics-based approach is beneficial to research focused on the mechanism of viruses binding to their target proteins. Implicit solvation methodology may be particularly useful in the early stages of such research, as it can offer valuable insights into the binding process, quickly. Here we evaluate the potential of the related molecular mechanics generalized Born surface area (MMGB/SA) approach to estimate the binding free energy ΔGbind between the SARS-CoV-2 spike receptor-binding domain and the human ACE2 receptor. The calculations are based on a recent flavor of the generalized Born model, GBNSR6. Two estimates of ΔGbind are performed: one based on standard bondi radii, and the other based on a newly developed set of atomic radii (OPT1), optimized specifically for protein-ligand binding. We take the average of the resulting two ΔGbind values as the consensus estimate. For the well-studied Ras-Raf protein-protein complex, which has similar binding free energy to that of the SARS-CoV-2/ACE2 complex, the consensus ΔGbind = -11.8 ± 1 kcal/mol, vs. experimental -9.7 ± 0.2 kcal/mol. The consensus estimates for the SARS-CoV-2/ACE2 complex is ΔGbind = -9.4 ± 1.5 kcal/mol, which is in near quantitative agreement with experiment (-10.6 kcal/mol). The availability of a conceptually simple MMGB/SA-based protocol for analysis of the SARS-CoV-2 /ACE2 binding may be beneficial in light of the need to move forward fast.
Collapse
Affiliation(s)
- Negin Forouzesh
- Department of Computer Science, California State University, Los Angeles, Los Angeles, CA 90032, USA
| | - Alexey V Onufriev
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA
- Department of Physics, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA
- Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA
| |
Collapse
|
12
|
Michael E, Polydorides S, Promponas VJ, Skourides P, Archontis G. Recognition of LD motifs by the focal adhesion targeting domains of focal adhesion kinase and proline-rich tyrosine kinase 2-beta: Insights from molecular dynamics simulations. Proteins 2020; 89:29-52. [PMID: 32776636 DOI: 10.1002/prot.25992] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2020] [Revised: 06/21/2020] [Accepted: 07/26/2020] [Indexed: 12/13/2022]
Abstract
The focal adhesion kinase (FAK) and the proline-rich tyrosine kinase 2-beta (PYK2) are implicated in cancer progression and metastasis and represent promising biomarkers and targets for cancer therapy. FAK and PYK2 are recruited to focal adhesions (FAs) via interactions between their FA targeting (FAT) domains and conserved segments (LD motifs) on the proteins Paxillin, Leupaxin, and Hic-5. A promising new approach for the inhibition of FAK and PYK2 targets interactions of the FAK domains with proteins that promote localization at FAs. Advances toward this goal include the development of surface plasmon resonance, heteronuclear single quantum coherence nuclear magnetic resonance (HSQC-NMR) and fluorescence polarization assays for the identification of fragments or compounds interfering with the FAK-Paxillin interaction. We have recently validated this strategy, showing that Paxillin mimicking polypeptides with 2 to 3 LD motifs displace FAK from FAs and block kinase-dependent and independent functions of FAK, including downstream integrin signaling and FA localization of the protein p130Cas. In the present work we study by all-atom molecular dynamics simulations the recognition of peptides with the Paxillin and Leupaxin LD motifs by the FAK-FAT and PYK2-FAT domains. Our simulations and free-energy analysis interpret experimental data on binding of Paxillin and Leupaxin LD motifs at FAK-FAT and PYK2-FAT binding sites, and assess the roles of consensus LD regions and flanking residues. Our results can assist in the design of effective inhibitory peptides of the FAK-FAT: Paxillin and PYK2-FAT:Leupaxin complexes and the construction of pharmacophore models for the discovery of potential small-molecule inhibitors of the FAK-FAT and PYK2-FAT focal adhesion based functions.
Collapse
Affiliation(s)
- Eleni Michael
- Department of Physics, University of Cyprus, Nicosia, Cyprus
| | | | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | - Paris Skourides
- Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | | |
Collapse
|
13
|
Forouzesh N, Mukhopadhyay A, Watson LT, Onufriev AV. Multidimensional Global Optimization and Robustness Analysis in the Context of Protein-Ligand Binding. J Chem Theory Comput 2020; 16:4669-4684. [PMID: 32450041 PMCID: PMC8594251 DOI: 10.1021/acs.jctc.0c00142] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Accuracy of protein-ligand binding free energy calculations utilizing implicit solvent models is critically affected by parameters of the underlying dielectric boundary, specifically, the atomic and water probe radii. Here, a global multidimensional optimization pipeline is developed to find optimal atomic radii specifically for protein-ligand binding calculations in implicit solvent. The computational pipeline has these three key components: (1) a massively parallel implementation of a deterministic global optimization algorithm (VTDIRECT95), (2) an accurate yet reasonably fast generalized Born implicit solvent model (GBNSR6), and (3) a novel robustness metric that helps distinguish between nearly degenerate local minima via a postprocessing step of the optimization. A graph-based "kT-connectivity" approach to explore and visualize the multidimensional energy landscape is proposed: local minima that can be reached from the global minimum without exceeding a given energy threshold (kT) are considered to be connected. As an illustration of the capabilities of the optimization pipeline, we apply it to find a global optimum in the space of just five radii: four atomic (O, H, N, and C) radii and water probe radius. The optimized radii, ρW = 1.37 Å, ρC = 1.40 Å, ρH = 1.55 Å, ρN = 2.35 Å, and ρO = 1.28 Å, lead to a closer agreement of electrostatic binding free energies with the explicit solvent reference than two commonly used sets of radii previously optimized for small molecules. At the same time, the ability of the optimizer to find the global optimum reveals fundamental limits of the common two-dielectric implicit solvation model: the computed electrostatic binding free energies are still almost 4 kcal/mol away from the explicit solvent reference. The proposed computational approach opens the possibility to further improve the accuracy of practical computational protocols for binding free energy calculations.
Collapse
Affiliation(s)
- Negin Forouzesh
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
| | - Abhishek Mukhopadhyay
- Department of Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
| | - Layne T Watson
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Department of Mathematics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Department of Aerospace and Ocean Engineering, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
| | - Alexey V Onufriev
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Department of Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
| |
Collapse
|
14
|
Beker W, Sokalski WA. Bottom-Up Nonempirical Approach To Reducing Search Space in Enzyme Design Guided by Catalytic Fields. J Chem Theory Comput 2020; 16:3420-3429. [PMID: 32282205 PMCID: PMC7467639 DOI: 10.1021/acs.jctc.0c00139] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
Currently developed protocols of theozyme design still lead to biocatalysts with much lower catalytic activity than enzymes existing in nature, and, so far, the only avenue of improvement was the in vitro laboratory-directed evolution (LDE) experiments. In this paper, we propose a different strategy based on "reversed" methodology of mutation prediction. Instead of common "top-down" approach, requiring numerous assumptions and vast computational effort, we argue for a "bottom-up" approach that is based on the catalytic fields derived directly from transition state and reactant complex wave functions. This enables direct one-step determination of the general quantitative angular characteristics of optimal catalytic site and simultaneously encompasses both the transition-state stabilization (TSS) and ground-state destabilization (GSD) effects. We further extend the static catalytic field approach by introducing a library of atomic multipoles for amino acid side-chain rotamers, which, together with the catalytic field, allow one to determine the optimal side-chain orientations of charged amino acids constituting the elusive structure of a preorganized catalytic environment. Obtained qualitative agreement with experimental LDE data for Kemp eliminase KE07 mutants validates the proposed procedure, yielding, in addition, a detailed insight into possible dynamic and epistatic effects.
Collapse
Affiliation(s)
- Wiktor Beker
- Wroclaw University of Science and Technology, Wroclaw, Poland
| | | |
Collapse
|
15
|
Wang E, Sun H, Wang J, Wang Z, Liu H, Zhang JZH, Hou T. End-Point Binding Free Energy Calculation with MM/PBSA and MM/GBSA: Strategies and Applications in Drug Design. Chem Rev 2019; 119:9478-9508. [DOI: 10.1021/acs.chemrev.9b00055] [Citation(s) in RCA: 578] [Impact Index Per Article: 115.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Affiliation(s)
- Ercheng Wang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Huiyong Sun
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Junmei Wang
- Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Zhe Wang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Hui Liu
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - John Z. H. Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU−ECNU Center for Computational Chemistry, NYU Shanghai, Shanghai 200122, China
- Department of Chemistry, New York University, New York, New York 10003, United States
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
| | - Tingjun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| |
Collapse
|
16
|
Abstract
It would often be useful in computer simulations to use an implicit description of solvation effects, instead of explicitly representing the individual solvent molecules. Continuum dielectric models often work well in describing the thermodynamic aspects of aqueous solvation and can be very efficient compared to the explicit treatment of the solvent. Here, we review a particular class of so-called fast implicit solvent models, generalized Born (GB) models, which are widely used for molecular dynamics (MD) simulations of proteins and nucleic acids. These approaches model hydration effects and provide solvent-dependent forces with efficiencies comparable to molecular-mechanics calculations on the solute alone; as such, they can be incorporated into MD or other conformational searching strategies in a straightforward manner. The foundations of the GB model are reviewed, followed by examples of newer, emerging models and examples of important applications. We discuss their strengths and weaknesses, both for fidelity to the underlying continuum model and for the ability to replace explicit consideration of solvent molecules in macromolecular simulations.
Collapse
Affiliation(s)
- Alexey V Onufriev
- Departments of Computer Science and Physics, Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24060, USA;
| | - David A Case
- Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA;
| |
Collapse
|
17
|
Siebenmorgen T, Zacharias M. Evaluation of Predicted Protein-Protein Complexes by Binding Free Energy Simulations. J Chem Theory Comput 2019; 15:2071-2086. [PMID: 30698954 DOI: 10.1021/acs.jctc.8b01022] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The accurate prediction of protein-protein complex geometries is of major importance to ultimately model the complete interactome of interacting proteins in a cell. A major bottleneck is the realistic free energy evaluation of predicted docked structures. Typically, simple scoring functions applied to single-complex structures are employed that neglect conformational entropy and often solvent effects completely. The binding free energy of a predicted protein-protein complex can, however, be calculated using umbrella sampling (US) along a predefined dissociation/association coordinate of a complex. We employed atomistic US-molecular dynamics simulations including appropriate conformational and axial restraints and an implicit generalized Born solvent model to calculate binding free energies of a large set of docked decoys for 20 different complexes. Free energies associated with the restraints were calculated separately. In principle, the approach includes all energetic and entropic contributions to the binding process. The evaluation of docked complexes based on binding free energy calculation was in better agreement with experiment compared to a simple scoring based on energy minimization or MD refinement using exactly the same force field description. Even calculated absolute binding free energies of structures close to the native binding geometry showed a reasonable correlation to experiment. However, still for a number of complexes docked decoys of lower free energy than near-native geometries were found indicating inaccuracies in the force field or the implicit solvent model. Although time consuming the approach may open up a new route for realistic ranking of predicted geometries based on calculated free energy of binding.
Collapse
Affiliation(s)
- Till Siebenmorgen
- Physik-Department T38 , Technische Universität München , James-Franck-Strasse 1 , 85748 Garching , Germany
| | - Martin Zacharias
- Physik-Department T38 , Technische Universität München , James-Franck-Strasse 1 , 85748 Garching , Germany
| |
Collapse
|
18
|
Tolokh IS, Thomas DG, Onufriev AV. Explicit ions/implicit water generalized Born model for nucleic acids. J Chem Phys 2018; 148:195101. [PMID: 30307229 DOI: 10.1063/1.5027260] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
The ion atmosphere around highly charged nucleic acid molecules plays a significant role in their dynamics, structure, and interactions. Here we utilized the implicit solvent framework to develop a model for the explicit treatment of ions interacting with nucleic acid molecules. The proposed explicit ions/implicit water model is based on a significantly modified generalized Born (GB) model and utilizes a non-standard approach to define the solute/solvent dielectric boundary. Specifically, the model includes modifications to the GB interaction terms for the case of multiple interacting solutes-disconnected dielectric boundary around the solute-ion or ion-ion pairs. A fully analytical description of all energy components for charge-charge interactions is provided. The effectiveness of the approach is demonstrated by calculating the potential of mean force for Na+-Cl- ion pair and by carrying out a set of Monte Carlo (MC) simulations of mono- and trivalent ions interacting with DNA and RNA duplexes. The monovalent (Na+) and trivalent (CoHex3+) counterion distributions predicted by the model are in close quantitative agreement with all-atom explicit water molecular dynamics simulations used as reference. Expressed in the units of energy, the maximum deviations of local ion concentrations from the reference are within k B T. The proposed explicit ions/implicit water GB model is able to resolve subtle features and differences of CoHex distributions around DNA and RNA duplexes. These features include preferential CoHex binding inside the major groove of the RNA duplex, in contrast to CoHex biding at the "external" surface of the sugar-phosphate backbone of the DNA duplex; these differences in the counterion binding patters were earlier shown to be responsible for the observed drastic differences in condensation propensities between short DNA and RNA duplexes. MC simulations of CoHex ions interacting with the homopolymeric poly(dA·dT) DNA duplex with modified (de-methylated) and native thymine bases are used to explore the physics behind CoHex-thymine interactions. The simulations suggest that the ion desolvation penalty due to proximity to the low dielectric volume of the methyl group can contribute significantly to CoHex-thymine interactions. Compared to the steric repulsion between the ion and the methyl group, the desolvation penalty interaction has a longer range and may be important to consider in the context of methylation effects on DNA condensation.
Collapse
Affiliation(s)
- Igor S Tolokh
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, USA
| | - Dennis G Thomas
- Computational Biology, Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, USA
| | - Alexey V Onufriev
- Departments of Computer Science and Physics, Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, USA
| |
Collapse
|
19
|
Aleksandrov A, Lin FY, Roux B, MacKerell AD. Combining the polarizable Drude force field with a continuum electrostatic Poisson-Boltzmann implicit solvation model. J Comput Chem 2018; 39:1707-1719. [PMID: 29737546 DOI: 10.1002/jcc.25345] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2017] [Revised: 02/26/2018] [Accepted: 04/08/2018] [Indexed: 12/13/2022]
Abstract
In this work, we have combined the polarizable force field based on the classical Drude oscillator with a continuum Poisson-Boltzmann/solvent-accessible surface area (PB/SASA) model. In practice, the positions of the Drude particles experiencing the solvent reaction field arising from the fixed charges and induced polarization of the solute must be optimized in a self-consistent manner. Here, we parameterized the model to reproduce experimental solvation free energies of a set of small molecules. The model reproduces well-experimental solvation free energies of 70 molecules, yielding a root mean square difference of 0.8 kcal/mol versus 2.5 kcal/mol for the CHARMM36 additive force field. The polarization work associated with the solute transfer from the gas-phase to the polar solvent, a term neglected in the framework of additive force fields, was found to make a large contribution to the total solvation free energy, comparable to the polar solute-solvent solvation contribution. The Drude PB/SASA also reproduces well the electronic polarization from the explicit solvent simulations of a small protein, BPTI. Model validation was based on comparisons with the experimental relative binding free energies of 371 single alanine mutations. With the Drude PB/SASA model the root mean square deviation between the predicted and experimental relative binding free energies is 3.35 kcal/mol, lower than 5.11 kcal/mol computed with the CHARMM36 additive force field. Overall, the results indicate that the main limitation of the Drude PB/SASA model is the inability of the SASA term to accurately capture non-polar solvation effects. © 2018 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- Alexey Aleksandrov
- Laboratoire d'Optique et Biosciences, CNRS, INSERM, Ecole Polytechnique, Palaiseau F-91128, France
| | - Fang-Yu Lin
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, 20 Penn Street, Baltimore, Maryland 21201
| | - Benoît Roux
- Department of Biochemistry and Molecular Biology, Gordon Center for Integrative Science, 929 E57th Street, University of Chicago, Chicago, Illinois 60637
| | - Alexander D MacKerell
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, 20 Penn Street, Baltimore, Maryland 21201
| |
Collapse
|
20
|
Izadi S, Harris RC, Fenley MO, Onufriev AV. Accuracy Comparison of Generalized Born Models in the Calculation of Electrostatic Binding Free Energies. J Chem Theory Comput 2018; 14:1656-1670. [PMID: 29378399 DOI: 10.1021/acs.jctc.7b00886] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
The need for accurate yet efficient representation of the aqueous environment in biomolecular modeling has led to the development of a variety of generalized Born (GB) implicit solvent models. While many studies have focused on the accuracy of available GB models in predicting solvation free energies, a systematic assessment of the quality of these models in binding free energy calculations, crucial for rational drug design, has not been undertaken. Here, we evaluate the accuracies of eight common GB flavors (GB-HCT, GB-OBC, GB-neck2, GBNSR6, GBSW, GBMV1, GBMV2, and GBMV3), available in major molecular dynamics packages, in predicting the electrostatic binding free energies ( ΔΔ Gel) for a diverse set of 60 biomolecular complexes belonging to four main classes: protein-protein, protein-drug, RNA-peptide, and small complexes. The GB flavors are examined in terms of their ability to reproduce the results from the Poisson-Boltzmann (PB) model, commonly used as accuracy reference in this context. We show that the agreement with the PB of ΔΔ Gel estimates varies widely between different GB models and also across different types of biomolecular complexes, with R2 correlations ranging from 0.3772 to 0.9986. A surface-based "R6" GB model recently implemented in AMBER shows the closest overall agreement with reference PB ( R2 = 0.9949, RMSD = 8.75 kcal/mol). The RNA-peptide and protein-drug complex sets appear to be most challenging for all but one model, as indicated by the large deviations from the PB in ΔΔ Gel. Small neutral complexes present the least challenge for most of the GB models tested. The quantitative demonstration of the strengths and weaknesses of the GB models across the diverse complex types provided here can be used as a guide for practical computations and future development efforts.
Collapse
Affiliation(s)
- Saeed Izadi
- Early Stage Pharmaceutical Development , Genentech Inc. , 1 DNA Way , South San Francisco , California 94080 , United States
| | - Robert C Harris
- Department of Pharmaceutical Sciences , University of Maryland School of Pharmacy , Baltimore , Maryland 21201 , United States
| | - Marcia O Fenley
- Institute of Molecular Biophysics , Florida State University , Tallahassee , Florida 32306-3408 , United States
| | | |
Collapse
|
21
|
Vorobjev YN, Scheraga HA, Vila JA. Coupled molecular dynamics and continuum electrostatic method to compute the ionization pKa's of proteins as a function of pH. Test on a large set of proteins. J Biomol Struct Dyn 2018; 36:561-574. [PMID: 28132613 PMCID: PMC6191177 DOI: 10.1080/07391102.2017.1288169] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2016] [Accepted: 01/20/2017] [Indexed: 10/25/2022]
Abstract
A computational method, to predict the pKa values of the ionizable residues Asp, Glu, His, Tyr, and Lys of proteins, is presented here. Calculation of the electrostatic free-energy of the proteins is based on an efficient version of a continuum dielectric electrostatic model. The conformational flexibility of the protein is taken into account by carrying out molecular dynamics simulations of 10 ns in implicit water. The accuracy of the proposed method of calculation of pKa values is estimated from a test set of experimental pKa data for 297 ionizable residues from 34 proteins. The pKa-prediction test shows that, on average, 57, 86, and 95% of all predictions have an error lower than 0.5, 1.0, and 1.5 pKa units, respectively. This work contributes to our general understanding of the importance of protein flexibility for an accurate computation of pKa, providing critical insight about the significance of the multiple neutral states of acid and histidine residues for pKa-prediction, and may spur significant progress in our effort to develop a fast and accurate electrostatic-based method for pKa-predictions of proteins as a function of pH.
Collapse
Affiliation(s)
- Yury N. Vorobjev
- Institute of Chemical Biology and Fundamental Medicine of the Siberian Branch of the Russian Academy of Science, Lavrentiev Avenue 8, Novosibirsk 630090
- Novosibirsk State University, Novosibirsk 630090, Russia
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301
| | - Harold A. Scheraga
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301
| | - Jorge A. Vila
- IMASL-CONICET, Universidad Nacional de San Luis, Ejército de Los Andes 950, 5700 San Luis, Argentina
| |
Collapse
|
22
|
Brieg M, Setzler J, Albert S, Wenzel W. Generalized Born implicit solvent models for small molecule hydration free energies. Phys Chem Chem Phys 2018; 19:1677-1685. [PMID: 27995260 DOI: 10.1039/c6cp07347f] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
Hydration free energy estimation of small molecules from all-atom simulations was widely investigated in recent years, as it provides an essential test of molecular force fields and our understanding of solvation effects. While explicit solvent representations result in highly accurate models, they also require extensive sampling due to the high number of solvent degrees of freedom. Implicit solvent models, such as those based on the generalized Born model for electrostatic solvation effects and a solvent accessible surface area term for nonpolar contributions (GBSA), significantly reduce the number of degrees of freedom and the computational cost to estimate hydration free energies. However, a recent survey revealed a gap in the accuracy between explicit TIP3P solvent estimates and those computed with many common GBSA models. Here we address this shortcoming by providing a thorough comparison of the performance of three implicit solvent models with different nonpolar contributions and a generalized Born term to estimate experimental hydration free energies. Starting with a minimal set of only ten atom types, we demonstrate that a nonpolar term with atom type dependent surface tension coefficients in combination with an accurate generalized Born term and fully optimized parameters performs best in estimating hydration free energies, even yielding comparable results to the explicit TIP3P water model. Analysis of our results provides evidence that the asymmetric behavior of water around oppositely charged atoms is one of the main sources of error for two of the three implicit solvent models. Explicitly accounting for this effect in the parameterization reduces the corresponding errors, suggesting this as a general strategy for improving implicit solvent models. The findings presented here will help to improve the existing generalized Born based implicit solvent models implemented in state-of-the-art molecular simulation packages.
Collapse
Affiliation(s)
- Martin Brieg
- Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology (KIT), P.O. Box 3640, 76021 Karlsruhe, Germany and Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), P.O. Box 3640, 76021 Karlsruhe, Germany.
| | - Julia Setzler
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), P.O. Box 3640, 76021 Karlsruhe, Germany.
| | - Steffen Albert
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), P.O. Box 3640, 76021 Karlsruhe, Germany.
| | - Wolfgang Wenzel
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), P.O. Box 3640, 76021 Karlsruhe, Germany.
| |
Collapse
|
23
|
Forouzesh N, Izadi S, Onufriev AV. Grid-Based Surface Generalized Born Model for Calculation of Electrostatic Binding Free Energies. J Chem Inf Model 2017; 57:2505-2513. [DOI: 10.1021/acs.jcim.7b00192] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
| | - Saeed Izadi
- Early Stage Pharmaceutical
Development, Genentech Inc., 1 DNA
Way, South San Francisco, California 94080, United States
| | - Alexey V. Onufriev
- Center
for Soft Matter and Biological Physics, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
| |
Collapse
|
24
|
Gaillard T, Simonson T. Full Protein Sequence Redesign with an MMGBSA Energy Function. J Chem Theory Comput 2017; 13:4932-4943. [DOI: 10.1021/acs.jctc.7b00202] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Thomas Gaillard
- Laboratoire de Biochimie
(CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie
(CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
| |
Collapse
|
25
|
Michael E, Polydorides S, Simonson T, Archontis G. Simple models for nonpolar solvation: Parameterization and testing. J Comput Chem 2017; 38:2509-2519. [PMID: 28786118 DOI: 10.1002/jcc.24910] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2017] [Revised: 07/19/2017] [Accepted: 07/20/2017] [Indexed: 12/13/2022]
Abstract
Implicit solvent models are important for many biomolecular simulations. The polarity of aqueous solvent is essential and qualitatively captured by continuum electrostatics methods like Generalized Born (GB). However, GB does not account for the solvent-induced interactions between exposed hydrophobic sidechains or solute-solvent dispersion interactions. These "nonpolar" effects are often modeled through surface area (SA) energy terms, which lack realism, create mathematical singularities, and have a many-body character. We have explored an alternate, Lazaridis-Karplus (LK) gaussian energy density for nonpolar effects and a dispersion (DI) energy term proposed earlier, associated with GB electrostatics. We parameterized several combinations of GB, SA, LK, and DI energy terms, to reproduce 62 small molecule solvation free energies, 387 protein stability changes due to point mutations, and the structures of 8 protein loops. With optimized parameters, the models all gave similar results, with GBLK and GBDILK giving no performance loss compared to GBSA, and mean errors of 1.7 kcal/mol for the stability changes and 2 Å deviations for the loop conformations. The optimized GBLK model gave poor results in MD of the Trpcage mini-protein, but parameters optimized specifically for MD performed well for Trpcage and three other small proteins. Overall, the LK and DI nonpolar terms are valid alternatives to SA treatments for a range of applications. © 2017 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- Eleni Michael
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus
| | - Savvas Polydorides
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus.,Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Georgios Archontis
- Department of Physics, University of Cyprus, PO20537, Nicosia, CY1678, Cyprus
| |
Collapse
|
26
|
Setny P, Dudek A. Explicit Solvent Hydration Benchmark for Proteins with Application to the PBSA Method. J Chem Theory Comput 2017; 13:2762-2776. [PMID: 28498675 DOI: 10.1021/acs.jctc.7b00247] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
Abstract
Explicit and implicit solvent models have a proven record of delivering hydration free energies of small, druglike solutes in reasonable agreement with experiment. Hydration of macromolecules, such as proteins, is to a large extent uncharted territory, with few results shedding light on quantitative consistency between different solvent models, let alone their ability to reproduce real water. In this work, based on extensive explicit solvent simulations employing TIP3P and SPC/E water models we analyze hydration free energy changes between fixed conformations of 5 diverse proteins, including large multidomain structures. For the two solvent models we find better agreement in electrostatic rather than nonpolar contributions (RMSE of 2.3 and 2.7 kcal/mol, respectively), even though absolute values of the latter are typically an order of magnitude smaller. We also highlight the importance of finite size corrections to relative protein hydration free energies, which turn out to be rather large, on the order of several kcal/mol, and are necessary for proper interpretation of results obtained under periodic boundary conditions. We further compare gathered data with predictions of the implicit solvent approach based on the Poisson equation and the surface or volume based nonpolar term. We find definitely lesser consistency than between the two explicit models (RMSE between implicit and TIP3 results of 11.3 and 8.4 kcal/mol for electrostatic and nonpolar contributions, respectively). In the process we determine the value of the protein dielectric constant and the geometric model for the dielectric boundary that provide for the best agreement. Finally, we evaluate the usefulness of surface and volume based models of nonpolar contributions to hydration free energy of large biomolecules.
Collapse
Affiliation(s)
- Piotr Setny
- Centre of New Technologies, University of Warsaw , Banacha 2c, 02-097 Warsaw, Poland
| | - Anita Dudek
- Centre of New Technologies, University of Warsaw , Banacha 2c, 02-097 Warsaw, Poland
| |
Collapse
|
27
|
Mizuhara Y, Parkin D, Umezawa K, Ohnuki J, Takano M. Over-Destabilization of Protein-Protein Interaction in Generalized Born Model and Utility of Energy Density Integration Cutoff. J Phys Chem B 2017; 121:4669-4677. [PMID: 28426223 DOI: 10.1021/acs.jpcb.7b01438] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The generalize Born (GB) model is frequently used in MD simulations of biomolecular systems in aqueous solution. The GB model is usually based on the so-called Coulomb field approximation (CFA) for the energy density integration. In this study, we report that the GB model with CFA overdestabilizes the long-range electrostatic attraction between oppositely charged molecules (ionic bond forming two-helix system and kinesin-tubulin system) when the energy density integration cutoff, rmax, which is used to calculate the Born energy, is set to a large value. We show that employing large rmax, which is usually expected to make simulation results more accurate, worsens the accuracy so that the attraction is changed into repulsion. It is demonstrated that the overdestabilization is caused by the overestimation of the desolvation penalty upon binding that originates from CFA. We point out that the overdestabilization can be corrected by employing a relatively small cutoff (rmax = 10-15 Å), affirming that the GB models, even with CFA, can be used as a powerful tool to theoretically study the protein-protein interaction, particularly on its dynamical aspect, such as binding and unbinding.
Collapse
Affiliation(s)
- Yukinobu Mizuhara
- Department of Pure and Applied Physics, Waseda University , Okubo 3-4-1, Shinjuku-Ku, Tokyo 169-8555, Japan
| | - Dan Parkin
- Department of Advanced Science and Engineering, Waseda University , Okubo 3-4-1, Shinjuku-Ku, Tokyo 169-8555, Japan
| | | | - Jun Ohnuki
- Department of Pure and Applied Physics, Waseda University , Okubo 3-4-1, Shinjuku-Ku, Tokyo 169-8555, Japan
| | - Mitsunori Takano
- Department of Pure and Applied Physics, Waseda University , Okubo 3-4-1, Shinjuku-Ku, Tokyo 169-8555, Japan
| |
Collapse
|
28
|
Mignon D, Panel N, Chen X, Fuentes EJ, Simonson T. Computational Design of the Tiam1 PDZ Domain and Its Ligand Binding. J Chem Theory Comput 2017; 13:2271-2289. [PMID: 28394603 DOI: 10.1021/acs.jctc.6b01255] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
PDZ domains direct protein-protein interactions and serve as models for protein design. Here, we optimized a protein design energy function for the Tiam1 and Cask PDZ domains that combines a molecular mechanics energy, Generalized Born solvent, and an empirical unfolded state model. Designed sequences were recognized as PDZ domains by the Superfamily fold recognition tool and had similarity scores comparable to natural PDZ sequences. The optimized model was used to redesign the two PDZ domains, by gradually varying the chemical potential of hydrophobic amino acids; the tendency of each position to lose or gain a hydrophobic character represents a novel hydrophobicity index. We also redesigned four positions in the Tiam1 PDZ domain involved in peptide binding specificity. The calculated affinity differences between designed variants reproduced experimental data and suggest substitutions with altered specificities.
Collapse
Affiliation(s)
- David Mignon
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique , Palaiseau, France
| | - Nicolas Panel
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique , Palaiseau, France
| | - Xingyu Chen
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique , Palaiseau, France
| | - Ernesto J Fuentes
- Department of Biochemistry, Roy J. & Lucille A. Carver College of Medicine and Holden Comprehensive Cancer Center, University of Iowa , Iowa City, Iowa 52242-1109, United States
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique , Palaiseau, France
| |
Collapse
|
29
|
Harris RC, Deng N, Levy RM, Ishizuka R, Matubayasi N. Computing conformational free energy differences in explicit solvent: An efficient thermodynamic cycle using an auxiliary potential and a free energy functional constructed from the end points. J Comput Chem 2016; 38:1198-1208. [PMID: 28008630 DOI: 10.1002/jcc.24668] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2016] [Revised: 09/02/2016] [Accepted: 10/26/2016] [Indexed: 01/17/2023]
Abstract
Many biomolecules undergo conformational changes associated with allostery or ligand binding. Observing these changes in computer simulations is difficult if their timescales are long. These calculations can be accelerated by observing the transition on an auxiliary free energy surface with a simpler Hamiltonian and connecting this free energy surface to the target free energy surface with free energy calculations. Here, we show that the free energy legs of the cycle can be replaced with energy representation (ER) density functional approximations. We compute: (1) The conformational free energy changes for alanine dipeptide transitioning from the right-handed free energy basin to the left-handed basin and (2) the free energy difference between the open and closed conformations of β-cyclodextrin, a "host" molecule that serves as a model for molecular recognition in host-guest binding. β-cyclodextrin contains 147 atoms compared to 22 atoms for alanine dipeptide, making β-cyclodextrin a large molecule for which to compute solvation free energies by free energy perturbation or integration methods and the largest system for which the ER method has been compared to exact free energy methods. The ER method replaced the 28 simulations to compute each coupling free energy with two endpoint simulations, reducing the computational time for the alanine dipeptide calculation by about 70% and for the β-cyclodextrin by > 95%. The method works even when the distribution of conformations on the auxiliary free energy surface differs substantially from that on the target free energy surface, although some degree of overlap between the two surfaces is required. © 2016 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- Robert C Harris
- Department of Chemistry and Center for Biophysics and Computational Biology and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania, 19122
| | - Nanjie Deng
- Department of Chemistry and Center for Biophysics and Computational Biology and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania, 19122.,Department of Chemistry and Physical Sciences, Dyson College of Arts and Sciences, Pace University, New York, New York, 10038
| | - Ronald M Levy
- Department of Chemistry and Center for Biophysics and Computational Biology and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania, 19122
| | - Ryosuke Ishizuka
- Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, 560-8531, Japan
| | - Nobuyuki Matubayasi
- Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, 560-8531, Japan.,Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto, 615-8520, Japan
| |
Collapse
|
30
|
Deng N, Zhang BW, Levy RM. Connecting free energy surfaces in implicit and explicit solvent: an efficient method to compute conformational and solvation free energies. J Chem Theory Comput 2016; 11:2868-78. [PMID: 26236174 DOI: 10.1021/acs.jctc.5b00264] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]
Abstract
The ability to accurately model solvent effects on free energy surfaces is important for understanding many biophysical processes including protein folding and misfolding, allosteric transitions, and protein–ligand binding. Although all-atom simulations in explicit solvent can provide an accurate model for biomolecules in solution, explicit solvent simulations are hampered by the slow equilibration on rugged landscapes containing multiple basins separated by barriers. In many cases, implicit solvent models can be used to significantly speed up the conformational sampling; however, implicit solvent simulations do not fully capture the effects of a molecular solvent, and this can lead to loss of accuracy in the estimated free energies. Here we introduce a new approach to compute free energy changes in which the molecular details of explicit solvent simulations are retained while also taking advantage of the speed of the implicit solvent simulations. In this approach, the slow equilibration in explicit solvent, due to the long waiting times before barrier crossing, is avoided by using a thermodynamic cycle which connects the free energy basins in implicit solvent and explicit solvent using a localized decoupling scheme. We test this method by computing conformational free energy differences and solvation free energies of the model system alanine dipeptide in water. The free energy changes between basins in explicit solvent calculated using fully explicit solvent paths agree with the corresponding free energy differences obtained using the implicit/explicit thermodynamic cycle to within 0.3 kcal/mol out of ∼3 kcal/mol at only ∼8% of the computational cost. We note that WHAM methods can be used to further improve the efficiency and accuracy of the implicit/explicit thermodynamic cycle.
Collapse
Affiliation(s)
- Nanjie Deng
- Center for Biophysics & Computational Biology and Institute for Computational Molecular Sciences Temple University, Philadelphia, Pennsylvania 19122, United States
| | | | | |
Collapse
|
31
|
Nguyen H, Pérez A, Bermeo S, Simmerling C. Refinement of Generalized Born Implicit Solvation Parameters for Nucleic Acids and Their Complexes with Proteins. J Chem Theory Comput 2015; 11:3714-28. [PMID: 26574454 PMCID: PMC4805114 DOI: 10.1021/acs.jctc.5b00271] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
The Generalized Born (GB) implicit solvent model has undergone significant improvements in accuracy for modeling of proteins and small molecules. However, GB still remains a less widely explored option for nucleic acid simulations, in part because fast GB models are often unable to maintain stable nucleic acid structures or they introduce structural bias in proteins, leading to difficulty in application of GB models in simulations of protein-nucleic acid complexes. Recently, GB-neck2 was developed to improve the behavior of protein simulations. In an effort to create a more accurate model for nucleic acids, a similar procedure to the development of GB-neck2 is described here for nucleic acids. The resulting parameter set significantly reduces absolute and relative energy error relative to Poisson-Boltzmann for both nucleic acids and nucleic acid-protein complexes, when compared to its predecessor GB-neck model. This improvement in solvation energy calculation translates to increased structural stability for simulations of DNA and RNA duplexes, quadruplexes, and protein-nucleic acid complexes. The GB-neck2 model also enables successful folding of small DNA and RNA hairpins to near native structures as determined from comparison with experiment. The functional form and all required parameters are provided here and also implemented in the AMBER software.
Collapse
Affiliation(s)
- Hai Nguyen
- Department of Chemistry, ‡Laufer Center for Physical and Quantitative Biology, and §Department of Biochemistry, Stony Brook University , Stony Brook, New York 11794, USA
| | - Alberto Pérez
- Department of Chemistry, ‡Laufer Center for Physical and Quantitative Biology, and §Department of Biochemistry, Stony Brook University , Stony Brook, New York 11794, USA
| | - Sherry Bermeo
- Department of Chemistry, ‡Laufer Center for Physical and Quantitative Biology, and §Department of Biochemistry, Stony Brook University , Stony Brook, New York 11794, USA
| | - Carlos Simmerling
- Department of Chemistry, ‡Laufer Center for Physical and Quantitative Biology, and §Department of Biochemistry, Stony Brook University , Stony Brook, New York 11794, USA
| |
Collapse
|
32
|
Bergonzo C, Galindo-Murillo R, Cheatham TE. Molecular modeling of nucleic Acid structure: electrostatics and solvation. CURRENT PROTOCOLS IN NUCLEIC ACID CHEMISTRY 2014; 55:7.9.1-27. [PMID: 25631536 DOI: 10.1002/0471142700.nc0709s55] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
This unit presents an overview of computer simulation techniques as applied to nucleic acid systems, ranging from simple in vacuo molecular modeling techniques to more complete all-atom molecular dynamics treatments that include an explicit representation of the environment. The third in a series of four units, this unit focuses on critical issues in solvation and the treatment of electrostatics. UNITS 7.5 & 7.8 introduced the modeling of nucleic acid structure at the molecular level. This included a discussion of how to generate an initial model, how to evaluate the utility or reliability of a given model, and ultimately how to manipulate this model to better understand its structure, dynamics, and interactions. Subject to an appropriate representation of the energy, such as a specifically parameterized empirical force field, the techniques of minimization and Monte Carlo simulation, as well as molecular dynamics (MD) methods, were introduced as a way of sampling conformational space for a better understanding of the relevance of a given model. This discussion highlighted the major limitations with modeling in general. When sampling conformational space effectively, difficult issues are encountered, such as multiple minima or conformational sampling problems, and accurately representing the underlying energy of interaction. In order to provide a realistic model of the underlying energetics for nucleic acids in their native environments, it is crucial to include some representation of solvation (by water) and also to properly treat the electrostatic interactions. These subjects are discussed in detail in this unit.
Collapse
Affiliation(s)
- Christina Bergonzo
- Department of Medicinal Chemistry, University of Utah, Salt Lake City, Utah
| | | | | |
Collapse
|
33
|
Topham CM, Smith JC. Tri-peptide reference structures for the calculation of relative solvent accessible surface area in protein amino acid residues. Comput Biol Chem 2014; 54:33-43. [PMID: 25544680 DOI: 10.1016/j.compbiolchem.2014.11.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2014] [Revised: 11/23/2014] [Accepted: 11/30/2014] [Indexed: 10/24/2022]
Abstract
Relative amino acid residue solvent accessibility values allow the quantitative comparison of atomic solvent-accessible surface areas in different residue types and physical environments in proteins and in protein structural alignments. Geometry-optimised tri-peptide structures in extended solvent-exposed reference conformations have been obtained for 43 amino acid residue types at a high level of quantum chemical theory. Significant increases in side-chain solvent accessibility, offset by reductions in main-chain atom solvent exposure, were observed for standard residue types in partially geometry-optimised structures when compared to non-minimised models built from identical sets of proper dihedral angles abstracted from the literature. Optimisation of proper dihedral angles led most notably to marked increases of up to 54% in proline main-chain atom solvent accessibility compared to literature values. Similar effects were observed for fully-optimised tri-peptides in implicit solvent. The relief of internal strain energy was associated with systematic variation in N, C(α) and C(β) atom solvent accessibility across all standard residue types. The results underline the importance of optimisation of 'hard' degrees of freedom (bond lengths and valence bond angles) and improper dihedral angle values from force field or other context-independent reference values, and impact on the use of standardised fixed internal co-ordinate geometry in sampling approaches to the determination of absolute values of protein amino acid residue solvent accessibility. Quantum chemical methods provide a useful and accurate alternative to molecular mechanics methods to perform energy minimisation of peptides containing non-standard (chemically modified) amino acid residues frequently present in experimental protein structure data sets, for which force field parameters may not be available. Reference tri-peptide atomic co-ordinate sets including hydrogen atoms are made freely available.
Collapse
Affiliation(s)
- Christopher M Topham
- Molecular Forces Consulting, 40 Rue Boyssonne, Toulouse 31400, France; Computational Molecular Biophysics, IWR der Universität Heidelberg, Im Neuenheimer Feld 368, Heidelberg D-69120, Germany; University of Tennessee/Oak Ridge National Laboratory, Center for Molecular Biophysics, P.O. Box 2008, Oak Ridge, TN 37831-6309, USA; Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, M407 Walters Life Sciences, 1414 Cumberland Avenue, Knoxville, TN 37996, USA.
| | - Jeremy C Smith
- Computational Molecular Biophysics, IWR der Universität Heidelberg, Im Neuenheimer Feld 368, Heidelberg D-69120, Germany; University of Tennessee/Oak Ridge National Laboratory, Center for Molecular Biophysics, P.O. Box 2008, Oak Ridge, TN 37831-6309, USA; Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, M407 Walters Life Sciences, 1414 Cumberland Avenue, Knoxville, TN 37996, USA
| |
Collapse
|
34
|
Carballo-Pacheco M, Vancea I, Strodel B. Extension of the FACTS Implicit Solvation Model to Membranes. J Chem Theory Comput 2014; 10:3163-76. [DOI: 10.1021/ct500084y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023]
Affiliation(s)
- Martín Carballo-Pacheco
- Forschungszentrum Jülich GmbH, Institute of Complex
Systems: Structural Biochemistry (ICS-6), 52425 Jülich, Germany
| | - Ioan Vancea
- Forschungszentrum Jülich GmbH, Institute of Complex
Systems: Structural Biochemistry (ICS-6), 52425 Jülich, Germany
| | - Birgit Strodel
- Forschungszentrum Jülich GmbH, Institute of Complex
Systems: Structural Biochemistry (ICS-6), 52425 Jülich, Germany
- Institute
of Theoretical and Computational Chemistry, Heinrich Heine University Düsseldorf, Universitätstrasse 1, 40225 Düsseldorf, Germany
| |
Collapse
|
35
|
Onufriev AV, Aguilar B. Accuracy of continuum electrostatic calculations based on three common dielectric boundary definitions. JOURNAL OF THEORETICAL & COMPUTATIONAL CHEMISTRY 2014; 13. [PMID: 26236064 DOI: 10.1142/s0219633614400069] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
Abstract
We investigate the influence of three common definitions of the solute/solvent dielectric boundary (DB) on the accuracy of the electrostatic solvation energy ΔGel computed within the Poisson Boltzmann and the generalized Born models of implicit solvation. The test structures include small molecules, peptides and small proteins; explicit solvent ΔGel are used as accuracy reference. For common atomic radii sets BONDI, PARSE (and ZAP9 for small molecules) the use of van der Waals (vdW) DB results, on average, in considerably larger errors in ΔGel than the molecular surface (MS) DB. The optimal probe radius ρw for which the MS DB yields the most accurate ΔGel varies considerably between structure types. The solvent accessible surface (SAS) DB becomes optimal at ρw ~ 0.2 Å (exact value is sensitive to the structure and atomic radii), at which point the average accuracy of ΔGel is comparable to that of the MS-based boundary. The geometric equivalence of SAS to vdW surface based on the same atomic radii uniformly increased by ρw gives the corresponding optimal vdW DB. For small molecules, the optimal vdW DB based on BONDI + 0.2 Å radii can yield ΔGel estimates at least as accurate as those based on the optimal MS DB. Also, in small molecules, pairwise charge-charge interactions computed with the optimal vdW DB are virtually equal to those computed with the MS DB, suggesting that in this case the two boundaries are practically equivalent by the electrostatic energy criteria. In structures other than small molecules, the optimal vdW and MS dielectric boundaries are not equivalent: the respective pairwise electrostatic interactions in the presence of solvent can differ by up to 5 kcal/mol for individual atomic pairs in small proteins, even when the total ΔGel are equal. For small proteins, the average decrease in pairwise electrostatic interactions resulting from the switch from optimal MS to optimal vdW DB definition can be mimicked within the MS DB definition by doubling of the solute dielectric constant. However, the use of the higher interior dielectric does not eliminate the large individual deviations between pairwise interactions computed within the two DB definitions. It is argued that while the MS based definition of the dielectric boundary is more physically correct in some types of practical calculations, the choice is not so clear in some other common scenarios.
Collapse
Affiliation(s)
- Alexey V Onufriev
- Department of Computer Science and Department of Physics, Virginia Tech, Blacksburg, VA 24060, and Department of Computer Science, Virginia Tech, Blacksburg, VA 24060
| | - Boris Aguilar
- Department of Computer Science and Department of Physics, Virginia Tech, Blacksburg, VA 24060, and Department of Computer Science, Virginia Tech, Blacksburg, VA 24060
| |
Collapse
|
36
|
Polydorides S, Simonson T. Monte Carlo simulations of proteins at constant pH with generalized Born solvent, flexible sidechains, and an effective dielectric boundary. J Comput Chem 2013; 34:2742-56. [PMID: 24122878 DOI: 10.1002/jcc.23450] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2013] [Revised: 09/04/2013] [Accepted: 09/08/2013] [Indexed: 12/11/2022]
Abstract
Titratable residues determine the acid/base behavior of proteins, strongly influencing their function; in addition, proton binding is a valuable reporter on electrostatic interactions. We describe a method for pK(a) calculations, using constant-pH Monte Carlo (MC) simulations to explore the space of sidechain conformations and protonation states, with an efficient and accurate generalized Born model (GB) for the solvent effects. To overcome the many-body dependency of the GB model, we use a "Native Environment" approximation, whose accuracy is shown to be good. It allows the precalculation and storage of interactions between all sidechain pairs, a strategy borrowed from computational protein design, which makes the MC simulations themselves very fast. The method is tested for 12 proteins and 167 titratable sidechains. It gives an rms error of 1.1 pH units, similar to the trivial "Null" model. The only adjustable parameter is the protein dielectric constant. The best accuracy is achieved for values between 4 and 8, a range that is physically plausible for a protein interior. For sidechains with large pKa shifts, ≥2, the rms error is 1.6, compared to 2.5 with the Null model and 1.5 with the empirical PROPKA method.
Collapse
Affiliation(s)
- Savvas Polydorides
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
| | | |
Collapse
|
37
|
Nguyen H, Roe DR, Simmerling C. Improved Generalized Born Solvent Model Parameters for Protein Simulations. J Chem Theory Comput 2013; 9:2020-2034. [PMID: 25788871 DOI: 10.1021/ct3010485] [Citation(s) in RCA: 341] [Impact Index Per Article: 31.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The generalized Born (GB) model is one of the fastest implicit solvent models and it has become widely adopted for Molecular Dynamics (MD) simulations. This speed comes with tradeoffs, and many reports in the literature have pointed out weaknesses with GB models. Because the quality of a GB model is heavily affected by empirical parameters used in calculating solvation energy, in this work we have refit these parameters for GB-Neck, a recently developed GB model, in order to improve the accuracy of both the solvation energy and effective radii calculations. The data sets used for fitting are significantly larger than those used in the past. Comparing to other pairwise GB models like GB-OBC and the original GB-Neck, the new GB model (GB-Neck2) has better agreement to Poisson-Boltzmann (PB) in terms of reproducing solvation energies for a variety of systems ranging from peptides to proteins. Secondary structure preferences are also in much better agreement with those obtained from explicit solvent MD simulations. We also obtain near-quantitative reproduction of experimental structure and thermal stability profiles for several model peptides with varying secondary structure motifs. Extension to non-protein systems will be explored in the future.
Collapse
Affiliation(s)
- Hai Nguyen
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794 ; Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794
| | - Daniel R Roe
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794 ; Department of Medicinal Chemistry, University of Utah, Salt Lake City, UT, 84112
| | - Carlos Simmerling
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794 ; Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794
| |
Collapse
|
38
|
Brieg M, Wenzel W. PowerBorn: A Barnes-Hut Tree Implementation for Accurate and Efficient Born Radii Computation. J Chem Theory Comput 2013; 9:1489-98. [PMID: 26587611 DOI: 10.1021/ct300870s] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2023]
Abstract
Implicit solvent models are one of the standard tools in computational biophysics. While Poisson-Boltzmann methods offer highly accurate results within this framework, generalized Born models have been used due to their higher computational efficiency in many (bio)molecular simulations, where computational power is a limiting factor. In recent years, there have been remarkable advances to reduce some deficiencies in the generalized Born models. On the other hand, these advances come at an increased computational cost that contrasts the reasons for choosing generalized Born models over Poisson-Boltzmann methods. To address this performance issue, we present a new algorithm for Born radii computation, one performance critical part in the evaluation of generalized Born models, which is based on a Barnes-Hut tree code scheme. We show that an implementation of this algorithm provides accurate Born radii and polar solvation free energies in comparison to Poisson-Boltzmann computations, while delivering up to an order of magnitude better performance over existing, similarly accurate methods. The C++ implementation of this algorithm will be available at http://www.int.kit.edu/nanosim/ .
Collapse
Affiliation(s)
- Martin Brieg
- Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology (KIT), P.O. Box 3640, 76021 Karlsruhe, Germany
| | - Wolfgang Wenzel
- Institute of Nanotechnology (INT), Karlsruhe Institute of Technology (KIT), P.O. Box 3640, 76021 Karlsruhe, Germany
| |
Collapse
|
39
|
Pang X, Zhou HX. Poisson-Boltzmann Calculations: van der Waals or Molecular Surface? COMMUNICATIONS IN COMPUTATIONAL PHYSICS 2013; 13:1-12. [PMID: 23293674 PMCID: PMC3535440 DOI: 10.4208/cicp.270711.140911s] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
The Poisson-Boltzmann equation is widely used for modeling the electrostatics of biomolecules, but the calculation results are sensitive to the choice of the boundary between the low solute dielectric and the high solvent dielectric. The default choice for the dielectric boundary has been the molecular surface, but the use of the van der Waals surface has also been advocated. Here we review recent studies in which the two choices are tested against experimental results and explicit-solvent calculations. The assignment of the solvent high dielectric constant to interstitial voids in the solute is often used as a criticism against the van der Waals surface. However, this assignment may not be as unrealistic as previously thought, since hydrogen exchange and other NMR experiments have firmly established that all interior parts of proteins are transiently accessible to the solvent.
Collapse
Affiliation(s)
- Xiaodong Pang
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, Florida 32306, USA
| | | |
Collapse
|
40
|
Aguilar B, Onufriev AV. Efficient Computation of the Total Solvation Energy of Small Molecules via the R6 Generalized Born Model. J Chem Theory Comput 2012; 8:2404-11. [DOI: 10.1021/ct200786m] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Boris Aguilar
- Department
of Computer Science and ‡Department of Physics, Virginia Tech,
Blacksburg, Virginia 24060, United States
| | - Alexey V. Onufriev
- Department
of Computer Science and ‡Department of Physics, Virginia Tech,
Blacksburg, Virginia 24060, United States
| |
Collapse
|
41
|
Vorobjev YN. Potential of mean force of water-proton bath and molecular dynamic simulation of proteins at constant pH. J Comput Chem 2012; 33:832-42. [PMID: 22278814 DOI: 10.1002/jcc.22909] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2011] [Revised: 10/24/2011] [Accepted: 11/20/2011] [Indexed: 01/20/2023]
Abstract
An advanced implicit solvent model of water-proton bath for protein simulations at constant pH is presented. The implicit water-proton bath model approximates the potential of mean force of a protein in water solvent in a presence of hydrogen ions. Accurate and fast computational implementation of the implicit water-proton bath model is developed using the continuum electrostatic Poisson equation model for calculation of ionization equilibrium and the corrected MSR6 generalized Born model for calculation of the electrostatic atom-atom interactions and forces. Molecular dynamics (MD) method for protein simulation in the potential of mean force of water-proton bath is developed and tested on three proteins. The model allows to run MD simulations of proteins at constant pH, to calculate pH-dependent properties and free energies of protein conformations. The obtained results indicate that the developed implicit model of water-proton bath provides an efficient way to study thermodynamics of biomolecular systems as a function of pH, pH-dependent ionization-conformation coupling, and proton transfer events.
Collapse
Affiliation(s)
- Yury N Vorobjev
- Institute of Chemical Biology and Fundamental Medicine of the Siberian Branch of the Russian Academy of Science, Novosibirsk, Russia.
| |
Collapse
|
42
|
Onufriev AV, Sigalov G. A strategy for reducing gross errors in the generalized Born models of implicit solvation. J Chem Phys 2011; 134:164104. [PMID: 21528947 DOI: 10.1063/1.3578686] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023] Open
Abstract
The "canonical" generalized Born (GB) formula [C. Still, A. Tempczyk, R. C. Hawley, and T. Hendrickson, J. Am. Chem. Soc. 112, 6127 (1990)] is known to provide accurate estimates for total electrostatic solvation energies ΔG(el) of biomolecules if the corresponding effective Born radii are accurate. Here we show that even if the effective Born radii are perfectly accurate, the canonical formula still exhibits significant number of gross errors (errors larger than 2k(B)T relative to numerical Poisson equation reference) in pairwise interactions between individual atomic charges. Analysis of exact analytical solutions of the Poisson equation (PE) for several idealized nonspherical geometries reveals two distinct spatial modes of the PE solution; these modes are also found in realistic biomolecular shapes. The canonical GB Green function misses one of two modes seen in the exact PE solution, which explains the observed gross errors. To address the problem and reduce gross errors of the GB formalism, we have used exact PE solutions for idealized nonspherical geometries to suggest an alternative analytical Green function to replace the canonical GB formula. The proposed functional form is mathematically nearly as simple as the original, but depends not only on the effective Born radii but also on their gradients, which allows for better representation of details of nonspherical molecular shapes. In particular, the proposed functional form captures both modes of the PE solution seen in nonspherical geometries. Tests on realistic biomolecular structures ranging from small peptides to medium size proteins show that the proposed functional form reduces gross pairwise errors in all cases, with the amount of reduction varying from more than an order of magnitude for small structures to a factor of 2 for the largest ones.
Collapse
Affiliation(s)
- Alexey V Onufriev
- Department of Computer Science, 2050 Torgersen Hall, Virginia Tech, Blacksburg, Virginia 24061, USA.
| | | |
Collapse
|
43
|
Vorobjev YN. Advances in implicit models of water solvent to compute conformational free energy and molecular dynamics of proteins at constant pH. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2011; 85:281-322. [PMID: 21920327 DOI: 10.1016/b978-0-12-386485-7.00008-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]
Abstract
Modern implicit solvent models for macromolecular simulations in water-proton bath are considered. The fundamental quantity that implicit models approximate is the solute potential of mean force, which is obtained by averaging over solvent degrees of freedom. The implicit solvent models suggest practical ways to calculate free energies of macromolecular conformations taking into account equilibrium interactions with water solvent and proton bath, while the explicit solvent approach is unable to do that due to the need to account for a large number of solvent degrees of freedom. The most advanced realizations of the implicit continuum models by different research groups are discussed, their accuracy are examined, and some applications of the implicit solvent models to macromolecular modeling, such as free energy calculations, protein folding, and constant pH molecular dynamics are highlighted.
Collapse
|