1
Tolokh IS, Folescu DE, Onufriev AV. Inclusion of Water Multipoles into the Implicit Solvation Framework Leads to Accuracy Gains. J Phys Chem B 2024; 128:5855-5873. [PMID: 38860842 PMCID: PMC11194828 DOI: 10.1021/acs.jpcb.4c00254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 05/28/2024] [Accepted: 05/29/2024] [Indexed: 06/12/2024]
Abstract
The current practical "workhorses" of atomistic implicit solvation, the Poisson-Boltzmann (PB) and generalized Born (GB) models, face fundamental accuracy limitations. Here, we propose a computationally efficient implicit solvation framework, the Implicit Water Multipole GB (IWM-GB) model, that systematically incorporates the effects of multipole moments of water molecules in the first hydration shell of a solute, beyond the dipole water polarization already present at the PB/GB level. The framework explicitly accounts for coupling between polar and nonpolar contributions to the total solvation energy, which is missing from many implicit solvation models. An implementation of the framework, utilizing the GAFF force field and the AM1-BCC atomic partial charge model, is parametrized and tested against the experimental hydration free energies of small molecules from the FreeSolv database. The resulting accuracy on the test set (RMSE ∼ 0.9 kcal/mol) is 12% better than that of the explicit solvation (TIP3P) treatment, which is orders of magnitude slower. We also find that the coupling between polar and nonpolar parts of the solvation free energy is essential to ensuring that several features of the IWM-GB model are physically meaningful, including the sign of the nonpolar contributions.
Affiliation(s)
- Igor S. Tolokh
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Dan E. Folescu
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V. Onufriev
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
2
Risheh A, Rebel A, Nerenberg PS, Forouzesh N. Calculation of protein-ligand binding entropies using a rule-based molecular fingerprint. Biophys J 2024:S0006-3495(24)00182-6. [PMID: 38481102 DOI: 10.1016/j.bpj.2024.03.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 12/21/2023] [Accepted: 03/08/2024] [Indexed: 03/28/2024]
Abstract
The use of fast in silico prediction methods for protein-ligand binding free energies holds significant promise for the initial phases of drug development. Numerous traditional physics-based models (e.g., implicit solvent models), however, tend to either neglect or heavily approximate entropic contributions to binding due to their computational complexity. Consequently, such methods often yield imprecise assessments of binding strength. Machine learning models provide accurate predictions and can often outperform physics-based models. They, however, are often prone to overfitting, and the interpretation of their results can be difficult. Physics-guided machine learning models combine the consistency of physics-based models with the accuracy of modern data-driven algorithms. This work integrates physics-based model conformational entropies into a graph convolutional network. We introduce a new neural network architecture (a rule-based graph convolutional network) that generates molecular fingerprints according to predefined rules specifically optimized for binding free energy calculations. Our results on 100 small host-guest systems demonstrate significant improvements in convergence and preventing overfitting. We additionally demonstrate the transferability of our proposed hybrid model by training it on the aforementioned host-guest systems and then testing it on six unrelated protein-ligand systems. Our new model shows little difference in training set accuracy compared to a previous model but an order-of-magnitude improvement in test set accuracy. Finally, we show how the results of our hybrid model can be interpreted in a straightforward fashion.
Affiliation(s)
- Ali Risheh
  - Department of Computer Science, California State University, Los Angeles, California
- Alles Rebel
  - Department of Computer Science, California State University, Los Angeles, California
- Paul S Nerenberg
  - Kravis Department of Integrated Sciences, Claremont McKenna College, Claremont, California
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, California
3
Chen J, Xu Y, Yang X, Cang Z, Geng W, Wei GW. Poisson-Boltzmann-based machine learning model for electrostatic analysis. Biophys J 2024:S0006-3495(24)00107-3. [PMID: 38356263 DOI: 10.1016/j.bpj.2024.02.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 01/26/2024] [Accepted: 02/09/2024] [Indexed: 02/16/2024]
Abstract
Electrostatics is of paramount importance to chemistry, physics, biology, and medicine. The Poisson-Boltzmann (PB) theory is a primary model for electrostatic analysis. However, it is highly challenging to compute accurate PB electrostatic solvation free energies for macromolecules due to the nonlinearity, dielectric jumps, charge singularity, and geometric complexity associated with the PB equation. The present work introduces a PB-based machine learning (PBML) model for biomolecular electrostatic analysis. Trained with the second-order accurate MIBPB solver, the proposed PBML model is found to be more accurate and faster than several eminent PB solvers in electrostatic analysis. The proposed PBML model can provide highly accurate PB electrostatic solvation free energy of new biomolecules or new conformations generated by molecular dynamics with much reduced computational cost.
Affiliation(s)
- Jiahui Chen
  - Department of Mathematics, University of Arkansas, Fayetteville, Arkansas
- Xin Yang
  - Department of Mathematics, Southern Methodist University, Dallas, Texas
- Zixuan Cang
  - Department of Mathematics, North Carolina State University, Raleigh, North Carolina
- Weihua Geng
  - Department of Mathematics, Southern Methodist University, Dallas, Texas
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan
4
Bass L, Elder LH, Folescu DE, Forouzesh N, Tolokh IS, Karpatne A, Onufriev AV. Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment. J Chem Theory Comput 2024; 20:396-410. [PMID: 38149593 PMCID: PMC10950260 DOI: 10.1021/acs.jctc.3c00981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
The accuracy of computational models of water is key to atomistic simulations of biomolecules. We propose a computationally efficient way to improve the accuracy of the prediction of hydration-free energies (HFEs) of small molecules: the remaining errors of the physics-based models relative to the experiment are predicted and mitigated by machine learning (ML) as a postprocessing step. Specifically, the trained graph convolutional neural network attempts to identify the "blind spots" in the physics-based model predictions, where the complex physics of aqueous solvation is poorly accounted for, and partially corrects for them. The strategy is explored for five classical solvent models representing various accuracy/speed trade-offs, from the fast analytical generalized Born (GB) to the popular TIP3P explicit solvent model; experimental HFEs of small neutral molecules from the FreeSolv set are used for the training and testing. For all of the models, the ML correction reduces the resulting root-mean-square error relative to the experiment for HFEs of small molecules, without significant overfitting and with negligible computational overhead. For example, on the test set, the relative accuracy improvement is 47% for the fast analytical GB, making it, after the ML correction, almost as accurate as uncorrected TIP3P. For the TIP3P model, the accuracy improvement is about 39%, bringing the ML-corrected model's accuracy below the 1 kcal/mol threshold. In general, the relative benefit of the ML corrections is smaller for more accurate physics-based models, reaching the lower limit of about 20% relative accuracy gain compared with that of the physics-based treatment alone. The proposed strategy of using ML to learn the remaining error of physics-based models offers a distinct advantage over training ML alone directly on reference HFEs: it preserves the correct overall trend, even well outside of the training set.
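The correction strategy described in this abstract, learning the remaining error of a physics-based prediction and adding it back as a postprocessing step, can be illustrated with a minimal sketch. The paper uses a graph convolutional neural network; the linear least-squares model, the feature matrix, and all numbers below are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def fit_residual_model(features, physics_pred, experiment):
    """Fit a linear model (with intercept) to residual = experiment - physics.

    This is an illustrative stand-in for the neural network in the abstract.
    """
    X = np.column_stack([np.ones(len(features)), features])
    residual = np.asarray(experiment) - np.asarray(physics_pred)
    weights, *_ = np.linalg.lstsq(X, residual, rcond=None)
    return weights

def apply_correction(weights, features, physics_pred):
    """Corrected prediction = physics prediction + predicted residual."""
    X = np.column_stack([np.ones(len(features)), features])
    return np.asarray(physics_pred) + X @ weights

# Synthetic (hypothetical) data: a physics model with a systematic error
# that depends on the first molecular descriptor, plus random noise.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 3))
experiment = rng.normal(loc=-4.0, scale=3.0, size=50)
systematic_error = 0.8 + 0.5 * features[:, 0]
physics_pred = experiment + systematic_error + rng.normal(scale=0.3, size=50)

weights = fit_residual_model(features, physics_pred, experiment)
corrected = apply_correction(weights, features, physics_pred)

rmse_before = rmse(physics_pred, experiment)
rmse_after = rmse(corrected, experiment)
print(f"RMSE before correction: {rmse_before:.2f} kcal/mol")
print(f"RMSE after correction:  {rmse_after:.2f} kcal/mol")
```

Because the physics-based prediction is kept and only its residual is modeled, a vanishing correction simply returns the physics-based result, which is one way to read the abstract's point about preserving the overall trend outside the training set. The sketch evaluates on its own training data; a real assessment would use a held-out test set.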
Affiliation(s)
- Lewis Bass
  - Department of Computer Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
- Luke H Elder
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Dan E Folescu
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, California 90032, United States
- Igor S Tolokh
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Anuj Karpatne
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V Onufriev
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
5
Case D, Aktulga HM, Belfon K, Cerutti DS, Cisneros GA, Cruzeiro VD, Forouzesh N, Giese TJ, Götz AW, Gohlke H, Izadi S, Kasavajhala K, Kaymak MC, King E, Kurtzman T, Lee TS, Li P, Liu J, Luchko T, Luo R, Manathunga M, Machado MR, Nguyen HM, O’Hearn KA, Onufriev AV, Pan F, Pantano S, Qi R, Rahnamoun A, Risheh A, Schott-Verdugo S, Shajan A, Swails J, Wang J, Wei H, Wu X, Wu Y, Zhang S, Zhao S, Zhu Q, Cheatham TE, Roe DR, Roitberg A, Simmerling C, York DM, Nagan MC, Merz KM. AmberTools. J Chem Inf Model 2023; 63:6183-6191. [PMID: 37805934 PMCID: PMC10598796 DOI: 10.1021/acs.jcim.3c01153] [Citation(s) in RCA: 92] [Impact Index Per Article: 92.0] [Received: 07/28/2023] [Indexed: 10/10/2023]
Abstract
AmberTools is a free and open-source collection of programs used to set up, run, and analyze molecular simulations. The newer features contained within AmberTools23 are briefly described in this Application note.
Affiliation(s)
- David A. Case
  - Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Hasan Metin Aktulga
  - Department of Computer Science and Engineering, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Kellon Belfon
  - FOG Pharmaceuticals Inc., Cambridge 02140, Massachusetts, United States
- David S. Cerutti
  - Psivant, 451 D Street, Suite 205, Boston 02210, Massachusetts, United States
- G. Andrés Cisneros
  - Department of Physics, Department of Chemistry and Biochemistry, University of Texas at Dallas, Richardson 75801, Texas, United States
- Vinícius Wilian D. Cruzeiro
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford 94305, California, United States
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles 90032, California, United States
- Timothy J. Giese
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Andreas W. Götz
  - San Diego Supercomputer Center, University of California San Diego, La Jolla 92093-0505, California, United States
- Holger Gohlke
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, Düsseldorf 40225, Germany
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, Jülich 52425, Germany
- Saeed Izadi
  - Pharmaceutical Development, Genentech, Inc., South San Francisco 94080, California, United States
- Koushik Kasavajhala
  - Laufer Center for Physical and Quantitative Biology, Department of Chemistry, Stony Brook University, Stony Brook 11794, New York, United States
- Mehmet C. Kaymak
  - Department of Computer Science and Engineering, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Edward King
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Tom Kurtzman
  - Ph.D. Programs in Chemistry, Biochemistry, and Biology, The Graduate Center of the City University of New York, 365 Fifth Avenue, New York 10016, New York, United States
  - Department of Chemistry, Lehman College, 250 Bedford Park Blvd West, Bronx 10468, New York, United States
- Tai-Sung Lee
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Pengfei Li
  - Department of Chemistry and Biochemistry, Loyola University Chicago, Chicago 60660, Illinois, United States
- Jian Liu
  - Beijing National Laboratory for Molecular Sciences, Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Tyler Luchko
  - Department of Physics and Astronomy, California State University, Northridge, Northridge 91330, California, United States
- Ray Luo
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Madushanka Manathunga
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Hai Minh Nguyen
  - Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Kurt A. O’Hearn
  - Department of Computer Science and Engineering, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Alexey V. Onufriev
  - Departments of Computer Science and Physics, Virginia Tech, Blacksburg 24061, Virginia, United States
- Feng Pan
  - Department of Statistics, Florida State University, Tallahassee 32304, Florida, United States
- Sergio Pantano
  - Institut Pasteur de Montevideo, Montevideo 11400, Uruguay
- Ruxi Qi
  - Cryo-EM Center, Southern University of Science and Technology, Shenzhen 518055, China
- Ali Rahnamoun
  - Department of Computer Science and Engineering, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Ali Risheh
  - Department of Computer Science, California State University, Los Angeles 90032, California, United States
- Stephan Schott-Verdugo
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, Jülich 52425, Germany
- Akhil Shajan
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Jason Swails
  - Entos, 4470 W Sunset Blvd, Suite 107, Los Angeles 90027, California, United States
- Junmei Wang
  - Department of Pharmaceutical Sciences, School of Pharmacy, University of Pittsburgh, Pittsburgh 15261, Pennsylvania, United States
- Haixin Wei
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Xiongwu Wu
  - Laboratory of Computational Biology, NHLBI, NIH, Bethesda 20892, Maryland, United States
- Yongxian Wu
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Shi Zhang
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Shiji Zhao
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
  - Nurix Therapeutics, Inc., San Francisco 94158, California, United States
- Qiang Zhu
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Thomas E. Cheatham
  - Department of Medicinal Chemistry, The University of Utah, 30 South 2000 East, Salt Lake City 84112, Utah, United States
- Daniel R. Roe
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda 20892, Maryland, United States
- Adrian Roitberg
  - Department of Chemistry, The University of Florida, 440 Leigh Hall, Gainesville 32611-7200, Florida, United States
- Carlos Simmerling
  - Laufer Center for Physical and Quantitative Biology, Department of Chemistry, Stony Brook University, Stony Brook 11794, New York, United States
- Darrin M. York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Maria C. Nagan
  - Department of Chemistry, Stony Brook University, Stony Brook 11794, New York, United States
- Kenneth M. Merz
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing 48824-1322, Michigan, United States
6
Sagar D, Risheh A, Sheikh N, Forouzesh N. Physics-Guided Deep Generative Model for New Ligand Discovery. ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine 2023; 2023:10.1145/3584371.3613067. [PMID: 38706556 PMCID: PMC11067829 DOI: 10.1145/3584371.3613067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2024]
Abstract
Structure-based drug discovery aims to identify small molecules that can attach to a specific target protein and change its functionality. Recently, deep learning has shown great promise in generating drug-like molecules with specific biochemical features and conditioned on structural features. However, such models usually fail to incorporate an essential factor: the underlying physics that guides molecular formation and binding in real-world scenarios. In this work, we describe a physics-guided deep generative model for new ligand discovery, conditioned not only on the binding site but also on physics-based features that describe the binding mechanism between a receptor and a ligand. The proposed hybrid model has been tested on large protein-ligand complexes and small host-guest systems. Using the top-N methodology, on average more than 75% of the structures generated by our hybrid model were stronger binders than the original reference ligand. All of them had higher ΔGbind (affinity) values than the ones generated by the previous state-of-the-art method, by an average margin of 1.88 kcal/mol. The visualization of the top-5 ligands generated by the proposed physics-guided model and the reference deep learning model demonstrates more feasible conformations and orientations for the former. Future directions include training and testing the hybrid model on larger datasets, adding more relevant physics-based features, and interpreting the deep learning outcomes from biophysical perspectives.
Affiliation(s)
- Dikshant Sagar
  - Department of Computer Science, California State University, Los Angeles, Los Angeles, California, USA
- Ali Risheh
  - Department of Computer Science, California State University, Los Angeles, Los Angeles, California, USA
- Nida Sheikh
  - Department of Computer Science, California State University, Los Angeles, Los Angeles, California, USA
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, Los Angeles, California, USA
7
Gao K, Wang R, Chen J, Cheng L, Frishcosy J, Huzumi Y, Qiu Y, Schluckbier T, Wei X, Wei GW. Methodology-Centered Review of Molecular Modeling, Simulation, and Prediction of SARS-CoV-2. Chem Rev 2022; 122:11287-11368. [PMID: 35594413 PMCID: PMC9159519 DOI: 10.1021/acs.chemrev.1c00965] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Indexed: 02/06/2023]
Abstract
Despite tremendous efforts in the past two years, our understanding of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), virus-host interactions, immune response, virulence, transmission, and evolution is still very limited. This limitation calls for further in-depth investigation. Computational studies have become an indispensable component in combating coronavirus disease 2019 (COVID-19) due to their low cost, their efficiency, and the fact that they are free from safety and ethical constraints. Additionally, the mechanism that governs the global evolution and transmission of SARS-CoV-2 cannot be revealed from individual experiments and was discovered by integrating genotyping of massive viral sequences, biophysical modeling of protein-protein interactions, deep mutational data, deep learning, and advanced mathematics. There exists a tsunami of literature on the molecular modeling, simulations, and predictions of SARS-CoV-2 and related developments of drugs, vaccines, antibodies, and diagnostics. To provide readers with a quick update about this literature, we present a comprehensive and systematic methodology-centered review. Aspects such as molecular biophysics, bioinformatics, cheminformatics, machine learning, and mathematics are discussed. This review will be beneficial to researchers who are looking for ways to contribute to SARS-CoV-2 studies and those who are interested in the status of the field.
Affiliation(s)
- Kaifu Gao
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Rui Wang
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Jiahui Chen
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Limei Cheng
  - Clinical Pharmacology and Pharmacometrics, Bristol Myers Squibb, Princeton, New Jersey 08536, United States
- Jaclyn Frishcosy
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yuta Huzumi
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yuchi Qiu
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Tom Schluckbier
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Xiaoqi Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
8
Cain S, Risheh A, Forouzesh N. A Physics-Guided Neural Network for Predicting Protein–Ligand Binding Free Energy: From Host–Guest Systems to the PDBbind Database. Biomolecules 2022; 12:biom12070919. [PMID: 35883475 PMCID: PMC9312865 DOI: 10.3390/biom12070919] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/01/2022] [Revised: 06/26/2022] [Accepted: 06/27/2022] [Indexed: 11/16/2022]
Abstract
Calculation of protein–ligand binding affinity is a cornerstone of drug discovery. Classic implicit solvent models, which have been widely used to accomplish this task, lack accuracy compared to experimental references. Emerging data-driven models, on the other hand, are often accurate yet not fully interpretable and are also prone to overfitting. In this research, we explore the application of Theory-Guided Data Science in studying protein–ligand binding. A hybrid model is introduced by integrating a Graph Convolutional Network (data-driven model) with the GBNSR6 implicit solvent (physics-based model). The proposed physics-data model is tested on a dataset of 368 complexes from the PDBbind refined set and 72 host–guest systems. Results demonstrate that the proposed Physics-Guided Neural Network can successfully improve the “accuracy” of the pure data-driven model. In addition, the “interpretability” and “transferability” of our model have improved compared to the purely data-driven model. Further analyses include evaluating model robustness and understanding relationships between the physical features.
Affiliation(s)
- Sahar Cain
  - Department of Computer Science, California State University, Los Angeles, CA 90032, USA
- Ali Risheh
  - Department of Computer Engineering, Amirkabir University of Technology, Tehran 15914, Iran
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, CA 90032, USA
9
Abstract
Monte Carlo (MC) methods are important computational tools for molecular structure optimization and prediction. When solvent effects are considered explicitly, MC methods become very expensive because of the large number of degrees of freedom associated with the water molecules and mobile ions. Alternatively, implicit-solvent MC can greatly reduce the computational cost by applying a mean-field approximation to solvent effects while maintaining the atomic detail of the target molecule. The two most popular implicit-solvent models are the Poisson-Boltzmann (PB) model and the generalized Born (GB) model; the GB model is an approximation to the PB model but is much faster to evaluate. In this work, we develop a machine learning-based implicit-solvent Monte Carlo (MLIMC) method that combines the advantages of both implicit-solvent models in accuracy and efficiency. Specifically, the MLIMC method uses a fast and accurate PB-based machine learning (PBML) scheme to compute the electrostatic solvation free energy at each step. We validate our MLIMC method on a benzene-water system and a protein-water system. We show that the proposed MLIMC method has great advantages in speed and accuracy for molecular structure optimization and prediction.
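For orientation, the GB approximation mentioned in this abstract is commonly written in the pairwise form introduced by Still and co-workers. The sketch below evaluates that canonical expression for a set of point charges. It is not the PBML or MLIMC scheme of this work, and the function name, the fixed effective Born radii, and the example values are assumptions made for illustration.

```python
import numpy as np

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2).
COULOMB_KCAL = 332.06

def gb_solvation_energy(coords, charges, born_radii, eps_in=1.0, eps_out=80.0):
    """Canonical generalized Born electrostatic solvation energy (kcal/mol).

    coords     : (N, 3) atomic positions in Angstrom
    charges    : (N,) partial charges in units of e
    born_radii : (N,) effective Born radii in Angstrom (assumed given)
    """
    coords = np.asarray(coords, dtype=float)
    q = np.asarray(charges, dtype=float)
    radii = np.asarray(born_radii, dtype=float)

    # Squared pairwise distances r_ij^2 (zero on the diagonal).
    diff = coords[:, None, :] - coords[None, :, :]
    r2 = np.sum(diff ** 2, axis=-1)

    # f_GB = sqrt(r_ij^2 + R_i R_j exp(-r_ij^2 / (4 R_i R_j)))
    rr = radii[:, None] * radii[None, :]
    f_gb = np.sqrt(r2 + rr * np.exp(-r2 / (4.0 * rr)))

    # The double sum runs over all pairs, including the i == j self terms.
    prefactor = -0.5 * COULOMB_KCAL * (1.0 / eps_in - 1.0 / eps_out)
    return float(prefactor * np.sum(q[:, None] * q[None, :] / f_gb))

# A single ion reduces to the Born formula, since f_GB equals its radius.
ion = gb_solvation_energy([[0.0, 0.0, 0.0]], [1.0], [2.0])
born = -0.5 * COULOMB_KCAL * (1.0 - 1.0 / 80.0) / 2.0

# Two opposite charges 3 Angstrom apart: the cross term is positive, so the
# pair is less favorably solvated than two isolated ions.
pair = gb_solvation_energy([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]],
                           [1.0, -1.0], [2.0, 2.0])
print(ion, born, pair)
```

The cost is a single pass over atom pairs with no grid or linear solve, which is why GB is much cheaper than a numerical PB solution; the accuracy depends largely on how the effective Born radii are obtained, which this sketch does not address.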
Affiliation(s)
- Jiahui Chen
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Weihua Geng
  - Department of Mathematics, Southern Methodist University, Dallas, TX 75275, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
10
Wang E, Fu W, Jiang D, Sun H, Wang J, Zhang X, Weng G, Liu H, Tao P, Hou T. VAD-MM/GBSA: A Variable Atomic Dielectric MM/GBSA Model for Improved Accuracy in Protein-Ligand Binding Free Energy Calculations. J Chem Inf Model 2021; 61:2844-2856. [PMID: 34014672 DOI: 10.1021/acs.jcim.1c00091] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/21/2022]
Abstract
The molecular mechanics/generalized Born surface area (MM/GBSA) method has been widely used for end-point binding free energy prediction in structure-based drug design (SBDD). In practice, however, it is often regarded as a disputed method, mostly because of its system dependence. Here, by combining it with machine-learning optimization, we developed a novel version of MM/GBSA, named variable atomic dielectric MM/GBSA (VAD-MM/GBSA), which assigns variable dielectric constants directly to the protein/ligand atoms. The new strategy exhibits markedly improved accuracy in binding affinity calculations for various protein-ligand systems and is promising for the postprocessing of structure-based virtual screening. Moreover, VAD-MM/GBSA outperformed Prime MM/GBSA in the Schrödinger software and showed remarkable predictive performance for specific protein targets, such as POL polyprotein and human immunodeficiency virus type 1 (HIV-1) protease. Our study showed that the VAD-MM/GBSA method, with little extra computational overhead, provides a potential replacement for MM/GBSA in the AMBER software. An online web server for VAD-MM/GBSA has been developed and is now available at http://cadd.zju.edu.cn/vdgb.
Affiliation(s)
- Ercheng Wang
- Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Weitao Fu
- Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Dejun Jiang
- College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
| | - Huiyong Sun
- Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Junmei Wang
- Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Xujun Zhang
- Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Gaoqi Weng
- Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Hui Liu
- Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Peng Tao
- Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
11
|
Forouzesh N, Mishra N. An Effective MM/GBSA Protocol for Absolute Binding Free Energy Calculations: A Case Study on SARS-CoV-2 Spike Protein and the Human ACE2 Receptor. Molecules 2021; 26:2383. [PMID: 33923909 PMCID: PMC8074138 DOI: 10.3390/molecules26082383] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 03/16/2021] [Revised: 04/09/2021] [Accepted: 04/13/2021] [Indexed: 12/23/2022] Open
Abstract
The binding free energy calculation of protein-ligand complexes is necessary for research into virus-host interactions and the relevant applications in drug discovery. However, many current computational methods of such calculations are either inefficient or inaccurate in practice. Utilizing implicit solvent models in the molecular mechanics generalized Born surface area (MM/GBSA) framework allows for efficient calculations without significant loss of accuracy. Here, GBNSR6, a new flavor of the generalized Born model, is employed in the MM/GBSA framework for measuring the binding affinity between SARS-CoV-2 spike protein and the human ACE2 receptor. A computational protocol is developed based on the widely studied Ras-Raf complex, which has similar binding free energy to SARS-CoV-2/ACE2. Two options for representing the dielectric boundary of the complexes are evaluated: one based on the standard Bondi radii and the other based on a newly developed set of atomic radii (OPT1), optimized specifically for protein-ligand binding. Predictions based on the two radii sets provide upper and lower bounds on the experimental reference: ΔGbind(Bondi) = -14.7 kcal/mol < ΔGbind(Exp.) = -10.6 kcal/mol < ΔGbind(OPT1) = -4.1 kcal/mol. The consensus estimates of the two bounds show quantitative agreement with the experimental values. This work also presents a novel truncation method and computational strategies for efficient entropy calculations with normal mode analysis. Interestingly, it is observed that a significant decrease in the number of snapshots does not affect the accuracy of entropy calculation, while it does lower computation time appreciably. The proposed MM/GBSA protocol can be used to study the binding mechanism of new variants of SARS-CoV-2, as well as other relevant structures.
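As a quick arithmetic check of the bounds quoted in this abstract, the sketch below averages the two radii-based predictions. It assumes that the consensus estimate is the simple mean of the Bondi and OPT1 values; this abstract does not spell out the formula, so the averaging rule and the function name `consensus_estimate` are illustrative assumptions.

```python
def consensus_estimate(dg_bondi: float, dg_opt1: float) -> float:
    """Arithmetic mean of two binding free energy estimates (kcal/mol).

    Assumption: the consensus is a plain average of the two bounds.
    """
    return 0.5 * (dg_bondi + dg_opt1)


# Values quoted in the abstract (kcal/mol).
DG_BONDI = -14.7
DG_OPT1 = -4.1
DG_EXP = -10.6

consensus = consensus_estimate(DG_BONDI, DG_OPT1)

# True when the two predictions bracket the experimental reference.
brackets_experiment = DG_BONDI < DG_EXP < DG_OPT1

print(f"consensus = {consensus:.1f} kcal/mol")
print(f"deviation from experiment = {consensus - DG_EXP:+.1f} kcal/mol")
```

Under this assumption the mean is about -9.4 kcal/mol, roughly 1.2 kcal/mol above the experimental reference of -10.6 kcal/mol.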
Affiliation(s)
- Negin Forouzesh
- Department of Computer Science, California State University, Los Angeles, CA 90032, USA
| | - Nikita Mishra
- Department of Chemistry and Biochemistry, California State University, Los Angeles, CA 90032, USA;
|
12
|
Forouzesh N, Onufriev AV. MMGB/SA Consensus Estimate of the Binding Free Energy Between the Novel Coronavirus Spike Protein to the Human ACE2 Receptor. bioRxiv 2020:2020.08.25.267625. [PMID: 32869029 PMCID: PMC7457614 DOI: 10.1101/2020.08.25.267625] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/20/2023]
Abstract
The ability to estimate protein-protein binding free energy in a computationally efficient manner via a physics-based approach is beneficial to research focused on the mechanism of viruses binding to their target proteins. Implicit solvation methodology may be particularly useful in the early stages of such research, as it can quickly offer valuable insights into the binding process. Here we evaluate the potential of the related molecular mechanics generalized Born surface area (MMGB/SA) approach to estimate the binding free energy ΔGbind between the SARS-CoV-2 spike receptor-binding domain and the human ACE2 receptor. The calculations are based on a recent flavor of the generalized Born model, GBNSR6. Two estimates of ΔGbind are performed: one based on standard Bondi radii, and the other based on a newly developed set of atomic radii (OPT1), optimized specifically for protein-ligand binding. We take the average of the resulting two ΔGbind values as the consensus estimate. For the well-studied Ras-Raf protein-protein complex, which has similar binding free energy to that of the SARS-CoV-2/ACE2 complex, the consensus ΔGbind = -11.8 ± 1 kcal/mol, vs. experimental -9.7 ± 0.2 kcal/mol. The consensus estimate for the SARS-CoV-2/ACE2 complex is ΔGbind = -9.4 ± 1.5 kcal/mol, which is in near quantitative agreement with experiment (-10.6 kcal/mol). The availability of a conceptually simple MMGB/SA-based protocol for analysis of the SARS-CoV-2/ACE2 binding may be beneficial in light of the need to move forward fast.
Affiliation(s)
- Negin Forouzesh
- Department of Computer Science, California State University, Los Angeles, Los Angeles, CA 90032, USA
| | - Alexey V Onufriev
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA
- Department of Physics, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA
- Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA
|
13
|
Forouzesh N, Mukhopadhyay A, Watson LT, Onufriev AV. Multidimensional Global Optimization and Robustness Analysis in the Context of Protein-Ligand Binding. J Chem Theory Comput 2020; 16:4669-4684. [PMID: 32450041 PMCID: PMC8594251 DOI: 10.1021/acs.jctc.0c00142] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/30/2022]
Abstract
Accuracy of protein-ligand binding free energy calculations utilizing implicit solvent models is critically affected by parameters of the underlying dielectric boundary, specifically, the atomic and water probe radii. Here, a global multidimensional optimization pipeline is developed to find optimal atomic radii specifically for protein-ligand binding calculations in implicit solvent. The computational pipeline has these three key components: (1) a massively parallel implementation of a deterministic global optimization algorithm (VTDIRECT95), (2) an accurate yet reasonably fast generalized Born implicit solvent model (GBNSR6), and (3) a novel robustness metric that helps distinguish between nearly degenerate local minima via a postprocessing step of the optimization. A graph-based "kT-connectivity" approach to explore and visualize the multidimensional energy landscape is proposed: local minima that can be reached from the global minimum without exceeding a given energy threshold (kT) are considered to be connected. As an illustration of the capabilities of the optimization pipeline, we apply it to find a global optimum in the space of just five radii: four atomic (O, H, N, and C) radii and water probe radius. The optimized radii, ρW = 1.37 Å, ρC = 1.40 Å, ρH = 1.55 Å, ρN = 2.35 Å, and ρO = 1.28 Å, lead to a closer agreement of electrostatic binding free energies with the explicit solvent reference than two commonly used sets of radii previously optimized for small molecules. At the same time, the ability of the optimizer to find the global optimum reveals fundamental limits of the common two-dielectric implicit solvation model: the computed electrostatic binding free energies are still almost 4 kcal/mol away from the explicit solvent reference. The proposed computational approach opens the possibility to further improve the accuracy of practical computational protocols for binding free energy calculations.
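The "kT-connectivity" idea described in this abstract can be illustrated with a small graph search. This is only a sketch under stated assumptions, not the authors' implementation: it assumes the landscape has already been reduced to a set of local minima plus, for each neighbouring pair, the highest objective value met on the path between them, and it assumes the threshold is measured relative to the global-minimum value. The names `kt_connected`, `energies`, and `barriers` are hypothetical.

```python
from collections import deque


def kt_connected(energies, barriers, kT):
    """Return the set of minima reachable from the global minimum.

    energies: dict mapping minimum label -> objective value
    barriers: dict mapping (label_a, label_b) -> highest value on the
              path between the two minima (treated as undirected)
    kT:       threshold above the global-minimum value (assumption)
    """
    global_min = min(energies, key=energies.get)
    limit = energies[global_min] + kT

    # Keep only edges whose barrier does not exceed the threshold.
    adjacency = {}
    for (a, b), barrier in barriers.items():
        if barrier <= limit:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)

    # Breadth-first search from the global minimum.
    seen = {global_min}
    queue = deque([global_min])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Toy landscape with made-up values, for illustration only.
toy_energies = {"A": 0.0, "B": 0.3, "C": 0.4, "D": 0.2}
toy_barriers = {("A", "B"): 0.5, ("B", "C"): 0.55, ("A", "D"): 1.5}

connected_wide = kt_connected(toy_energies, toy_barriers, kT=0.6)
connected_narrow = kt_connected(toy_energies, toy_barriers, kT=0.52)
```

In the toy landscape, minimum D is never connected because its only path to A crosses a barrier of 1.5, while C drops out once the threshold falls below the B-C barrier.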
Affiliation(s)
- Negin Forouzesh
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
| | - Abhishek Mukhopadhyay
- Department of Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
| | - Layne T Watson
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Department of Mathematics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Department of Aerospace and Ocean Engineering, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
| | - Alexey V Onufriev
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Department of Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
- Center for Soft Matter and Biological Physics, Virginia Polytechnic Institute & State University, Blacksburg, Virginia 24061, United States
|
14
|
Horvath D, Marcou G, Varnek A. "Big Data" Fast Chemoinformatics Model to Predict Generalized Born Radius and Solvent Accessibility as a Function of Geometry. J Chem Inf Model 2020; 60:2951-2965. [PMID: 32374171 DOI: 10.1021/acs.jcim.9b01172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The Generalized Born (GB) solvent model offers the best accuracy/computing effort ratio, yet it requires drastic simplifications to estimate the Effective Born Radii (EBR) while bypassing an overly expensive volume integration step. EBRs are a measure of the degree of burial of an atom and are not very sensitive to small changes of geometry: in molecular dynamics, the costly EBR update procedure is not mandatory at every step. This work, however, aims at implementing a GB model into the Sampler for Multiple Protein-Ligand Entities (S4MPLE) evolutionary algorithm, with mandatory EBR updates at each step, because each step can trigger arbitrarily large geometric changes. Therefore, a quantitative structure-property relationship has been developed in order to express the EBRs as a linear function of both the topological neighborhood and geometric occupancy of the space around atoms. A training set of 810 molecular systems, ranging from fragment-like to drug-like compounds, proteins, host-guest systems, and ligand-protein complexes, has been compiled. For each species, S4MPLE generated several hundred random conformers. For each atom in each geometry of each species, its "standard" EBR was calculated by numeric integration and associated with topological and geometric descriptors of the atom neighborhood. This training set (EBR, atom descriptors) involving >5 M entries was subjected to a bootstrapping multilinear regression process with descriptor selection. In parallel, the strategy was repurposed to also learn atomic solvent-accessible areas (SA) based on the same descriptors. The resulting linear equations were challenged to predict EBR and SA values for a similarly compiled external set of >2000 new molecular systems. Solvation energies calculated with estimated EBR and SA match "standard" energies within the typical error of a force-field-based approach (a few kilocalories per mole). Given the extreme diversity of molecular systems covered by the model, this simple EBR/SA estimator covers a vast applicability domain.
Affiliation(s)
- Dragos Horvath
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
| | - Gilles Marcou
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
|
15
|
Wang E, Liu H, Wang J, Weng G, Sun H, Wang Z, Kang Y, Hou T. Development and Evaluation of MM/GBSA Based on a Variable Dielectric GB Model for Predicting Protein–Ligand Binding Affinities. J Chem Inf Model 2020; 60:5353-5365. [DOI: 10.1021/acs.jcim.0c00024] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/30/2022]
Affiliation(s)
- Ercheng Wang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou Zhejiang 310058, China
| | - Hui Liu
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou Zhejiang 310058, China
| | - Junmei Wang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Gaoqi Weng
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou Zhejiang 310058, China
| | - Huiyong Sun
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou Zhejiang 310058, China
| | - Zhe Wang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou Zhejiang 310058, China
| | - Yu Kang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou Zhejiang 310058, China
| | - Tingjun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou Zhejiang 310058, China
|
16
|
Wang E, Weng G, Sun H, Du H, Zhu F, Chen F, Wang Z, Hou T. Assessing the performance of the MM/PBSA and MM/GBSA methods. 10. Impacts of enhanced sampling and variable dielectric model on protein-protein Interactions. Phys Chem Chem Phys 2019; 21:18958-18969. [PMID: 31453590 DOI: 10.1039/c9cp04096j] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Indexed: 12/30/2022]
Abstract
Enhanced sampling has been extensively used to capture the conformational transitions in protein folding, but it has attracted much less attention in studies of protein-protein recognition. In this study, we evaluated the impact of enhanced sampling methods and solute dielectric constants on the overall accuracy of the molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics/generalized Born surface area (MM/GBSA) approaches for protein-protein binding free energy calculations. Here, two widely used enhanced sampling methods, aMD and GaMD, and conventional molecular dynamics (cMD) simulations with two AMBER force fields (ff03 and ff14SB) were used to sample the conformations for 21 protein-protein complexes. The MM/PBSA and MM/GBSA calculation results illustrate that the standard MM/GBSA based on the cMD simulations yields the best Pearson correlation (rp = -0.523) between the predicted binding affinities and the experimental data, which is much stronger than that given by MM/PBSA (rp = -0.212). The two enhanced sampling methods (aMD and GaMD) are indeed more efficient for conformational sampling, but they did not improve the binding affinity predictions for protein-protein systems, suggesting that aMD or GaMD sampling (at least in short timescale simulations) may not be a good choice for the MM/PBSA and MM/GBSA predictions of protein-protein complexes. A solute dielectric constant of 1.0 is recommended for MM/GBSA, but a higher solute dielectric constant is recommended for MM/PBSA, especially for systems with higher polarity on the protein-protein binding interfaces. Then, a preliminary assessment of the MM/GBSA calculations based on a variable dielectric generalized Born (VDGB) model was conducted. The results highlight the potential power of VDGB in the free energy predictions for protein-protein systems, but more thorough studies should be done in the future.
Affiliation(s)
- Ercheng Wang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
| | - Gaoqi Weng
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
| | - Huiyong Sun
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
| | - Hongyan Du
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
| | - Feng Zhu
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
| | - Fu Chen
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
| | - Zhe Wang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
| | - Tingjun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China, and State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
17
|
Wang E, Sun H, Wang J, Wang Z, Liu H, Zhang JZH, Hou T. End-Point Binding Free Energy Calculation with MM/PBSA and MM/GBSA: Strategies and Applications in Drug Design. Chem Rev 2019; 119:9478-9508. [DOI: 10.1021/acs.chemrev.9b00055] [Citation(s) in RCA: 578] [Impact Index Per Article: 115.6] [Indexed: 12/19/2022]
Affiliation(s)
- Ercheng Wang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Huiyong Sun
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Junmei Wang
- Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Zhe Wang
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Hui Liu
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - John Z. H. Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU−ECNU Center for Computational Chemistry, NYU Shanghai, Shanghai 200122, China
- Department of Chemistry, New York University, New York, New York 10003, United States
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
| | - Tingjun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
18
|
Abstract
It would often be useful in computer simulations to use an implicit description of solvation effects, instead of explicitly representing the individual solvent molecules. Continuum dielectric models often work well in describing the thermodynamic aspects of aqueous solvation and can be very efficient compared to the explicit treatment of the solvent. Here, we review a particular class of so-called fast implicit solvent models, generalized Born (GB) models, which are widely used for molecular dynamics (MD) simulations of proteins and nucleic acids. These approaches model hydration effects and provide solvent-dependent forces with efficiencies comparable to molecular-mechanics calculations on the solute alone; as such, they can be incorporated into MD or other conformational searching strategies in a straightforward manner. The foundations of the GB model are reviewed, followed by examples of newer, emerging models and examples of important applications. We discuss their strengths and weaknesses, both for fidelity to the underlying continuum model and for the ability to replace explicit consideration of solvent molecules in macromolecular simulations.
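To make the "foundations of the GB model" mentioned in this abstract concrete, here is a minimal sketch of the widely used canonical GB expression for the polar solvation energy, in the form introduced by Still and co-workers. It is not the code of any particular package: the effective Born radii are simply taken as inputs (how they are computed is what distinguishes the various GB flavors), and the Coulomb conversion constant and the solvent dielectric of 78.5 are assumed illustrative values.

```python
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); an assumed value.
COULOMB_KCAL = 332.06


def f_gb(r, radius_i, radius_j):
    """Canonical GB interpolation function.

    Reduces to the Born radius when r = 0 and to r at large separation.
    """
    product = radius_i * radius_j
    return math.sqrt(r * r + product * math.exp(-r * r / (4.0 * product)))


def gb_polar_energy(coords, charges, born_radii, eps_in=1.0, eps_out=78.5):
    """Polar solvation energy (kcal/mol) from the canonical GB formula.

    coords:     list of (x, y, z) tuples in Angstrom
    charges:    partial charges in units of e
    born_radii: effective Born radii in Angstrom (supplied by the caller)
    """
    prefactor = -0.5 * COULOMB_KCAL * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    n = len(charges)
    # The double sum runs over all pairs, including the i == j self terms.
    for i in range(n):
        for j in range(n):
            r = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return prefactor * total


# A single monovalent ion recovers the Born formula for an ion of radius 2 Angstrom.
born_ion_energy = gb_polar_energy([(0.0, 0.0, 0.0)], [1.0], [2.0])
```

For a single charge the double sum collapses to the self term, so the function returns the classic Born ion energy; this is a convenient sanity check before using any set of effective radii.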
Affiliation(s)
- Alexey V Onufriev
- Departments of Computer Science and Physics, Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24060, USA;
| | - David A Case
- Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA;
|
19
|
Tolokh IS, Thomas DG, Onufriev AV. Explicit ions/implicit water generalized Born model for nucleic acids. J Chem Phys 2018; 148:195101. [PMID: 30307229 DOI: 10.1063/1.5027260] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
The ion atmosphere around highly charged nucleic acid molecules plays a significant role in their dynamics, structure, and interactions. Here we utilized the implicit solvent framework to develop a model for the explicit treatment of ions interacting with nucleic acid molecules. The proposed explicit ions/implicit water model is based on a significantly modified generalized Born (GB) model and utilizes a non-standard approach to define the solute/solvent dielectric boundary. Specifically, the model includes modifications to the GB interaction terms for the case of multiple interacting solutes, that is, a disconnected dielectric boundary around the solute-ion or ion-ion pairs. A fully analytical description of all energy components for charge-charge interactions is provided. The effectiveness of the approach is demonstrated by calculating the potential of mean force for the Na+-Cl- ion pair and by carrying out a set of Monte Carlo (MC) simulations of mono- and trivalent ions interacting with DNA and RNA duplexes. The monovalent (Na+) and trivalent (CoHex3+) counterion distributions predicted by the model are in close quantitative agreement with all-atom explicit water molecular dynamics simulations used as reference. Expressed in the units of energy, the maximum deviations of local ion concentrations from the reference are within kBT. The proposed explicit ions/implicit water GB model is able to resolve subtle features and differences of CoHex distributions around DNA and RNA duplexes. These features include preferential CoHex binding inside the major groove of the RNA duplex, in contrast to CoHex binding at the "external" surface of the sugar-phosphate backbone of the DNA duplex; these differences in the counterion binding patterns were earlier shown to be responsible for the observed drastic differences in condensation propensities between short DNA and RNA duplexes. MC simulations of CoHex ions interacting with the homopolymeric poly(dA·dT) DNA duplex with modified (de-methylated) and native thymine bases are used to explore the physics behind CoHex-thymine interactions. The simulations suggest that the ion desolvation penalty due to proximity to the low dielectric volume of the methyl group can contribute significantly to CoHex-thymine interactions. Compared to the steric repulsion between the ion and the methyl group, the desolvation penalty interaction has a longer range and may be important to consider in the context of methylation effects on DNA condensation.
Affiliation(s)
- Igor S Tolokh
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, USA
| | - Dennis G Thomas
- Computational Biology, Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, USA
| | - Alexey V Onufriev
- Departments of Computer Science and Physics, Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, USA
|
20
|
Huang Y, Harris RC, Shen J. Generalized Born Based Continuous Constant pH Molecular Dynamics in Amber: Implementation, Benchmarking and Analysis. J Chem Inf Model 2018; 58:1372-1383. [PMID: 29949356 DOI: 10.1021/acs.jcim.8b00227] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 01/31/2023]
Abstract
Solution pH plays an important role in the structure and dynamics of biomolecular systems; however, pH effects cannot be accurately accounted for in conventional molecular dynamics simulations based on fixed protonation states. Continuous constant pH molecular dynamics (CpHMD) based on the λ-dynamics framework calculates protonation states on the fly during dynamical simulation at a specified pH condition. Here we report the CPU-based implementation of the CpHMD method based on the GBNeck2 generalized Born (GB) implicit-solvent model in the pmemd engine of the Amber molecular dynamics package. The performance of the method was tested using pH replica-exchange titration simulations of Asp, Glu and His side chains in 4 miniproteins and 7 enzymes with experimentally known pKa's, some of which are significantly shifted from the model values. The added computational cost due to CpHMD titration ranges from 11 to 33% for the data set and scales roughly linearly with the ratio between the number of titratable sites and the number of solute atoms. Comparison of the experimental and calculated pKa's using 2 ns per replica sampling yielded a mean unsigned error of 0.70, a root-mean-squared error of 0.91, and a linear correlation coefficient of 0.79. Though this level of accuracy is similar to the GBSW-based CpHMD in CHARMM, in contrast to the latter, the current implementation was able to reproduce the experimental orders of the pKa's of the coupled carboxylic dyads. We quantified the sampling errors, which revealed that prolonged simulation is needed to converge pKa's of several titratable groups involved in salt-bridge-like interactions or deeply buried in the protein interior. Our benchmark data demonstrate that GBNeck2-CpHMD is an attractive tool for protein pKa predictions.
Affiliation(s)
- Yandong Huang
- Department of Pharmaceutical Sciences , University of Maryland School of Pharmacy , Baltimore , Maryland 21201 , United States
| | - Robert C Harris
- Department of Pharmaceutical Sciences , University of Maryland School of Pharmacy , Baltimore , Maryland 21201 , United States
| | - Jana Shen
- Department of Pharmaceutical Sciences , University of Maryland School of Pharmacy , Baltimore , Maryland 21201 , United States
|
21
|
Izadi S, Harris RC, Fenley MO, Onufriev AV. Accuracy Comparison of Generalized Born Models in the Calculation of Electrostatic Binding Free Energies. J Chem Theory Comput 2018; 14:1656-1670. [PMID: 29378399 DOI: 10.1021/acs.jctc.7b00886] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/21/2022]
Abstract
The need for accurate yet efficient representation of the aqueous environment in biomolecular modeling has led to the development of a variety of generalized Born (GB) implicit solvent models. While many studies have focused on the accuracy of available GB models in predicting solvation free energies, a systematic assessment of the quality of these models in binding free energy calculations, crucial for rational drug design, has not been undertaken. Here, we evaluate the accuracies of eight common GB flavors (GB-HCT, GB-OBC, GB-neck2, GBNSR6, GBSW, GBMV1, GBMV2, and GBMV3), available in major molecular dynamics packages, in predicting the electrostatic binding free energies (ΔΔGel) for a diverse set of 60 biomolecular complexes belonging to four main classes: protein-protein, protein-drug, RNA-peptide, and small complexes. The GB flavors are examined in terms of their ability to reproduce the results from the Poisson-Boltzmann (PB) model, commonly used as accuracy reference in this context. We show that the agreement with the PB of ΔΔGel estimates varies widely between different GB models and also across different types of biomolecular complexes, with R2 correlations ranging from 0.3772 to 0.9986. A surface-based "R6" GB model recently implemented in AMBER shows the closest overall agreement with reference PB (R2 = 0.9949, RMSD = 8.75 kcal/mol). The RNA-peptide and protein-drug complex sets appear to be most challenging for all but one model, as indicated by the large deviations from the PB in ΔΔGel. Small neutral complexes present the least challenge for most of the GB models tested. The quantitative demonstration of the strengths and weaknesses of the GB models across the diverse complex types provided here can be used as a guide for practical computations and future development efforts.
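The abstract reports both an R2 correlation and an RMSD against the PB reference, and the two metrics measure different things: a set of estimates can correlate almost perfectly with the reference and still deviate from it by several kcal/mol. The sketch below shows how the two quantities could be computed, assuming R2 means the squared Pearson correlation coefficient; the function names and the toy numbers are hypothetical and are not data from the study.

```python
import math


def rmsd(values, reference):
    """Root-mean-square deviation between two equally long sequences."""
    n = len(values)
    return math.sqrt(sum((v - r) ** 2 for v, r in zip(values, reference)) / n)


def r_squared(values, reference):
    """Squared Pearson correlation coefficient (assumed meaning of R2)."""
    n = len(values)
    mean_v = sum(values) / n
    mean_r = sum(reference) / n
    cov = sum((v - mean_v) * (r - mean_r) for v, r in zip(values, reference))
    var_v = sum((v - mean_v) ** 2 for v in values)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov * cov / (var_v * var_r)


# Made-up numbers: the "GB" values are the "PB" values shifted by 5 kcal/mol.
toy_pb = [-10.0, -20.0, -30.0, -40.0]
toy_gb = [value + 5.0 for value in toy_pb]

toy_r2 = r_squared(toy_gb, toy_pb)
toy_rmsd = rmsd(toy_gb, toy_pb)
```

In this toy case the correlation is perfect while the RMSD equals the 5 kcal/mol shift, which is why reporting both numbers, as the abstract does, gives a fuller picture than either alone.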
Affiliation(s)
- Saeed Izadi
- Early Stage Pharmaceutical Development , Genentech Inc. , 1 DNA Way , South San Francisco , California 94080 , United States
| | - Robert C Harris
- Department of Pharmaceutical Sciences , University of Maryland School of Pharmacy , Baltimore , Maryland 21201 , United States
| | - Marcia O Fenley
- Institute of Molecular Biophysics , Florida State University , Tallahassee , Florida 32306-3408 , United States
|