2. Bass L, Elder LH, Folescu DE, Forouzesh N, Tolokh IS, Karpatne A, Onufriev AV. Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment. J Chem Theory Comput 2024; 20:396-410. [PMID: 38149593 PMCID: PMC10950260 DOI: 10.1021/acs.jctc.3c00981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
The accuracy of computational models of water is key to atomistic simulations of biomolecules. We propose a computationally efficient way to improve the accuracy of the prediction of hydration-free energies (HFEs) of small molecules: the remaining errors of the physics-based models relative to the experiment are predicted and mitigated by machine learning (ML) as a postprocessing step. Specifically, the trained graph convolutional neural network attempts to identify the "blind spots" in the physics-based model predictions, where the complex physics of aqueous solvation is poorly accounted for, and partially corrects for them. The strategy is explored for five classical solvent models representing various accuracy/speed trade-offs, from the fast analytical generalized Born (GB) to the popular TIP3P explicit solvent model; experimental HFEs of small neutral molecules from the FreeSolv set are used for the training and testing. For all of the models, the ML correction reduces the resulting root-mean-square error relative to the experiment for HFEs of small molecules, without significant overfitting and with negligible computational overhead. For example, on the test set, the relative accuracy improvement is 47% for the fast analytical GB, making it, after the ML correction, almost as accurate as uncorrected TIP3P. For the TIP3P model, the accuracy improvement is about 39%, bringing the ML-corrected model's accuracy below the 1 kcal/mol threshold. In general, the relative benefit of the ML corrections is smaller for more accurate physics-based models, reaching the lower limit of about 20% relative accuracy gain compared with that of the physics-based treatment alone. The proposed strategy of using ML to learn the remaining error of physics-based models offers a distinct advantage over training ML alone directly on reference HFEs: it preserves the correct overall trend, even well outside of the training set.
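The postprocessing idea described in the abstract (learn the remaining error of a physics-based prediction, then add the learned correction back) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic rather than FreeSolv, the two descriptors are hypothetical, and an ordinary least-squares fit stands in for the graph convolutional network used in the paper. The relative accuracy gain is computed as (RMSE_physics - RMSE_corrected) / RMSE_physics, which is one natural reading of the percentages quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT FreeSolv): two hypothetical descriptors per molecule.
n = 200
X = rng.normal(size=(n, 2))

# Hypothetical "experimental" values and a "physics-based" model that captures
# the main trend but carries a systematic, descriptor-dependent error.
y_exp = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n)
y_phys = 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5

train = np.arange(0, 150)
test = np.arange(150, n)


def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


# Learn the remaining error (experiment minus physics) on the training set.
# A linear least-squares model stands in for the paper's neural network.
A = np.column_stack([X, np.ones(n)])
residual = y_exp - y_phys
coef, *_ = np.linalg.lstsq(A[train], residual[train], rcond=None)

# Corrected prediction = physics-based prediction + learned correction.
y_corr = y_phys + A @ coef

rmse_phys = rmse(y_phys[test], y_exp[test])
rmse_corr = rmse(y_corr[test], y_exp[test])
relative_gain = (rmse_phys - rmse_corr) / rmse_phys

print(f"test RMSE, physics only: {rmse_phys:.3f}")
print(f"test RMSE, corrected:    {rmse_corr:.3f}")
print(f"relative accuracy gain:  {relative_gain:.1%}")
```

Because the correction is added to the physics-based prediction rather than replacing it, the overall trend of the physics model is retained wherever the learned correction is small.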
Affiliation(s)
- Lewis Bass
  - Department of Computer Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
- Luke H Elder
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Dan E Folescu
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, California 90032, United States
- Igor S Tolokh
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Anuj Karpatne
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V Onufriev
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
3. Case D, Aktulga HM, Belfon K, Cerutti DS, Cisneros GA, Cruzeiro VD, Forouzesh N, Giese TJ, Götz AW, Gohlke H, Izadi S, Kasavajhala K, Kaymak MC, King E, Kurtzman T, Lee TS, Li P, Liu J, Luchko T, Luo R, Manathunga M, Machado MR, Nguyen HM, O’Hearn KA, Onufriev AV, Pan F, Pantano S, Qi R, Rahnamoun A, Risheh A, Schott-Verdugo S, Shajan A, Swails J, Wang J, Wei H, Wu X, Wu Y, Zhang S, Zhao S, Zhu Q, Cheatham TE, Roe DR, Roitberg A, Simmerling C, York DM, Nagan MC, Merz KM. AmberTools. J Chem Inf Model 2023; 63:6183-6191. [PMID: 37805934 PMCID: PMC10598796 DOI: 10.1021/acs.jcim.3c01153] [Citation(s) in RCA: 92] [Impact Index Per Article: 92.0] [Received: 07/28/2023] [Indexed: 10/10/2023]
Abstract
AmberTools is a free and open-source collection of programs used to set up, run, and analyze molecular simulations. The newer features contained within AmberTools23 are briefly described in this Application note.
Affiliation(s)
- David A. Case
  - Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Hasan Metin Aktulga
  - Department of Computer Science and Engineering, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Kellon Belfon
  - FOG Pharmaceuticals Inc., Cambridge 02140, Massachusetts, United States
- David S. Cerutti
  - Psivant, 451 D Street, Suite 205, Boston 02210, Massachusetts, United States
- G. Andrés Cisneros
  - Department of Physics, Department of Chemistry and Biochemistry, University of Texas at Dallas, Richardson 75801, Texas, United States
- Vinícius Wilian D. Cruzeiro
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford 94305, California, United States
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles 90032, California, United States
- Timothy J. Giese
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Andreas W. Götz
  - San Diego Supercomputer Center, University of California San Diego, La Jolla 92093-0505, California, United States
- Holger Gohlke
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, Düsseldorf 40225, Germany
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, Jülich 52425, Germany
- Saeed Izadi
  - Pharmaceutical Development, Genentech, Inc., South San Francisco 94080, California, United States
- Koushik Kasavajhala
  - Laufer Center for Physical and Quantitative Biology, Department of Chemistry, Stony Brook University, Stony Brook 11794, New York, United States
- Mehmet C. Kaymak
  - Department of Computer Science and Engineering, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Edward King
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Tom Kurtzman
  - Ph.D. Programs in Chemistry, Biochemistry, and Biology, The Graduate Center of the City University of New York, 365 Fifth Avenue, New York 10016, New York, United States
  - Department of Chemistry, Lehman College, 250 Bedford Park Blvd West, Bronx 10468, New York, United States
- Tai-Sung Lee
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Pengfei Li
  - Department of Chemistry and Biochemistry, Loyola University Chicago, Chicago 60660, Illinois, United States
- Jian Liu
  - Beijing National Laboratory for Molecular Sciences, Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Tyler Luchko
  - Department of Physics and Astronomy, California State University, Northridge, Northridge 91330, California, United States
- Ray Luo
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Madushanka Manathunga
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Hai Minh Nguyen
  - Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Kurt A. O’Hearn
  - Department of Computer Science and Engineering, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Alexey V. Onufriev
  - Departments of Computer Science and Physics, Virginia Tech, Blacksburg 24061, Virginia, United States
- Feng Pan
  - Department of Statistics, Florida State University, Tallahassee 32304, Florida, United States
- Sergio Pantano
  - Institut Pasteur de Montevideo, Montevideo 11400, Uruguay
- Ruxi Qi
  - Cryo-EM Center, Southern University of Science and Technology, Shenzhen 518055, China
- Ali Rahnamoun
  - Department of Computer Science and Engineering, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Ali Risheh
  - Department of Computer Science, California State University, Los Angeles 90032, California, United States
- Stephan Schott-Verdugo
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, Jülich 52425, Germany
- Akhil Shajan
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing 48824-1322, Michigan, United States
- Jason Swails
  - Entos, 4470 W Sunset Blvd, Suite 107, Los Angeles 90027, California, United States
- Junmei Wang
  - Department of Pharmaceutical Sciences, School of Pharmacy, University of Pittsburgh, Pittsburgh 15261, Pennsylvania, United States
- Haixin Wei
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Xiongwu Wu
  - Laboratory of Computational Biology, NHLBI, NIH, Bethesda 20892, Maryland, United States
- Yongxian Wu
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Shi Zhang
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Shiji Zhao
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
  - Nurix Therapeutics, Inc., San Francisco 94158, California, United States
- Qiang Zhu
  - Departments of Molecular Biology and Biochemistry, Chemical and Biomolecular Engineering, Materials Science and Engineering, and Biomedical Engineering, Graduate Program in Chemical and Materials Physics, University of California, Irvine 92697, California, United States
- Thomas E. Cheatham
  - Department of Medicinal Chemistry, The University of Utah, 30 South 2000 East, Salt Lake City 84112, Utah, United States
- Daniel R. Roe
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda 20892, Maryland, United States
- Adrian Roitberg
  - Department of Chemistry, The University of Florida, 440 Leigh Hall, Gainesville 32611-7200, Florida, United States
- Carlos Simmerling
  - Laufer Center for Physical and Quantitative Biology, Department of Chemistry, Stony Brook University, Stony Brook 11794, New York, United States
- Darrin M. York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway 08854, New Jersey, United States
- Maria C. Nagan
  - Department of Chemistry, Stony Brook University, Stony Brook 11794, New York, United States
- Kenneth M. Merz
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing 48824-1322, Michigan, United States
4. Wu Y, Wei H, Zhu Q, Luo R. Grid-Robust Efficient Neural Interface Model for Universal Molecule Surface Construction from Point Clouds. J Phys Chem Lett 2023; 14:9034-9041. [PMID: 37782231 PMCID: PMC10577766 DOI: 10.1021/acs.jpclett.3c02176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 09/26/2023] [Indexed: 10/03/2023]
Abstract
Molecular surfaces play a pivotal role in elucidating the properties and functions of biological complexes. While various surfaces have been proposed for specific scenarios, their widespread adoption faces challenges due to limited efficiency stemming from hand-crafted modeling designs. In this work, we proposed a general framework that incorporates both the point cloud concept and neural networks. The use of matrix multiplication in this framework enables efficient implementation across diverse platforms and libraries. We applied this framework to develop the GENIUSES (Grid-robust Efficient Neural Interface for Universal Solvent-Excluded Surface) model for constructing SES. GENIUSES demonstrates high accuracy and efficiency across data sets with varying conformations and complexities. Compared to the classical implementation of SES in the AMBER software package, our framework achieved a 26-fold speedup while retaining ∼95% accuracy when ported to the GPU platform using CUDA. Greater speedups can be obtained in large-scale systems. Importantly, our model exhibits robustness against variations in the grid spacing. We have integrated this infrastructure into AMBER to enhance accessibility for research in drug screening and related fields, where efficiency is of paramount importance.
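The abstract attributes the framework's portability to its reliance on matrix multiplication over point-cloud inputs. The sketch below shows only that general pattern and is not the GENIUSES model: the atom coordinates are random, the per-grid-point features (sorted distances to the `k` nearest atoms) are a hypothetical choice, and the two-layer network uses untrained random weights, so the outputs carry no physical meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical point cloud of atom centers in a 10 x 10 x 10 box.
atoms = rng.uniform(0.0, 10.0, size=(50, 3))

# Regular grid of query points with 1.0 spacing (10 points per axis).
axis = np.arange(0.0, 10.0, 1.0)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

# Hypothetical per-grid-point features: sorted distances to the k nearest atoms.
k = 8
dist = np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=-1)  # (n_grid, n_atoms)
features = np.sort(dist, axis=1)[:, :k]                               # (n_grid, k)

# Untrained two-layer perceptron with random placeholder weights.
W1, b1 = rng.normal(size=(k, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

# All grid points are evaluated at once with two matrix multiplications.
hidden = np.maximum(features @ W1 + b1, 0.0)   # ReLU activation
values = (hidden @ W2 + b2).ravel()            # one scalar per grid point

print(features.shape, values.shape)
```

Since the whole evaluation reduces to dense array operations, the same pattern maps directly onto GPU array libraries, which is consistent with the portability argument made in the abstract.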
Affiliation(s)
- Yongxian Wu
  - Departments of Chemical and Biomolecular Engineering, Molecular Biology and Biochemistry, Materials Science and Engineering, and Biomedical Engineering, University of California, Irvine, California 92697, United States
- Haixin Wei
  - Department of Chemistry and Biochemistry, University of California, San Diego, California 92093, United States
- Qiang Zhu
  - Departments of Chemical and Biomolecular Engineering, Molecular Biology and Biochemistry, Materials Science and Engineering, and Biomedical Engineering, University of California, Irvine, California 92697, United States
- Ray Luo
  - Departments of Chemical and Biomolecular Engineering, Molecular Biology and Biochemistry, Materials Science and Engineering, and Biomedical Engineering, University of California, Irvine, California 92697, United States