1
Zeng J, Giese TJ, Ekesan Ş, York DM. Development of Range-Corrected Deep Learning Potentials for Fast, Accurate Quantum Mechanical/Molecular Mechanical Simulations of Chemical Reactions in Solution. J Chem Theory Comput 2021; 17:6993-7009. [PMID: 34644071 DOI: 10.1021/acs.jctc.1c00201] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Indexed: 12/15/2022]
Abstract
We develop a new deep potential-range correction (DPRc) machine learning potential for combined quantum mechanical/molecular mechanical (QM/MM) simulations of chemical reactions in the condensed phase. The new range correction enables short-ranged QM/MM interactions to be tuned for higher accuracy, and the correction smoothly vanishes within a specified cutoff. We further develop an active learning procedure for robust neural network training. We test the DPRc model and training procedure against a series of six nonenzymatic phosphoryl transfer reactions in solution that are important in mechanistic studies of RNA-cleaving enzymes. Specifically, we apply DPRc corrections to a base QM model and test its ability to reproduce free-energy profiles generated from a target QM model. We perform these comparisons using the MNDO/d and DFTB2 semiempirical models because they differ in the way they treat orbital orthogonalization and electrostatics and produce free-energy profiles which differ significantly from each other, thereby providing us a rigorous stress test for the DPRc model and training procedure. The comparisons show that accurate reproduction of the free-energy profiles requires correction of the QM/MM interactions out to 6 Å. We further find that the model's initial training benefits from generating data from temperature replica exchange simulations and including high-temperature configurations into the fitting procedure, so the resulting models are trained to properly avoid high-energy regions. A single DPRc model was trained to reproduce four different reactions and yielded good agreement with the free-energy profiles made from the target QM/MM simulations. The DPRc model was further demonstrated to be transferable to 2D free-energy surfaces and 1D free-energy profiles that were not explicitly considered in the training.
Examination of the computational performance of the DPRc model showed that it was fairly slow when run on CPUs but was sped up almost 100-fold when using NVIDIA V100 GPUs, resulting in almost negligible overhead. The new DPRc model and training procedure provide a potentially powerful new tool for the creation of next-generation QM/MM potentials for a wide spectrum of free-energy applications ranging from drug discovery to enzyme design.
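The abstract states that the range correction "smoothly vanishes within a specified cutoff" but does not give the functional form. As an illustration only, the sketch below uses a generic quintic switching function, which is an assumption and not necessarily the form used in DPRc. The names `smooth_switch` and `switched_correction` and the switch-on distance `r_on = 5.0` Å are hypothetical; the 6 Å cutoff is taken from the abstract.

```python
def smooth_switch(r, r_on, r_cut):
    """Generic quintic switch (an assumed form, not the published DPRc one).

    Returns 1 for r <= r_on, 0 for r >= r_cut, and a polynomial ramp in
    between whose first and second derivatives vanish at both ends.
    """
    if r <= r_on:
        return 1.0
    if r >= r_cut:
        return 0.0
    x = (r - r_on) / (r_cut - r_on)  # scaled distance in (0, 1)
    return 1.0 - 10.0 * x**3 + 15.0 * x**4 - 6.0 * x**5


def switched_correction(raw_correction, r, r_on=5.0, r_cut=6.0):
    """Scale a hypothetical short-range energy correction so that it
    vanishes smoothly at r_cut (6 Å, the range quoted in the abstract)."""
    return raw_correction * smooth_switch(r, r_on, r_cut)


if __name__ == "__main__":
    for r in (4.0, 5.0, 5.5, 6.0, 7.0):
        print(r, switched_correction(1.0, r))
```

A switch of this kind keeps the corrected energy and its forces continuous at the cutoff, which is the property the abstract attributes to the range correction.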
Affiliation(s)
- Jinzhe Zeng
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine, and Department of Chemistry and Chemical Biology, Rutgers the State University of New Jersey, New Brunswick, New Jersey 08901-8554, United States
- Timothy J Giese
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine, and Department of Chemistry and Chemical Biology, Rutgers the State University of New Jersey, New Brunswick, New Jersey 08901-8554, United States
- Şölen Ekesan
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine, and Department of Chemistry and Chemical Biology, Rutgers the State University of New Jersey, New Brunswick, New Jersey 08901-8554, United States
- Darrin M York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine, and Department of Chemistry and Chemical Biology, Rutgers the State University of New Jersey, New Brunswick, New Jersey 08901-8554, United States
2
Giese TJ, Ekesan Ş, York DM. Extension of the Variational Free Energy Profile and Multistate Bennett Acceptance Ratio Methods for High-Dimensional Potential of Mean Force Profile Analysis. J Phys Chem A 2021; 125:4216-4232. [PMID: 33784093 DOI: 10.1021/acs.jpca.1c00736] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/16/2023]
Abstract
We redevelop the variational free energy profile (vFEP) method using a cardinal B-spline basis to extend the method for analyzing free energy surfaces (FESs) involving three or more reaction coordinates. We also implemented software for evaluating high-dimensional profiles based on the multistate Bennett acceptance ratio (MBAR) method which constructs an unbiased probability density from global reweighting of the observed samples. The MBAR method takes advantage of a fast algorithm for solving the unbinned weighted histogram (UWHAM)/MBAR equations which replaces the solution of simultaneous equations with a nonlinear optimization of a convex function. We make use of cardinal B-splines and multiquadric radial basis functions to obtain smooth, differentiable MBAR profiles in arbitrary high dimensions. The cardinal B-spline vFEP and MBAR methods are compared using three example systems that examine 1D, 2D, and 3D profiles. Both methods are found to be useful and produce nearly indistinguishable results. The vFEP method is found to be 150 times faster than MBAR when applied to periodic 2D profiles, but the MBAR method is 4.5 times faster than vFEP when evaluating unbounded 3D profiles. In agreement with previous comparisons, we find the vFEP method produces superior FESs when the overlap between umbrella window simulations decreases. Finally, the associative reaction mechanism of hammerhead ribozyme is characterized using 3D, 4D, and 6D profiles, and the higher-dimensional profiles are found to have smaller reaction barriers by as much as 1.5 kcal/mol. The methods presented here have been implemented into the FE-ToolKit software package along with new methods for network-wide free energy analysis in drug discovery.
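The abstract mentions replacing the simultaneous UWHAM/MBAR equations with the minimization of a convex function. A minimal sketch of that idea follows, using the standard convex MBAR objective and plain gradient descent. The optimizer, the function names, and the toy data are assumptions for illustration; they are not the fast algorithm or the FE-ToolKit implementation described in the paper.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mbar_objective_and_gradient(f, u, n_k):
    """Convex MBAR/UWHAM objective and its gradient.

    f[k]    : trial reduced free energy of state k
    u[k][n] : reduced energy of pooled sample n evaluated in state k
    n_k[k]  : number of samples drawn from state k (must be > 0)
    """
    n_states, n_total = len(n_k), sum(n_k)
    log_nk = [math.log(c) for c in n_k]
    objective = -sum(n_k[k] * f[k] for k in range(n_states)) / n_total
    gradient = [-n_k[k] / n_total for k in range(n_states)]
    for n in range(n_total):
        terms = [f[k] + log_nk[k] - u[k][n] for k in range(n_states)]
        log_denom = logsumexp(terms)
        objective += log_denom / n_total
        for k in range(n_states):
            gradient[k] += math.exp(terms[k] - log_denom) / n_total
    return objective, gradient


def solve_mbar(u, n_k, step=1.0, tol=1e-10, max_iter=100000):
    """Minimize the objective by gradient descent (a simple stand-in
    optimizer) and return free energies relative to state 0."""
    f = [0.0] * len(n_k)
    for _ in range(max_iter):
        _, grad = mbar_objective_and_gradient(f, u, n_k)
        if max(abs(g) for g in grad) < tol:
            break
        f = [fk - step * gk for fk, gk in zip(f, grad)]
    return [fk - f[0] for fk in f]


if __name__ == "__main__":
    # Toy data: state 1 is state 0 shifted by a constant 1.5 (reduced units),
    # so the free energy difference should come out as 1.5.
    n_k = [3, 2]
    u = [[0.0] * 5, [1.5] * 5]
    print(solve_mbar(u, n_k))
```

Setting the gradient to zero recovers the usual self-consistent MBAR equations, so the minimizer of this function is the MBAR solution up to an additive constant.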
Affiliation(s)
- Timothy J Giese
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854-8087, United States
- Şölen Ekesan
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854-8087, United States
- Darrin M York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854-8087, United States
3
Gaines CS, Piccirilli JA, York DM. The L-platform/L-scaffold framework: a blueprint for RNA-cleaving nucleic acid enzyme design. RNA (New York, N.Y.) 2020; 26:111-125. [PMID: 31776179 PMCID: PMC6961537 DOI: 10.1261/rna.071894.119] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/07/2019] [Accepted: 11/14/2019] [Indexed: 05/13/2023]
Abstract
We develop an L-platform/L-scaffold framework we hypothesize may serve as a blueprint to facilitate site-specific RNA-cleaving nucleic acid enzyme design. Building on the L-platform motif originally described by Suslov and coworkers, we identify new critical scaffolding elements required to anchor a conserved general base guanine ("L-anchor") and bind functionally important metal ions at the active site ("L-pocket"). Molecular simulations, together with a broad range of experimental structural and functional data, connect the L-platform/L-scaffold elements to necessary and sufficient conditions for catalytic activity. We demonstrate that the L-platform/L-scaffold framework is common to five of the nine currently known naturally occurring ribozyme classes (Twr, HPr, VSr, HHr, Psr), and intriguingly from a design perspective, the framework also appears in an artificially engineered DNAzyme (8-17dz). The flexibility of the L-platform/L-scaffold framework is illustrated on these systems, highlighting modularity and trends in the variety of known general acid moieties that are supported. These trends give rise to two distinct catalytic paradigms, building on the classifications proposed by Wilson and coworkers and named for the implicated general base and acid. The "G + A" paradigm (Twr, HPr, VSr) exclusively utilizes nucleobase residues for chemistry, and the "G + M⁺" paradigm (HHr, 8-17dz, Psr) involves structuring of the "L-pocket" metal ion binding site for recruitment of a divalent metal ion that plays an active role in the chemical steps of the reaction. Finally, the modularity of the L-platform/L-scaffold framework is illustrated in the VS ribozyme where the "L-pocket" assumes the functional role of the "L-anchor" element, highlighting a distinct mechanism, but one that is functionally linked with the hammerhead ribozyme.
Affiliation(s)
- Colin S Gaines
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine, and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Joseph A Piccirilli
  - Department of Biochemistry and Molecular Biology and Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, USA
- Darrin M York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine, and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
4
Ekesan Ş, York DM. Framework for Conducting and Analyzing Crystal Simulations of Nucleic Acids to Aid in Modern Force Field Evaluation. J Phys Chem B 2019; 123:4611-4624. [PMID: 31002511 PMCID: PMC6614744 DOI: 10.1021/acs.jpcb.8b11923] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Abstract
Crystal simulations provide useful tools, along with solution simulations, to test nucleic acid force fields, but should be interpreted with care owing to the difficulty of establishing the environmental conditions needed to reproduce experimental crystal packing. These challenges underscore the need to construct proper protocols for carrying out crystal simulations and analyzing results to identify the origin of deviations from crystallographic data. Toward this end, we introduce a novel framework for B-factor decomposition into additive intramolecular, rotational, and translational atomic fluctuation components and partitioning of each of these components into individual asymmetric unit and lattice contributions. We apply the framework to a benchmark set of A-DNA, Z-DNA, and B-DNA double helix systems of various chain lengths. Overall, the intramolecular deviations from the crystal were quite small (≤1.0 Å), suggesting high accuracy of the force field, whereas crystal packing was not well reproduced. The present work establishes a framework to conduct and analyze crystal simulations that ultimately take on issues of crystal packing and can provide insight into nucleic acid force fields.
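For orientation, the quantity being decomposed can be related to simulation data through the standard isotropic relation B = (8π²/3)⟨|Δr|²⟩, where ⟨|Δr|²⟩ is an atom's mean-square positional fluctuation. The sketch below shows only this conversion; it is not the authors' decomposition procedure, and the function names and toy coordinates are hypothetical.

```python
import math


def mean_square_fluctuation(positions):
    """Mean-square fluctuation <|r - <r>|^2> (Å^2) of one atom.

    positions: list of (x, y, z) coordinates in Å, one per frame,
    assumed to be already aligned to a common reference.
    """
    n = len(positions)
    mean = [sum(p[i] for p in positions) / n for i in range(3)]
    return sum(
        sum((p[i] - mean[i]) ** 2 for i in range(3)) for p in positions
    ) / n


def b_factor(msf):
    """Isotropic B-factor (Å^2) from a 3D mean-square fluctuation (Å^2)."""
    return (8.0 * math.pi ** 2 / 3.0) * msf


if __name__ == "__main__":
    # Toy trajectory: an atom alternating between two points 2 Å apart.
    frames = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(b_factor(mean_square_fluctuation(frames)))
```

Because the relation is linear in the mean-square fluctuation, any fluctuation components that add (such as the intramolecular, rotational, and translational terms named in the abstract) map to B-factor contributions that add as well.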
Affiliation(s)
- Şölen Ekesan
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Darrin M York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
5
Giambasu GM, Case DA, York DM. Predicting Site-Binding Modes of Ions and Water to Nucleic Acids Using Molecular Solvation Theory. J Am Chem Soc 2019; 141:2435-2445. [PMID: 30632365 PMCID: PMC6574206 DOI: 10.1021/jacs.8b11474] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Indexed: 12/27/2022]
Abstract
Site binding of ions and water shapes nucleic acids folding, dynamics, and biological function, complementing the more diffuse, nonspecific "territorial" ion binding. Unlike territorial binding, prediction of site-specific binding to nucleic acids remains an unsolved challenge in computational biophysics. This work presents a new toolset based on the 3D-RISM molecular solvation theory and topological analysis that predicts cation and water site binding to nucleic acids. 3D-RISM is shown to accurately capture alkali cations and water binding to the central channel, transversal loops, and grooves of the Oxytricha nova's telomeres' G-quadruplex (Oxy-GQ), in agreement with high-resolution crystallographic data. To improve the computed cation occupancy along the Oxy-GQ central channel, it was necessary to refine and validate new cation-oxygen parameters using structural and thermodynamic data available for crown ethers and ion channels. This single set of parameters that describes both localized and delocalized binding to various biological systems is used to gain insight into cation occupancy along the Oxy-GQ channel under various salt conditions. The paper concludes with prospects for extending the method to predict divalent cation binding to nucleic acids. This work advances the forefront of theoretical methods able to provide predictive insight into ion atmosphere effects on nucleic acids function.
Affiliation(s)
- George M. Giambasu
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, United States
- David A. Case
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, United States
- Darrin M. York
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, United States
  - Laboratory for Biomolecular Simulation Research, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, United States
  - Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, United States