1
Wu Z, Zhou T. Structural Coarse-Graining via Multiobjective Optimization with Differentiable Simulation. J Chem Theory Comput 2024; 20:2605-2617. [PMID: 38483262 DOI: 10.1021/acs.jctc.3c01348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2024]
Abstract
In the realm of multiscale molecular simulations, structure-based coarse-graining is a prominent approach for creating efficient coarse-grained (CG) representations of soft matter systems, such as polymers. This involves optimizing CG interactions by matching static correlation functions of the corresponding degrees of freedom in all-atom (AA) models. Here, we present a versatile method, namely, differentiable coarse-graining (DiffCG), which combines multiobjective optimization and differentiable simulation. The DiffCG approach is capable of constructing robust CG models by iteratively optimizing the effective potentials to simultaneously match multiple target properties. We demonstrate our approach by concurrently optimizing bonded and nonbonded potentials of a CG model of polystyrene (PS) melts. The resulting CG-PS model effectively reproduces both the structural characteristics, such as the equilibrium probability distribution of microscopic degrees of freedom and the thermodynamic pressure of the AA counterpart. More importantly, leveraging the multiobjective optimization capability, we develop a precise and efficient CG model for PS melts that is transferable across a wide range of temperatures, i.e., from 400 to 600 K. It is achieved via optimizing a pairwise potential with nonlinear temperature dependence in the CG model to simultaneously match target data from AA-MD simulations at multiple thermodynamic states. The temperature transferable CG-PS model demonstrates its ability to accurately predict the radial distribution functions and density at different temperatures, including those that are not included in the target thermodynamic states. Our work opens up a promising route for developing accurate and transferable CG models of complex soft-matter systems through multiobjective optimization with differentiable simulation.
Affiliation(s)
- Zhenghao Wu
  - Department of Chemistry, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, P. R. China
- Tianhang Zhou
  - College of Carbon Neutrality Future Technology, State Key Laboratory of Heavy Oil Processing, China University of Petroleum (Beijing), Beijing 102249, P. R. China
2
Lesniewski MC, Noid WG. Insight into the Density-Dependence of Pair Potentials for Predictive Coarse-Grained Models. J Phys Chem B 2024; 128:1298-1316. [PMID: 38271676 DOI: 10.1021/acs.jpcb.3c06890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
We investigate the temperature- and density-dependence of effective pair potentials for 1-site coarse-grained (CG) models of two industrial solvents, 1,4-dioxane and tetrahydrofuran. We observe that the calculated pair potentials are much more sensitive to density than to temperature. The generalized-Yvon-Born-Green framework reveals that this striking density-dependence reflects corresponding variations in the many-body correlations that determine the environment-mediated indirect contribution to the pair mean force. Moreover, we demonstrate, perhaps surprisingly, that this density-dependence is not important for accurately modeling the intermolecular structure. Accordingly, we adopt a density-independent interaction potential and transfer the density-dependence of the calculated pair potentials into a configuration-independent volume potential. Furthermore, we develop a single global potential that accurately models the intermolecular structure and pressure-volume equation of state across a very wide range of liquid state points. Consequently, this work provides fundamental insight into the density-dependence of effective pair potentials and also provides a significant step toward developing predictive CG models for efficiently modeling industrial solvents.
Affiliation(s)
- Maria C Lesniewski
  - Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, United States
- W G Noid
  - Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, United States
3
Maier JC, Wang CI, Jackson NE. Distilling coarse-grained representations of molecular electronic structure with continuously gated message passing. J Chem Phys 2024; 160:024109. [PMID: 38193551 DOI: 10.1063/5.0179253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Accepted: 12/14/2023] [Indexed: 01/10/2024] Open
Abstract
Bottom-up methods for coarse-grained (CG) molecular modeling are critically needed to establish rigorous links between atomistic reference data and reduced molecular representations. For a target molecule, the ideal reduced CG representation is a function of both the conformational ensemble of the system and the target physical observable(s) to be reproduced at the CG resolution. However, there is an absence of algorithms for selecting CG representations of molecules from which complex properties, including molecular electronic structure, can be accurately modeled. We introduce continuously gated message passing (CGMP), a graph neural network (GNN) method for atomically decomposing molecular electronic structure sampled over conformational ensembles. CGMP integrates 3D-invariant GNNs and a novel gated message passing system to continuously reduce the atomic degrees of freedom accessible for electronic predictions, resulting in a one-shot importance ranking of atoms contributing to a target molecular property. Moreover, CGMP provides the first approach by which to quantify the degeneracy of "good" CG representations conditioned on specific prediction targets, facilitating the development of more transferable CG representations. We further show how CGMP can be used to highlight multiatom correlations, illuminating a path to developing CG electronic Hamiltonians in terms of interpretable collective variables for arbitrarily complex molecules.
Affiliation(s)
- J Charlie Maier
  - Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Chun-I Wang
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Nicholas E Jackson
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
4
Coste A, Slejko E, Zavadlav J, Praprotnik M. Developing an Implicit Solvation Machine Learning Model for Molecular Simulations of Ionic Media. J Chem Theory Comput 2024; 20:411-420. [PMID: 38118122 PMCID: PMC10782447 DOI: 10.1021/acs.jctc.3c00984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 12/04/2023] [Accepted: 12/04/2023] [Indexed: 12/22/2023]
Abstract
Molecular dynamics (MD) simulations of biophysical systems require accurate modeling of their native environment, i.e., aqueous ionic solution, as it critically impacts the structure and function of biomolecules. On the other hand, the models should be computationally efficient to enable simulations of large spatiotemporal scales. Here, we present the deep implicit solvation model for sodium chloride solutions that satisfies both requirements. Owing to the use of the neural network potential, the model can capture the many-body potential of mean force, while the implicit water treatment renders the model inexpensive. We demonstrate our approach first for pure ionic solutions with concentrations ranging from physiological to 2 M. We then extend the model to capture the effective ion interactions in the vicinity and far away from a DNA molecule. In both cases, the structural properties are in good agreement with all-atom MD, showcasing a general methodology for the efficient and accurate modeling of ionic media.
Affiliation(s)
- Amaury Coste
  - Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
- Ema Slejko
  - Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
  - Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana SI-1000, Slovenia
- Julija Zavadlav
  - Professorship of Multiscale Modeling of Fluid Materials, TUM School of Engineering and Design, Technical University of Munich, Garching Near Munich DE-85748, Germany
- Matej Praprotnik
  - Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
  - Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana SI-1000, Slovenia
5
Jones MS, Shmilovich K, Ferguson AL. DiAMoNDBack: Diffusion-Denoising Autoregressive Model for Non-Deterministic Backmapping of Cα Protein Traces. J Chem Theory Comput 2023; 19:7908-7923. [PMID: 37906711 DOI: 10.1021/acs.jctc.3c00840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Coarse-grained molecular models of proteins permit access to length and time scales unattainable by all-atom models and the simulation of processes that occur on long time scales, such as aggregation and folding. The reduced resolution realizes computational accelerations, but an atomistic representation can be vital for a complete understanding of mechanistic details. Backmapping is the process of restoring all-atom resolution to coarse-grained molecular models. In this work, we report DiAMoNDBack (Diffusion-denoising Autoregressive Model for Non-Deterministic Backmapping) as an autoregressive denoising diffusion probability model to restore all-atom details to coarse-grained protein representations retaining only Cα coordinates. The autoregressive generation process proceeds from the protein N-terminus to C-terminus in a residue-by-residue fashion conditioned on the Cα trace and previously backmapped backbone and side-chain atoms within the local neighborhood. The local and autoregressive nature of our model makes it transferable between proteins. The stochastic nature of the denoising diffusion process means that the model generates a realistic ensemble of backbone and side-chain all-atom configurations consistent with the coarse-grained Cα trace. We train DiAMoNDBack over 65k+ structures from the Protein Data Bank (PDB) and validate it in applications to a hold-out PDB test set, intrinsically disordered protein structures from the Protein Ensemble Database (PED), molecular dynamics simulations of fast-folding mini-proteins from DE Shaw Research, and coarse-grained simulation data. We achieve state-of-the-art reconstruction performance in terms of correct bond formation, avoidance of side-chain clashes, and the diversity of the generated side-chain configurational states. We make the DiAMoNDBack model publicly available as a free and open-source Python package.
Affiliation(s)
- Michael S Jones
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Kirill Shmilovich
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
6
Peng Y, Pak AJ, Durumeric AEP, Sahrmann PG, Mani S, Jin J, Loose TD, Beiter J, Voth GA. OpenMSCG: A Software Tool for Bottom-Up Coarse-Graining. J Phys Chem B 2023; 127:8537-8550. [PMID: 37791670 PMCID: PMC10577682 DOI: 10.1021/acs.jpcb.3c04473] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/03/2023] [Revised: 09/05/2023] [Indexed: 10/05/2023]
Abstract
The "bottom-up" approach to coarse-graining, for building accurate and efficient computational models to simulate large-scale and complex phenomena and processes, is an important approach in computational chemistry, biophysics, and materials science. As one example, the Multiscale Coarse-Graining (MS-CG) approach to developing CG models can be rigorously derived using statistical mechanics applied to fine-grained, i.e., all-atom simulation data for a given system. Under a number of circumstances, a systematic procedure, such as MS-CG modeling, is particularly valuable. Here, we present the development of the OpenMSCG software, a modularized open-source software that provides a collection of successful and widely applied bottom-up CG methods, including Boltzmann Inversion (BI), Force-Matching (FM), Ultra-Coarse-Graining (UCG), Relative Entropy Minimization (REM), Essential Dynamics Coarse-Graining (EDCG), and Heterogeneous Elastic Network Modeling (HeteroENM). OpenMSCG is a high-performance and comprehensive toolset that can be used to derive CG models from large-scale fine-grained simulation data in file formats from common molecular dynamics (MD) software packages, such as GROMACS, LAMMPS, and NAMD. OpenMSCG is modularized in the Python programming framework, which allows users to create and customize modeling "recipes" for reproducible results, thus greatly improving the reliability, reproducibility, and sharing of bottom-up CG models and their applications.
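Of the methods the abstract lists, Boltzmann Inversion is the simplest to state: a pair potential is estimated from a radial distribution function g(r) as U(r) = -k_B T ln g(r). The sketch below illustrates only that relation. It is not the OpenMSCG API; the function name `boltzmann_invert` and the sample g(r) values are hypothetical and chosen for illustration.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption
# made for this illustration.
KB_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_invert(g_values, temperature):
    """Return U(r) = -kB*T*ln g(r) for each g value; None where g <= 0."""
    kt = KB_KCAL_PER_MOL_K * temperature
    potential = []
    for g in g_values:
        if g > 0.0:
            potential.append(-kt * math.log(g))
        else:
            # g(r) = 0 means the separation was never sampled; the
            # potential is formally infinite there, so mark it undefined.
            potential.append(None)
    return potential


# Hypothetical RDF samples, not taken from any of the cited papers.
g_of_r = [0.0, 0.5, 1.0, 2.0]
u_of_r = boltzmann_invert(g_of_r, temperature=300.0)
```

Where g(r) > 1 the resulting potential is attractive (negative) and where g(r) < 1 it is repulsive (positive). In practice this direct inversion is usually only a starting guess that iterative methods then refine.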
Affiliation(s)
- Yuxing Peng
  - NVIDIA Corporation, 2788 San Tomas Expressway, Santa Clara, California 95051, United States
- Alexander J. Pak
  - Department of Chemical and Biological Engineering, Colorado School of Mines, Golden, Colorado 80401, United States
- Patrick G. Sahrmann
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Sriramvignesh Mani
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Jaehyeok Jin
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Timothy D. Loose
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Jeriann Beiter
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
7
Wellawatte GP, Hocky GM, White AD. Neural potentials of proteins extrapolate beyond training data. J Chem Phys 2023; 159:085103. [PMID: 37642255 PMCID: PMC10474891 DOI: 10.1063/5.0147240] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/20/2023] [Accepted: 07/31/2023] [Indexed: 08/31/2023] Open
Abstract
We evaluate neural network (NN) coarse-grained (CG) force fields compared to traditional CG molecular mechanics force fields. We conclude that NN force fields are able to extrapolate and sample from unseen regions of the free energy surface when trained with limited data. Our results come from 88 NN force fields trained on different combinations of clustered free energy surfaces from four protein mapped trajectories. We used a statistical measure named total variation similarity to assess the agreement between reference free energy surfaces from mapped atomistic simulations and CG simulations from trained NN force fields. Our conclusions support the hypothesis that NN CG force fields trained with samples from one region of the proteins' free energy surface can, indeed, extrapolate to unseen regions. Additionally, the force matching error was found to only be weakly correlated with a force field's ability to reconstruct the correct free energy surface.
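The abstract names total variation similarity without defining it. One plausible reading, assumed here and possibly different from the paper's exact definition, is one minus the total variation distance between two normalized histograms: 1 - 0.5 * sum_i |p_i - q_i|. The sketch below implements that assumed form; the function name `total_variation_similarity` is hypothetical.

```python
def total_variation_similarity(p, q):
    """Assumed definition: 1 - 0.5 * sum(|p_i - q_i|) after normalizing
    both histograms to unit total. Returns a value in [0, 1]."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_total = sum(p)
    q_total = sum(q)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each histogram needs a positive total weight")
    # Total variation distance between the two normalized distributions.
    distance = 0.5 * sum(
        abs(a / p_total - b / q_total) for a, b in zip(p, q)
    )
    return 1.0 - distance
```

Under this definition, two histograms with the same shape score 1 regardless of overall scale, and two histograms with no overlapping bins score 0.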
Affiliation(s)
- Geemi P. Wellawatte
  - Department of Chemistry, University of Rochester, Rochester, New York 14627, USA
- Glen M. Hocky
  - Department of Chemistry, Simons Center for Computational Physical Chemistry, New York University, New York, New York 10003, USA
- Andrew D. White
  - Department of Chemical Engineering, University of Rochester, Rochester, New York 14627, USA