1
Miao Z, Zhang X, Zhang Y, Wang L, Meng Q. Chemistry-Informed Generative Model for Classical Dynamics Simulations. J Phys Chem Lett 2024; 15:532-539. [PMID: 38194494 DOI: 10.1021/acs.jpclett.3c03114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
In this work, a chemistry-informed generative model was proposed, leading to the chemistry-informed generative adversarial network (CI-GAN) approach. To easily build the input database for complex molecular systems, an image-input algorithm is also implemented, leading to the capability to directly recognize the molecular image. Extensive test calculations and analysis on typical examples, H + H2, OH + HO2, and H2O/TiO2(110), find that the present CI-GAN approach generates distributions of geometry and energy. Calculations on the above examples show that the present CI-GAN approach is able to generate 50%-80% meaningful results among all of the generated data with chemistry constraints. Thus, it has the potential capability to predict classical dynamics simulations as well as ab initio calculations avoiding expensive calculations. These results and the power of CI-GANs in generating ab initio energies and MD trajectories are deeply discussed.
Affiliation(s)
- Zekai Miao
  - Department of Chemistry, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China
- Xingyu Zhang
  - Department of Chemistry, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China
- Yuyuan Zhang
  - Department of Chemistry, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China
- Lemei Wang
  - Ministry-of-Education Engineering Center for Embedded System Integration, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China
- Qingyong Meng
  - Department of Chemistry, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi'an, China
2
Maier JC, Wang CI, Jackson NE. Distilling coarse-grained representations of molecular electronic structure with continuously gated message passing. J Chem Phys 2024; 160:024109. [PMID: 38193551 DOI: 10.1063/5.0179253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Accepted: 12/14/2023] [Indexed: 01/10/2024] Open
Abstract
Bottom-up methods for coarse-grained (CG) molecular modeling are critically needed to establish rigorous links between atomistic reference data and reduced molecular representations. For a target molecule, the ideal reduced CG representation is a function of both the conformational ensemble of the system and the target physical observable(s) to be reproduced at the CG resolution. However, there is an absence of algorithms for selecting CG representations of molecules from which complex properties, including molecular electronic structure, can be accurately modeled. We introduce continuously gated message passing (CGMP), a graph neural network (GNN) method for atomically decomposing molecular electronic structure sampled over conformational ensembles. CGMP integrates 3D-invariant GNNs and a novel gated message passing system to continuously reduce the atomic degrees of freedom accessible for electronic predictions, resulting in a one-shot importance ranking of atoms contributing to a target molecular property. Moreover, CGMP provides the first approach by which to quantify the degeneracy of "good" CG representations conditioned on specific prediction targets, facilitating the development of more transferable CG representations. We further show how CGMP can be used to highlight multiatom correlations, illuminating a path to developing CG electronic Hamiltonians in terms of interpretable collective variables for arbitrarily complex molecules.
Affiliation(s)
- J Charlie Maier
  - Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Chun-I Wang
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Nicholas E Jackson
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
3
Airas J, Ding X, Zhang B. Transferable Implicit Solvation via Contrastive Learning of Graph Neural Networks. ACS Central Science 2023; 9:2286-2297. [PMID: 38161379 PMCID: PMC10755853 DOI: 10.1021/acscentsci.3c01160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/26/2023] [Accepted: 10/31/2023] [Indexed: 01/03/2024]
Abstract
Implicit solvent models are essential for molecular dynamics simulations of biomolecules, striking a balance between computational efficiency and biological realism. Efforts are underway to develop accurate and transferable implicit solvent models and coarse-grained (CG) force fields in general, guided by a bottom-up approach that matches the CG energy function with the potential of mean force (PMF) defined by the finer system. However, practical challenges arise due to the lack of analytical expressions for the PMF and algorithmic limitations in parameterizing CG force fields. To address these challenges, a machine learning-based approach is proposed, utilizing graph neural networks (GNNs) to represent the solvation free energy and potential contrasting for parameter optimization. We demonstrate the effectiveness of the approach by deriving a transferable GNN implicit solvent model using 600,000 atomistic configurations of six proteins obtained from explicit solvent simulations. The GNN model provides solvation free energy estimations much more accurately than state-of-the-art implicit solvent models, reproducing configurational distributions of explicit solvent simulations. We also demonstrate the reasonable transferability of the GNN model outside of the training data. Our study offers valuable insights for deriving systematically improvable implicit solvent models and CG force fields from a bottom-up perspective.
Affiliation(s)
- Justin Airas
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
- Xinqiang Ding
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
4
Loose T, Sahrmann PG, Qu TS, Voth GA. Coarse-Graining with Equivariant Neural Networks: A Path Toward Accurate and Data-Efficient Models. J Phys Chem B 2023; 127:10564-10572. [PMID: 38033234 PMCID: PMC10726966 DOI: 10.1021/acs.jpcb.3c05928] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/01/2023] [Revised: 10/30/2023] [Accepted: 11/09/2023] [Indexed: 12/02/2023]
Abstract
Machine learning has recently entered into the mainstream of coarse-grained (CG) molecular modeling and simulation. While a variety of methods for incorporating deep learning into these models exist, many of them involve training neural networks to act directly as the CG force field. This has several benefits of which the most significant is accuracy. Neural networks can inherently incorporate multibody effects during the calculation of CG forces, and a well-trained neural network force field outperforms pairwise basis sets generated from essentially any methodology. However, this comes at a significant cost. First, these models are typically slower than pairwise force fields, even when accounting for specialized hardware, which accelerates the training and integration of such networks. The second and the focus of this paper is the need for a considerable amount of data to train such force fields. It is common to use 10s of microseconds of molecular dynamics data to train a single CG model, which approaches the point of eliminating the CG model's usefulness in the first place. As we investigate in this work, this "data-hunger" trap from neural networks for predicting molecular energies and forces can be remediated in part by incorporating equivariant convolutional operations. We demonstrate that, for CG water, networks that incorporate equivariant convolutional operations can produce functional models using data sets as small as a single frame of reference data, while networks without these operations cannot.
Affiliation(s)
- Thomas S. Qu
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
5
Sahrmann P, Loose TD, Durumeric AEP, Voth GA. Utilizing Machine Learning to Greatly Expand the Range and Accuracy of Bottom-Up Coarse-Grained Models through Virtual Particles. J Chem Theory Comput 2023; 19:4402-4413. [PMID: 36802592 PMCID: PMC10373655 DOI: 10.1021/acs.jctc.2c01183] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/22/2022] [Indexed: 02/22/2023]
Abstract
Coarse-grained (CG) models parametrized using atomistic reference data, i.e., "bottom up" CG models, have proven useful in the study of biomolecules and other soft matter. However, the construction of highly accurate, low resolution CG models of biomolecules remains challenging. We demonstrate in this work how virtual particles, CG sites with no atomistic correspondence, can be incorporated into CG models within the context of relative entropy minimization (REM) as latent variables. The methodology presented, variational derivative relative entropy minimization (VD-REM), enables optimization of virtual particle interactions through a gradient descent algorithm aided by machine learning. We apply this methodology to the challenging case of a solvent-free CG model of a 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC) lipid bilayer and demonstrate that introduction of virtual particles captures solvent-mediated behavior and higher-order correlations which REM alone cannot capture in a more standard CG model based only on the mapping of collections of atoms to the CG sites.
Affiliation(s)
- Patrick G. Sahrmann
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Timothy D. Loose
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Aleksander E. P. Durumeric
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
6
Durumeric AEP, Voth GA. Using classifiers to understand coarse-grained models and their fidelity with the underlying all-atom systems. J Chem Phys 2023; 158:234101. [PMID: 37318166 DOI: 10.1063/5.0146812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Accepted: 05/26/2023] [Indexed: 06/16/2023] Open
Abstract
Bottom-up coarse-grained (CG) molecular dynamics models are parameterized using complex effective Hamiltonians. These models are typically optimized to approximate high dimensional data from atomistic simulations. However, human validation of these models is often limited to low dimensional statistics that do not necessarily differentiate between the CG model and said atomistic simulations. We propose that classification can be used to variationally estimate high dimensional error and that explainable machine learning can help convey this information to scientists. This approach is demonstrated using Shapley additive explanations and two CG protein models. This framework may also be valuable for ascertaining whether allosteric effects at the atomistic level are accurately propagated to a CG model.
Affiliation(s)
- Aleksander E P Durumeric
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, 5735 S. Ellis Ave., Chicago, Illinois 60637, USA
- Gregory A Voth
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, 5735 S. Ellis Ave., Chicago, Illinois 60637, USA
7
Bryer AJ, Rey JS, Perilla JR. Performance efficient macromolecular mechanics via sub-nanometer shape based coarse graining. Nat Commun 2023; 14:2014. [PMID: 37037809 PMCID: PMC10086035 DOI: 10.1038/s41467-023-37801-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/26/2022] [Accepted: 03/30/2023] [Indexed: 04/12/2023] Open
Abstract
Dimensionality reduction via coarse grain modeling is a valuable tool in biomolecular research. For large assemblies, ultra coarse models are often knowledge-based, relying on a priori information to parameterize models thus hindering general predictive capability. Here, we present substantial advances to the shape based coarse graining (SBCG) method, which we refer to as SBCG2. SBCG2 utilizes a revitalized formulation of the topology representing network which makes high-granularity modeling possible, preserving atomistic details that maintain assembly characteristics. Further, we present a method of granularity selection based on charge density Fourier Shell Correlation and have additionally developed a refinement method to optimize, adjust and validate high-granularity models. We demonstrate our approach with the conical HIV-1 capsid and heteromultimeric cofilin-2 bound actin filaments. Our approach is available in the Visual Molecular Dynamics (VMD) software suite, and employs a CHARMM-compatible Hamiltonian that enables high-performance simulation in the GPU-resident NAMD3 molecular dynamics engine.
Affiliation(s)
- Alexander J Bryer
  - Department of Chemistry and Biochemistry, University of Delaware, Newark, DE, 19716, USA
- Juan S Rey
  - Department of Chemistry and Biochemistry, University of Delaware, Newark, DE, 19716, USA
- Juan R Perilla
  - Department of Chemistry and Biochemistry, University of Delaware, Newark, DE, 19716, USA
8
Chennakesavalu S, Toomer DJ, Rotskoff GM. Ensuring thermodynamic consistency with invertible coarse-graining. J Chem Phys 2023; 158:124126. [PMID: 37003724 DOI: 10.1063/5.0141888] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 04/03/2023] Open
Abstract
Coarse-grained models are a core computational tool in theoretical chemistry and biophysics. A judicious choice of a coarse-grained model can yield physical insights by isolating the essential degrees of freedom that dictate the thermodynamic properties of a complex, condensed-phase system. The reduced complexity of the model typically leads to lower computational costs and more efficient sampling compared with atomistic models. Designing "good" coarse-grained models is an art. Generally, the mapping from fine-grained configurations to coarse-grained configurations itself is not optimized in any way; instead, the energy function associated with the mapped configurations is. In this work, we explore the consequences of optimizing the coarse-grained representation alongside its potential energy function. We use a graph machine learning framework to embed atomic configurations into a low-dimensional space to produce efficient representations of the original molecular system. Because the representation we obtain is no longer directly interpretable as a real-space representation of the atomic coordinates, we also introduce an inversion process and an associated thermodynamic consistency relation that allows us to rigorously sample fine-grained configurations conditioned on the coarse-grained sampling. We show that this technique is robust, recovering the first two moments of the distribution of several observables in proteins such as chignolin and alanine dipeptide.
Affiliation(s)
- David J Toomer
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
- Grant M Rotskoff
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
9
Ricci E, Vergadou N. Integrating Machine Learning in the Coarse-Grained Molecular Simulation of Polymers. J Phys Chem B 2023; 127:2302-2322. [PMID: 36888553 DOI: 10.1021/acs.jpcb.2c06354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2023]
Abstract
Machine learning (ML) is having an increasing impact on the physical sciences, engineering, and technology and its integration into molecular simulation frameworks holds great potential to expand their scope of applicability to complex materials and facilitate fundamental knowledge and reliable property predictions, contributing to the development of efficient materials design routes. The application of ML in materials informatics in general, and polymer informatics in particular, has led to interesting results, however great untapped potential lies in the integration of ML techniques into the multiscale molecular simulation methods for the study of macromolecular systems, specifically in the context of Coarse Grained (CG) simulations. In this Perspective, we aim at presenting the pioneering recent research efforts in this direction and discussing how these new ML-based techniques can contribute to critical aspects of the development of multiscale molecular simulation methods for bulk complex chemical systems, especially polymers. Prerequisites for the implementation of such ML-integrated methods and open challenges that need to be met toward the development of general systematic ML-based coarse graining schemes for polymers are discussed.
Affiliation(s)
- Eleonora Ricci
  - Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
  - Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Niki Vergadou
  - Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
10
Trifan A, Gorgun D, Salim M, Li Z, Brace A, Zvyagin M, Ma H, Clyde A, Clark D, Hardy DJ, Burnley T, Huang L, McCalpin J, Emani M, Yoo H, Yin J, Tsaris A, Subbiah V, Raza T, Liu J, Trebesch N, Wells G, Mysore V, Gibbs T, Phillips J, Chennubhotla SC, Foster I, Stevens R, Anandkumar A, Vishwanath V, Stone JE, Tajkhorshid E, Harris SA, Ramanathan A. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications 2022; 36:603-623. [PMID: 38464362 PMCID: PMC10923581 DOI: 10.1177/10943420221113513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) replication transcription complex (RTC) is a multi-domain protein responsible for replicating and transcribing the viral mRNA inside a human cell. Attacking RTC function with pharmaceutical compounds is a pathway to treating COVID-19. Conventional tools, e.g., cryo-electron microscopy and all-atom molecular dynamics (AAMD), do not provide sufficiently high resolution or timescale to capture important dynamics of this molecular machine. Consequently, we develop an innovative workflow that bridges the gap between these resolutions, using mesoscale fluctuating finite element analysis (FFEA) continuum simulations and a hierarchy of AI-methods that continually learn and infer features for maintaining consistency between AAMD and FFEA simulations. We leverage a multi-site distributed workflow manager to orchestrate AI, FFEA, and AAMD jobs, providing optimal resource utilization across HPC centers. Our study provides unprecedented access to study the SARS-CoV-2 RTC machinery, while providing general capability for AI-enabled multi-resolution simulations at scale.
Affiliation(s)
- Anda Trifan
  - Argonne National Laboratory
  - University of Illinois Urbana-Champaign
- Defne Gorgun
  - Argonne National Laboratory
  - University of Illinois Urbana-Champaign
- Austin Clyde
  - Argonne National Laboratory
  - University of Chicago
- Ian Foster
  - Argonne National Laboratory
  - University of Chicago
- Rick Stevens
  - Argonne National Laboratory
  - University of Chicago
11
Jin J, Pak AJ, Durumeric AEP, Loose TD, Voth GA. Bottom-up Coarse-Graining: Principles and Perspectives. J Chem Theory Comput 2022; 18:5759-5791. [PMID: 36070494 PMCID: PMC9558379 DOI: 10.1021/acs.jctc.2c00643] [Citation(s) in RCA: 72] [Impact Index Per Article: 36.0] [Received: 06/20/2022] [Indexed: 01/14/2023]
Abstract
Large-scale computational molecular models provide scientists a means to investigate the effect of microscopic details on emergent mesoscopic behavior. Elucidating the relationship between variations on the molecular scale and macroscopic observable properties facilitates an understanding of the molecular interactions driving the properties of real world materials and complex systems (e.g., those found in biology, chemistry, and materials science). As a result, discovering an explicit, systematic connection between microscopic nature and emergent mesoscopic behavior is a fundamental goal for this type of investigation. The molecular forces critical to driving the behavior of complex heterogeneous systems are often unclear. More problematically, simulations of representative model systems are often prohibitively expensive from both spatial and temporal perspectives, impeding straightforward investigations over possible hypotheses characterizing molecular behavior. While the reduction in resolution of a study, such as moving from an atomistic simulation to that of the resolution of large coarse-grained (CG) groups of atoms, can partially ameliorate the cost of individual simulations, the relationship between the proposed microscopic details and this intermediate resolution is nontrivial and presents new obstacles to study. Small portions of these complex systems can be realistically simulated. Alone, these smaller simulations likely do not provide insight into collectively emergent behavior. However, by proposing that the driving forces in both smaller and larger systems (containing many related copies of the smaller system) have an explicit connection, systematic bottom-up CG techniques can be used to transfer CG hypotheses discovered using a smaller scale system to a larger system of primary interest. 
The proposed connection between different CG systems is prescribed by (i) the CG representation (mapping) and (ii) the functional form and parameters used to represent the CG energetics, which approximate potentials of mean force (PMFs). As a result, the design of CG methods that facilitate a variety of physically relevant representations, approximations, and force fields is critical to moving the frontier of systematic CG forward. Crucially, the proposed connection between the system used for parametrization and the system of interest is orthogonal to the optimization used to approximate the potential of mean force present in all systematic CG methods. The empirical efficacy of machine learning techniques on a variety of tasks provides strong motivation to consider these approaches for approximating the PMF and analyzing these approximations.
Affiliation(s)
- Jaehyeok Jin
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
- Alexander J. Pak
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
- Aleksander E. P. Durumeric
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
- Timothy D. Loose
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
12
Mohajerani F, Tyukodi B, Schlicksup CJ, Hadden-Perilla JA, Zlotnick A, Hagan MF. Multiscale Modeling of Hepatitis B Virus Capsid Assembly and Its Dimorphism. ACS Nano 2022; 16:13845-13859. [PMID: 36054910 PMCID: PMC10273259 DOI: 10.1021/acsnano.2c02119] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 06/15/2023]
Abstract
Hepatitis B virus (HBV) is an endemic, chronic virus that leads to 800000 deaths per year. Central to the HBV lifecycle, the viral core has a protein capsid assembled from many copies of a single protein. The capsid protein adopts different (quasi-equivalent) conformations to form icosahedral capsids containing 180 or 240 proteins: T = 3 or T = 4, respectively, in Caspar-Klug nomenclature. HBV capsid assembly has become an important target for recently developed antivirals; nonetheless, the assembly pathways and mechanisms that control HBV dimorphism remain unclear. We describe computer simulations of the HBV assembly, using a coarse-grained model that has parameters learned from all-atom molecular dynamics simulations of a complete HBV capsid and yet is computationally tractable. Dynamical simulations with the resulting model reproduce experimental observations of HBV assembly pathways and products. By constructing Markov state models and employing transition path theory, we identify pathways leading to T = 3, T = 4, and other experimentally observed capsid morphologies. The analysis shows that capsid polymorphism is promoted by the low HBV capsid bending modulus, where the key factors controlling polymorphism are the conformational energy landscape and protein-protein binding affinities.
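As a quick check on the capsid sizes quoted in this abstract, the Caspar-Klug triangulation number is T = h² + hk + k² for non-negative integers h and k, and an icosahedral capsid built this way contains 60T protein subunits. The sketch below is illustrative only and is not taken from the cited paper; the function names `triangulation_number` and `capsid_protein_count` are hypothetical.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def capsid_protein_count(t: int) -> int:
    """Number of protein subunits in an icosahedral capsid: 60 per unit of T."""
    return 60 * t


# The two HBV capsid morphologies named in the abstract.
for h, k in [(1, 1), (2, 0)]:
    t = triangulation_number(h, k)
    print(f"(h, k) = ({h}, {k}): T = {t}, proteins = {capsid_protein_count(t)}")
```

With (h, k) = (1, 1) this gives T = 3 and 180 proteins, and with (2, 0) it gives T = 4 and 240 proteins, matching the counts stated in the abstract.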
Affiliation(s)
- Farzaneh Mohajerani
  - Martin A. Fisher School of Physics, Brandeis University, Waltham, Massachusetts 02453, United States
- Botond Tyukodi
  - Martin A. Fisher School of Physics, Brandeis University, Waltham, Massachusetts 02453, United States
  - Department of Physics, Babeş-Bolyai University, 400084 Cluj-Napoca, Romania
- Christopher J Schlicksup
  - Molecular and Cellular Biochemistry Department, Indiana University, Bloomington, Indiana 47405, United States
- Jodi A Hadden-Perilla
  - Department of Chemistry & Biochemistry, University of Delaware, Newark, Delaware 19716, United States
- Adam Zlotnick
  - Molecular and Cellular Biochemistry Department, Indiana University, Bloomington, Indiana 47405, United States
- Michael F Hagan
  - Martin A. Fisher School of Physics, Brandeis University, Waltham, Massachusetts 02453, United States
13
Vlachas PR, Zavadlav J, Praprotnik M, Koumoutsakos P. Accelerated Simulations of Molecular Systems through Learning of Effective Dynamics. J Chem Theory Comput 2021; 18:538-549. [PMID: 34890204 DOI: 10.1021/acs.jctc.1c00809] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022]
Abstract
Simulations are vital for understanding and predicting the evolution of complex molecular systems. However, despite advances in algorithms and special purpose hardware, accessing the time scales necessary to capture the structural evolution of biomolecules remains a daunting task. In this work, we present a novel framework to advance simulation time scales by up to 3 orders of magnitude by learning the effective dynamics (LED) of molecular systems. LED augments the equation-free methodology by employing a probabilistic mapping between coarse and fine scales using mixture density network (MDN) autoencoders and evolves the non-Markovian latent dynamics using long short-term memory MDNs. We demonstrate the effectiveness of LED in the Müller-Brown potential, the Trp cage protein, and the alanine dipeptide. LED identifies explainable reduced-order representations, i.e., collective variables, and can generate, at any instant, all-atom molecular trajectories consistent with the collective variables. We believe that the proposed framework provides a dramatic increase to simulation capabilities and opens new horizons for the effective modeling of complex molecular systems.
Affiliation(s)
- Pantelis R Vlachas
  - Computational Science and Engineering Laboratory, ETH Zurich, CH-8092, Switzerland
- Julija Zavadlav
  - Professorship of Multiscale Modeling of Fluid Materials, TUM School of Engineering and Design, Technical University of Munich, 85748 Garching bei München, Germany
  - Munich Data Science Institute, Technical University of Munich, 85748 Munich, Germany
- Matej Praprotnik
  - Laboratory for Molecular Modeling, National Institute of Chemistry, SI-1001 Ljubljana, Slovenia
  - Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, SI-1000 Ljubljana, Slovenia
- Petros Koumoutsakos
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
14
Liebchen B, Mukhopadhyay AK. Interactions in active colloids. Journal of Physics. Condensed Matter: An Institute of Physics Journal 2021; 34:083002. [PMID: 34788232 DOI: 10.1088/1361-648x/ac3a86] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/30/2021] [Accepted: 11/16/2021] [Indexed: 06/13/2023]
Abstract
The past two decades have seen a remarkable progress in the development of synthetic colloidal agents which are capable of creating directed motion in an unbiased environment at the microscale. These self-propelling particles are often praised for their enormous potential to self-organize into dynamic nonequilibrium structures such as living clusters, synchronized super-rotor structures or self-propelling molecules featuring a complexity which is rarely found outside of the living world. However, the precise mechanisms underlying the formation and dynamics of many of these structures are still barely understood, which is likely to hinge on the gaps in our understanding of how active colloids interact. In particular, besides showing comparatively short-ranged interactions which are well known from passive colloids (Van der Waals, electrostatic etc), active colloids show novel hydrodynamic interactions as well as phoretic and substrate-mediated 'osmotic' cross-interactions which hinge on the action of the phoretic field gradients which are induced by the colloids on other colloids in the system. The present article discusses the complexity and the intriguing properties of these interactions which in general are long-ranged, non-instantaneous, non-pairwise and non-reciprocal and which may serve as key ingredients for the design of future nonequilibrium colloidal materials. Besides providing a brief overview on the state of the art of our understanding of these interactions a key aim of this review is to emphasize open key questions and corresponding open challenges.
Affiliation(s)
- Benno Liebchen
- Institute for Condensed Matter Physics, Technische Universität Darmstadt, 64289 Darmstadt, Germany
- Aritra K Mukhopadhyay
- Institute for Condensed Matter Physics, Technische Universität Darmstadt, 64289 Darmstadt, Germany
|
15
|
Potter T, Barrett EL, Miller MA. Automated Coarse-Grained Mapping Algorithm for the Martini Force Field and Benchmarks for Membrane-Water Partitioning. J Chem Theory Comput 2021; 17:5777-5791. [PMID: 34472843 PMCID: PMC8444346 DOI: 10.1021/acs.jctc.1c00322] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/01/2021] [Indexed: 01/08/2023]
Abstract
With a view to high-throughput simulations, we present an automated system for mapping and parameterizing organic molecules for use with the coarse-grained Martini force field. The method scales to larger molecules and a broader chemical space than existing schemes. The core of the mapping process is a graph-based analysis of the molecule's bonding network, which has the advantages of being fast, general, and preserving symmetry. The parameterization process pays special attention to coarse-grained beads in aromatic rings. It also includes a method for building efficient and stable frameworks of constraints for molecules with structural rigidity. The performance of the method is tested on a diverse set of 87 neutral organic molecules and the ability of the resulting models to capture octanol-water and membrane-water partition coefficients. In the latter case, we introduce an adaptive method for extracting partition coefficients from free-energy profiles to take into account the interfacial region of the membrane. We also use the models to probe the response of membrane-water partitioning to the cholesterol content of the membrane.
Affiliation(s)
- Thomas D. Potter
- Department of Chemistry, Durham University, South Road, Durham DH1 3LE, United Kingdom
- Elin L. Barrett
- Unilever Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Mark A. Miller
- Department of Chemistry, Durham University, South Road, Durham DH1 3LE, United Kingdom
|
16
|
Cao X, Tian P. "Dividing and Conquering" and "Caching" in Molecular Modeling. Int J Mol Sci 2021; 22:5053. [PMID: 34068835 PMCID: PMC8126232 DOI: 10.3390/ijms22095053] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Revised: 04/26/2021] [Accepted: 04/27/2021] [Indexed: 11/17/2022] Open
Abstract
Molecular modeling is widely utilized in subjects including but not limited to physics, chemistry, biology, materials science and engineering. Impressive progress has been made in development of theories, algorithms and software packages. To divide and conquer, and to cache intermediate results have been long standing principles in development of algorithms. Not surprisingly, most important methodological advancements in more than half century of molecular modeling are various implementations of these two fundamental principles. In the mainstream classical computational molecular science, tremendous efforts have been invested on two lines of algorithm development. The first is coarse graining, which is to represent multiple basic particles in higher resolution modeling as a single larger and softer particle in lower resolution counterpart, with resulting force fields of partial transferability at the expense of some information loss. The second is enhanced sampling, which realizes "dividing and conquering" and/or "caching" in configurational space with focus either on reaction coordinates and collective variables as in metadynamics and related algorithms, or on the transition matrix and state discretization as in Markov state models. For this line of algorithms, spatial resolution is maintained but results are not transferable. Deep learning has been utilized to realize more efficient and accurate ways of "dividing and conquering" and "caching" along these two lines of algorithmic research. We proposed and demonstrated the local free energy landscape approach, a new framework for classical computational molecular science. This framework is based on a third class of algorithm that facilitates molecular modeling through partially transferable in resolution "caching" of distributions for local clusters of molecular degrees of freedom. 
Differences, connections and potential interactions among these three algorithmic directions are discussed, with the hope to stimulate development of more elegant, efficient and reliable formulations and algorithms for "dividing and conquering" and "caching" in complex molecular systems.
Affiliation(s)
- Xiaoyong Cao
- School of Life Sciences, Jilin University, Changchun 130012, China
- Pu Tian
- School of Life Sciences, Jilin University, Changchun 130012, China
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
|
17
|
Ye H, Xian W, Li Y. Machine Learning of Coarse-Grained Models for Organic Molecules and Polymers: Progress, Opportunities, and Challenges. ACS Omega 2021; 6:1758-1772. [PMID: 33521417 PMCID: PMC7841771 DOI: 10.1021/acsomega.0c05321] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/31/2020] [Accepted: 01/04/2021] [Indexed: 05/02/2023]
Abstract
Machine learning (ML) has emerged as one of the most powerful tools transforming all areas of science and engineering. The nature of molecular dynamics (MD) simulations, complex and time-consuming calculations, makes them particularly suitable for ML research. This review article focuses on recent advancements in developing efficient and accurate coarse-grained (CG) models using various ML methods, in terms of regulating the coarse-graining process, constructing adequate descriptors/features, generating representative training data sets, and optimization of the loss function. Two classes of the CG models are introduced: bottom-up and top-down CG methods. To illustrate these methods and demonstrate the open methodological questions, we survey several important principles in constructing CG models and how these are incorporated into ML methods and improved with specific learning techniques. Finally, we discuss some key aspects of developing machine-learned CG models with high accuracy and efficiency. Besides, we describe how these aspects are tackled in state-of-the-art methods and which remain to be addressed in the near future. We expect that these machine-learned CG models can address thermodynamic consistent, transferable, and representative issues in classical CG models.
Affiliation(s)
- Huilin Ye
- Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Weikang Xian
- Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, Connecticut 06269, United States
- Phone: +1 860 4867110. Fax: +1 860 4865088
|
18
|
Roel-Touris J, Bonvin AM. Coarse-grained (hybrid) integrative modeling of biomolecular interactions. Comput Struct Biotechnol J 2020; 18:1182-1190. [PMID: 32514329 PMCID: PMC7264466 DOI: 10.1016/j.csbj.2020.05.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/27/2020] [Revised: 04/23/2020] [Accepted: 05/06/2020] [Indexed: 12/23/2022] Open
Abstract
The computational modeling field has vastly evolved over the past decades. The early developments of simplified protein systems represented a stepping stone towards establishing more efficient approaches to sample intricated conformational landscapes. Downscaling the level of resolution of biomolecules to coarser representations allows for studying protein structure, dynamics and interactions that are not accessible by classical atomistic approaches. The combination of different resolutions, namely hybrid modeling, has also been proved as an alternative when mixed levels of details are required. In this review, we provide an overview of coarse-grained/hybrid models focusing on their applicability in the modeling of biomolecular interactions. We give a detailed list of ready-to-use modeling software for studying biomolecular interactions allowing various levels of coarse-graining and provide examples of complexes determined by integrative coarse-grained/hybrid approaches in combination with experimental information.
|
19
|
Abstract
Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for an ML revolution and have already been profoundly affected by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, on coarse-grained molecular dynamics, on the extraction of free energy surfaces and kinetics, and on generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into ML structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.
Affiliation(s)
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany; Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany; Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Alexandre Tkatchenko
- Physics and Materials Science Research Unit, University of Luxembourg, 1511 Luxembourg, Luxembourg
- Klaus-Robert Müller
- Department of Computer Science, Technical University Berlin, 10587 Berlin, Germany; Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany; Department of Brain and Cognitive Engineering, Korea University, Seoul 136-713, South Korea
- Cecilia Clementi
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany; Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA; Department of Physics, Rice University, Houston, Texas 77005, USA
|