1
Yu Z, Jackson NE. Chemically Transferable Electronic Coarse Graining for Polythiophenes. J Chem Theory Comput 2024. [PMID: 39370933 DOI: 10.1021/acs.jctc.4c00804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2024]
Abstract
Recent advances in machine-learning-based electronic coarse graining (ECG) methods have demonstrated the potential to enable electronic predictions in soft materials at mesoscopic length scales. However, previous ECG models have yet to confront the issue of chemical transferability. In this study, we develop chemically transferable ECG models for polythiophenes using graph neural networks. Our models are trained on a data set that samples over the conformational space of random polythiophene sequences generated with 15 different monomer chemistries and three different degrees of polymerization. We systematically explore the impact of coarse-grained representation on ECG accuracy, highlighting the significance of preserving the C-β coordinates in thiophene. We also find that integrating unique polymer sequences into training enhances the model performance more efficiently than augmenting conformational sampling for sequences already in the training data set. Moreover, our ECG models, developed initially for one property and one level of quantum chemical theory, can be efficiently transferred to related properties and higher levels of theory with minimal additional data. The chemically transferable ECG model introduced in this work will serve as a foundation model for new classes of chemically transferable ECG predictions across chemical space.
Affiliation(s)
- Zheng Yu
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Nicholas E Jackson
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
2
Kidder KM, Noid WG. Analysis of mapping atomic models to coarse-grained resolution. J Chem Phys 2024; 161:134113. [PMID: 39365018 DOI: 10.1063/5.0220989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Accepted: 09/10/2024] [Indexed: 10/05/2024] Open
Abstract
Low-resolution coarse-grained (CG) models provide significant computational and conceptual advantages for simulating soft materials. However, the properties of CG models depend quite sensitively upon the mapping, M, that maps each atomic configuration, r, to a CG configuration, R. In particular, M determines how the configurational information of the atomic model is partitioned between the mapped ensemble of CG configurations and the lost ensemble of atomic configurations that map to each R. In this work, we investigate how the mapping partitions the atomic configuration space into CG and intra-site components. We demonstrate that the corresponding coordinate transformation introduces a nontrivial Jacobian factor. This Jacobian factor defines a labeling entropy that corresponds to the uncertainty in the atoms that are associated with each CG site. Consequently, the labeling entropy effectively transfers configurational information from the lost ensemble into the mapped ensemble. Moreover, our analysis highlights the possibility of resonant mappings that separate the atomic potential into CG and intra-site contributions. We numerically illustrate these considerations with a Gaussian network model for the equilibrium fluctuations of actin. We demonstrate that the spectral quality, Q, provides a simple metric for identifying high quality representations for actin. Conversely, we find that neither maximizing nor minimizing the information content of the mapped ensemble results in high quality representations. However, if one accounts for the labeling uncertainty, Q(M) correlates quite well with the adjusted configurational information loss, Îmap(M), that results from the mapping.
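As a minimal illustration of what a mapping M from an atomic configuration r to a CG configuration R can look like, the sketch below assumes a linear center-of-mass mapping in which every atom is assigned to exactly one CG site. The function name `map_to_cg`, the site assignment, and the toy coordinates are hypothetical and are not taken from the paper, which analyzes mappings far more generally.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def map_to_cg(coords: Sequence[Vec3], masses: Sequence[float],
              site_of_atom: Sequence[int]) -> List[Vec3]:
    """Map an atomic configuration r to a CG configuration R.

    Assumption (not from the paper): each CG site sits at the center of
    mass of the atoms assigned to it, and every site has at least one atom.
    """
    n_sites = max(site_of_atom) + 1
    sums = [[0.0, 0.0, 0.0] for _ in range(n_sites)]
    site_mass = [0.0] * n_sites
    for (x, y, z), m, s in zip(coords, masses, site_of_atom):
        # Accumulate mass-weighted coordinates per CG site.
        sums[s][0] += m * x
        sums[s][1] += m * y
        sums[s][2] += m * z
        site_mass[s] += m
    return [(sx / ms, sy / ms, sz / ms)
            for (sx, sy, sz), ms in zip(sums, site_mass)]


# Toy example: four atoms mapped onto two CG sites.
r = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 4.0, 2.0)]
m = [1.0, 1.0, 3.0, 1.0]
R = map_to_cg(r, m, site_of_atom=[0, 0, 1, 1])
```

Many distinct atomic configurations r map to the same R under such a mapping; that set is the "lost ensemble" the abstract refers to.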
Affiliation(s)
- Katherine M Kidder
  - Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- W G Noid
  - Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
3
Bennett WFD, Bernardi A, Ozturk TN, Ingólfsson HI, Fox SJ, Sun D, Maupin CM. ezAlign: A Tool for Converting Coarse-Grained Molecular Dynamics Structures to Atomistic Resolution for Multiscale Modeling. Molecules 2024; 29:3557. [PMID: 39124960 PMCID: PMC11314399 DOI: 10.3390/molecules29153557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 07/22/2024] [Accepted: 07/25/2024] [Indexed: 08/12/2024] Open
Abstract
Soft condensed matter is challenging to study due to the vast time and length scales that are necessary to accurately represent complex systems and capture their underlying physics. Multiscale simulations are necessary to study processes that have disparate time and/or length scales, which abound throughout biology and other complex systems. Herein we present ezAlign, an open-source software for converting coarse-grained molecular dynamics structures to atomistic representation, allowing multiscale modeling of biomolecular systems. The ezAlign v1.1 software package is publicly available for download at github.com/LLNL/ezAlign. Its underlying methodology is based on a simple alignment of an atomistic template molecule, followed by position-restraint energy minimization, which forces the atomistic molecule to adopt a conformation consistent with the coarse-grained molecule. The molecules are then combined, solvated, minimized, and equilibrated with position restraints. Validation of the process was conducted on a pure POPC membrane and compared with other popular methods to construct atomistic membranes. Additional examples, including surfactant self-assembly, membrane proteins, and more complex bacterial and human plasma membrane models, are also presented. By providing these examples, parameter files, code, and an easy-to-follow recipe to add new molecules, this work will aid future multiscale modeling efforts.
Affiliation(s)
- W. F. Drew Bennett
  - Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Austen Bernardi
  - Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Tugba Nur Ozturk
  - Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Helgi I. Ingólfsson
  - Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Delin Sun
  - Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- C. Mark Maupin
  - Procter and Gamble, Mason, OH 45040, USA
  - Pacific Northwest National Laboratory, Richland, WA 99352, USA
4
Farré-Gil D, Arcon JP, Laughton CA, Orozco M. CGeNArate: a sequence-dependent coarse-grained model of DNA for accurate atomistic MD simulations of kb-long duplexes. Nucleic Acids Res 2024; 52:6791-6801. [PMID: 38813824 PMCID: PMC11229373 DOI: 10.1093/nar/gkae444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 05/01/2024] [Accepted: 05/14/2024] [Indexed: 05/31/2024] Open
Abstract
We present CGeNArate, a new model for molecular dynamics simulations of very long segments of B-DNA in the context of biotechnological or chromatin studies. The developed method uses a coarse-grained Hamiltonian with trajectories that are back-mapped to the atomistic resolution level with extreme accuracy by means of Machine Learning Approaches. The method is sequence-dependent and reproduces very well not only local, but also global physical properties of DNA. The efficiency of the method allows us to recover with a reduced computational effort high-quality atomic-resolution ensembles of segments containing many kilobases of DNA, entering into the gene range or even the entire DNA of certain cellular organelles.
Affiliation(s)
- David Farré-Gil
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, E-08028 Barcelona, Spain
- Juan Pablo Arcon
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, E-08028 Barcelona, Spain
- Charles A Laughton
  - School of Pharmacy and Biodiscovery Institute, University of Nottingham, University Park, Nottingham NG7 2RD, UK
- Modesto Orozco
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, E-08028 Barcelona, Spain
  - Department of Biochemistry and Biomedicine, University of Barcelona, E-08028 Barcelona, Spain
5
Ozturk TN, König M, Carpenter TS, Pedersen KB, Wassenaar TA, Ingólfsson HI, Marrink SJ. Building complex membranes with Martini 3. Methods Enzymol 2024; 701:237-285. [PMID: 39025573 DOI: 10.1016/bs.mie.2024.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
The Martini model is a popular force field for coarse-grained simulations. Membranes have always been at the center of its development, with the latest version, Martini 3, showing great promise in capturing more and more realistic behavior. In this chapter we provide a step-by-step tutorial on how to construct starting configurations, run initial simulations and perform dedicated analysis for membrane-based systems of increasing complexity, including leaflet asymmetry, curvature gradients and embedding of membrane proteins.
Affiliation(s)
- Tugba Nur Ozturk
  - Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
- Melanie König
  - Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
- Timothy S Carpenter
  - Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
- Tsjerk A Wassenaar
  - Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands; Institute for Life Science and Technology, Hanze University of Applied Sciences, Groningen, The Netherlands
- Helgi I Ingólfsson
  - Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
- Siewert J Marrink
  - Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
6
Kharche S, Yadav M, Hande V, Prakash S, Sengupta D. Improved Protein Dynamics and Hydration in the Martini3 Coarse-Grain Model. J Chem Inf Model 2024; 64:837-850. [PMID: 38291973 DOI: 10.1021/acs.jcim.3c00802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
The Martini coarse-grain force-field has emerged as an important framework to probe cellular processes at experimentally relevant time- and length-scales. However, the recently developed version, the Martini3 force-field with the implemented Go̅ model (Martini3Go̅), as well as previous variants of the Martini model have not been benchmarked and rigorously tested for globular proteins. In this study, we consider three globular proteins, ubiquitin, lysozyme, and cofilin, and compare protein dynamics and hydration with observables from experiments and all-atom simulations. We show that the Martini3Go̅ model is able to accurately model the structural and dynamic features of small globular proteins. Overall, the structural integrity of the proteins is maintained, as validated by contact maps, radii of gyration (Rg), and SAXS profiles. The chemical shifts predicted from the ensemble sampled in the simulations are consistent with the experimental data. Further, a good match is observed in the protein-water interaction energetics, and the hydration levels of the residues are similar to atomistic simulations. However, the protein-water interaction dynamics is not accurately represented and appears to depend on the protein structural complexity, residue specificity, and water dynamics. Our work is a step toward testing and assessing the Martini3Go̅ model and provides insights into future efforts to refine Martini models with improved solvation effects and better correspondence to the underlying all-atom systems.
Affiliation(s)
- Shalmali Kharche
  - CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune 411008, India
- Manjul Yadav
  - CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune 411008, India
- Vrushali Hande
  - CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune 411008, India
- Shikha Prakash
  - CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune 411008, India
- Durba Sengupta
  - CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune 411008, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
7
Kidder KM, Shell MS, Noid WG. Surveying the energy landscape of coarse-grained mappings. J Chem Phys 2024; 160:054105. [PMID: 38310476 DOI: 10.1063/5.0182524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 12/28/2023] [Indexed: 02/05/2024] Open
Abstract
Simulations of soft materials often adopt low-resolution coarse-grained (CG) models. However, the CG representation is not unique and its impact upon simulated properties is poorly understood. In this work, we investigate the space of CG representations for ubiquitin, which is a typical globular protein with 72 amino acids. We employ Monte Carlo methods to ergodically sample this space and to characterize its landscape. By adopting the Gaussian network model as an analytically tractable atomistic model for equilibrium fluctuations, we exactly assess the intrinsic quality of each CG representation without introducing any approximations in sampling configurations or in modeling interactions. We focus on two metrics, the spectral quality and the information content, that quantify the extent to which the CG representation preserves low-frequency, large-amplitude motions and configurational information, respectively. The spectral quality and information content are weakly correlated among high-resolution representations but become strongly anticorrelated among low-resolution representations. Representations with maximal spectral quality appear consistent with physical intuition, while low-resolution representations with maximal information content do not. Interestingly, quenching studies indicate that the energy landscape of mapping space is very smooth and highly connected. Moreover, our study suggests a critical resolution below which a "phase transition" qualitatively distinguishes good and bad representations.
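For readers unfamiliar with the Gaussian network model mentioned here, the sketch below builds its connectivity (Kirchhoff) matrix under the common textbook assumption that any two nodes within a distance cutoff are joined by identical harmonic springs. The function name `kirchhoff_matrix`, the cutoff value, and the toy coordinates are hypothetical; the paper's own model construction may differ in detail.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def kirchhoff_matrix(coords: Sequence[Vec3], cutoff: float) -> List[List[float]]:
    """Build the Gaussian network model connectivity matrix.

    Off-diagonal entries are -1 for node pairs within `cutoff` and 0
    otherwise; each diagonal entry counts the contacts of that node,
    so every row sums to zero.
    """
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


# Toy example: three nodes in a chain plus one isolated node.
nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
gamma = kirchhoff_matrix(nodes, cutoff=1.5)
```

In the standard formulation, equilibrium fluctuations follow from the pseudo-inverse of this matrix, which is what makes the model analytically tractable.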
Affiliation(s)
- Katherine M Kidder
  - Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- M Scott Shell
  - Department of Chemical Engineering, University of California, Santa Barbara, California 93106, USA
- W G Noid
  - Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
8
Maier JC, Wang CI, Jackson NE. Distilling coarse-grained representations of molecular electronic structure with continuously gated message passing. J Chem Phys 2024; 160:024109. [PMID: 38193551 DOI: 10.1063/5.0179253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Accepted: 12/14/2023] [Indexed: 01/10/2024] Open
Abstract
Bottom-up methods for coarse-grained (CG) molecular modeling are critically needed to establish rigorous links between atomistic reference data and reduced molecular representations. For a target molecule, the ideal reduced CG representation is a function of both the conformational ensemble of the system and the target physical observable(s) to be reproduced at the CG resolution. However, there is an absence of algorithms for selecting CG representations of molecules from which complex properties, including molecular electronic structure, can be accurately modeled. We introduce continuously gated message passing (CGMP), a graph neural network (GNN) method for atomically decomposing molecular electronic structure sampled over conformational ensembles. CGMP integrates 3D-invariant GNNs and a novel gated message passing system to continuously reduce the atomic degrees of freedom accessible for electronic predictions, resulting in a one-shot importance ranking of atoms contributing to a target molecular property. Moreover, CGMP provides the first approach by which to quantify the degeneracy of "good" CG representations conditioned on specific prediction targets, facilitating the development of more transferable CG representations. We further show how CGMP can be used to highlight multiatom correlations, illuminating a path to developing CG electronic Hamiltonians in terms of interpretable collective variables for arbitrarily complex molecules.
Affiliation(s)
- J Charlie Maier
  - Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Chun-I Wang
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Nicholas E Jackson
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
9
Koukos PI, Dehghani-Ghahnaviyeh S, Velez-Vega C, Manchester J, Tieleman DP, Duca JS, Souza PCT, Cournia Z. Martini 3 Force Field Parameters for Protein Lipidation Post-Translational Modifications. J Chem Theory Comput 2023; 19:8901-8918. [PMID: 38019969 DOI: 10.1021/acs.jctc.3c00604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2023]
Abstract
Protein lipidations are vital co/post-translational modifications that tether lipid tails to specific protein amino acids, allowing them to anchor to biological membranes, switch their subcellular localization, and modulate association with other proteins. Such lipidations are thus crucial for multiple biological processes including signal transduction, protein trafficking, and membrane localization and are implicated in various diseases as well. Examples of lipid-anchored proteins include the Ras family of proteins that undergo farnesylation; actin and gelsolin that are myristoylated; phospholipase D that is palmitoylated; glycosylphosphatidylinositol-anchored proteins; and others. Here, we develop parameters for cysteine-targeting farnesylation, geranylgeranylation, and palmitoylation, as well as glycine-targeting myristoylation for the latest version of the Martini 3 coarse-grained force field. The parameters are developed using the CHARMM36m all-atom force field parameters as reference. The behavior of the coarse-grained models is consistent with that of the all-atom force field for all lipidations and reproduces key dynamical and structural features of lipid-anchored peptides, such as the solvent-accessible surface area, bilayer penetration depth, and representative conformations of the anchors. The parameters are also validated in simulations of the lipid-anchored peripheral membrane proteins Rheb and Arf1, after comparison with independent all-atom simulations. The parameters, along with mapping schemes for the popular martinize2 tool, are available for download at 10.5281/zenodo.7849262 and also as supporting information.
Affiliation(s)
- Panagiotis I Koukos
  - Biomedical Research Foundation, Academy of Athens, 4 Soranou Ephessiou, 11527 Athens, Greece
- Sepehr Dehghani-Ghahnaviyeh
  - Computer-Aided Drug Discovery, Global Discovery Chemistry, Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Camilo Velez-Vega
  - Computer-Aided Drug Discovery, Global Discovery Chemistry, Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- John Manchester
  - Computer-Aided Drug Discovery, Global Discovery Chemistry, Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- D Peter Tieleman
  - Department of Biological Sciences, University of Calgary, Calgary T2N 1N4 Alberta, Canada
  - Centre for Molecular Simulation, University of Calgary, Calgary T2N 1N4 Alberta, Canada
- José S Duca
  - Computer-Aided Drug Discovery, Global Discovery Chemistry, Novartis Institutes for BioMedical Research, 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Paulo C T Souza
  - Molecular Microbiology and Structural Biochemistry, (MMSB, UMR 5086), CNRS & University of Lyon, 69367 Lyon, France
  - Laboratory of Biology and Modeling of the Cell, École Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5239 and Inserm U1293, 46 Allée d'Italie, 69364 Lyon, France
- Zoe Cournia
  - Biomedical Research Foundation, Academy of Athens, 4 Soranou Ephessiou, 11527 Athens, Greece
10
Kim S. Backmapping with Mapping and Isomeric Information. J Phys Chem B 2023. [PMID: 38049145 DOI: 10.1021/acs.jpcb.3c05593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/06/2023]
Abstract
I present a powerful and flexible backmapping tool named Multiscale Simulation Tool (mstool) that converts a coarse-grained (CG) system into all-atom (AA) resolution and only requires AA to CG mapping and isomeric information (cis/trans/dihedral/chiral). The backmapping procedure includes two simple steps: (a) AA atoms are randomly placed near the corresponding CG beads according to the provided mapping scheme. (b) Energy minimization is performed with two modifications in the AA force field (FF). First, nonbonded interactions are replaced with cosine functions to ensure the numerical stability. Second, additional torsions are imposed to maintain the molecules' isomeric properties. To test the simplicity and robustness of the tool, I backmapped multiple membrane and protein CG structures into AA resolution, including a four-bead CG lipid model (resolution increased by a factor of 34) without using intermediate resolution. The tool is freely available at github.com/ksy141/mstool.
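Step (a) of the procedure described above can be sketched in a few lines. This is only an illustration of random placement near CG beads, not mstool's actual API: the function `place_atoms_near_beads`, the `max_offset` parameter, and the bead and atom names are all hypothetical, and step (b), the energy minimization with the modified force field, is not shown.

```python
import random
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def place_atoms_near_beads(bead_xyz: Dict[str, Vec3],
                           mapping: Dict[str, List[str]],
                           max_offset: float = 0.5,
                           seed: int = 0) -> Dict[str, Vec3]:
    """Scatter each atom uniformly in a small cube around its CG bead.

    `mapping` lists, for every bead, the atoms it represents (the AA to CG
    mapping read in reverse). The cube half-width `max_offset` is an
    assumed parameter chosen for illustration.
    """
    rng = random.Random(seed)  # seeded so the toy example is reproducible
    atoms: Dict[str, Vec3] = {}
    for bead, atom_names in mapping.items():
        bx, by, bz = bead_xyz[bead]
        for name in atom_names:
            atoms[name] = (bx + rng.uniform(-max_offset, max_offset),
                           by + rng.uniform(-max_offset, max_offset),
                           bz + rng.uniform(-max_offset, max_offset))
    return atoms


# Toy example: two beads representing five atoms in total.
beads = {"B1": (0.0, 0.0, 0.0), "B2": (4.0, 0.0, 0.0)}
aa_of_bead = {"B1": ["C1", "C2", "C3"], "B2": ["O1", "O2"]}
initial_atoms = place_atoms_near_beads(beads, aa_of_bead)
```

The resulting coordinates are deliberately crude; in the workflow the abstract describes, the subsequent minimization is what turns them into a physically reasonable structure.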
Affiliation(s)
- Siyoung Kim
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637 United States
11
Ingólfsson H, Bhatia H, Aydin F, Oppelstrup T, López CA, Stanton LG, Carpenter TS, Wong S, Di Natale F, Zhang X, Moon JY, Stanley CB, Chavez JR, Nguyen K, Dharuman G, Burns V, Shrestha R, Goswami D, Gulten G, Van QN, Ramanathan A, Van Essen B, Hengartner NW, Stephen AG, Turbyville T, Bremer PT, Gnanakaran S, Glosli JN, Lightstone FC, Nissley DV, Streitz FH. Machine Learning-Driven Multiscale Modeling: Bridging the Scales with a Next-Generation Simulation Infrastructure. J Chem Theory Comput 2023; 19:2658-2675. [PMID: 37075065 PMCID: PMC10173464 DOI: 10.1021/acs.jctc.2c01018] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/12/2022] [Indexed: 04/20/2023]
Abstract
Interdependence across time and length scales is common in biology, where atomic interactions can impact larger-scale phenomena. Such dependence is especially true for a well-known cancer signaling pathway, where the membrane-bound RAS protein binds an effector protein called RAF. To capture the driving forces that bring RAS and RAF (represented as two domains, RBD and CRD) together on the plasma membrane, simulations with the ability to calculate atomic detail while having long time- and large length-scales are needed. The Multiscale Machine-Learned Modeling Infrastructure (MuMMI) is able to resolve RAS/RAF protein-membrane interactions that identify specific lipid-protein fingerprints that enhance protein orientations viable for effector binding. MuMMI is a fully automated, ensemble-based multiscale approach connecting three resolution scales: (1) the coarsest scale is a continuum model able to simulate milliseconds of time for a 1 μm² membrane, (2) the middle scale is a coarse-grained (CG) Martini bead model to explore protein-lipid interactions, and (3) the finest scale is an all-atom (AA) model capturing specific interactions between lipids and proteins. MuMMI dynamically couples adjacent scales in a pairwise manner using machine learning (ML). The dynamic coupling allows for better sampling of the refined scale from the adjacent coarse scale (forward) and on-the-fly feedback to improve the fidelity of the coarser scale from the adjacent refined scale (backward). MuMMI operates efficiently at any scale, from a few compute nodes to the largest supercomputers in the world, and is generalizable to simulate different systems. As computing resources continue to increase and multiscale methods continue to advance, fully automated multiscale simulations (like MuMMI) will be commonly used to address complex science questions.
Affiliation(s)
- Helgi I. Ingólfsson
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Harsh Bhatia
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Fikret Aydin
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Tomas Oppelstrup
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Cesar A. López
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Liam G. Stanton
  - Department of Mathematics and Statistics, San José State University, San José, California 95192, United States
- Timothy S. Carpenter
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Sergio Wong
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Francesco Di Natale
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Xiaohua Zhang
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Joseph Y. Moon
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Christopher B. Stanley
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
- Joseph R. Chavez
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Kien Nguyen
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Gautham Dharuman
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Violetta Burns
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Rebika Shrestha
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, Maryland 21701, United States
- Debanjan Goswami
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, Maryland 21701, United States
- Gulcin Gulten
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, Maryland 21701, United States
- Que N. Van
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, Maryland 21701, United States
- Arvind Ramanathan
  - Computing, Environment & Life Sciences (CELS) Directorate, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Brian Van Essen
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Nicolas W. Hengartner
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Andrew G. Stephen
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, Maryland 21701, United States
- Thomas Turbyville
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, Maryland 21701, United States
- Peer-Timo Bremer
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- S. Gnanakaran
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- James N. Glosli
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Felice C. Lightstone
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Dwight V. Nissley
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, Maryland 21701, United States
- Frederick H. Streitz
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
12
Ricci E, Vergadou N. Integrating Machine Learning in the Coarse-Grained Molecular Simulation of Polymers. J Phys Chem B 2023; 127:2302-2322. [PMID: 36888553 DOI: 10.1021/acs.jpcb.2c06354] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/09/2023]
Abstract
Machine learning (ML) is having an increasing impact on the physical sciences, engineering, and technology, and its integration into molecular simulation frameworks holds great potential to expand their scope of applicability to complex materials and facilitate fundamental knowledge and reliable property predictions, contributing to the development of efficient materials design routes. The application of ML in materials informatics in general, and polymer informatics in particular, has led to interesting results; however, great untapped potential lies in the integration of ML techniques into the multiscale molecular simulation methods for the study of macromolecular systems, specifically in the context of Coarse Grained (CG) simulations. In this Perspective, we aim at presenting the pioneering recent research efforts in this direction and discussing how these new ML-based techniques can contribute to critical aspects of the development of multiscale molecular simulation methods for bulk complex chemical systems, especially polymers. Prerequisites for the implementation of such ML-integrated methods and open challenges that need to be met toward the development of general systematic ML-based coarse graining schemes for polymers are discussed.
Affiliation(s)
- Eleonora Ricci
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
| | - Niki Vergadou
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
| |
|
13
|
Stevens JA, Grünewald F, van Tilburg PAM, König M, Gilbert BR, Brier TA, Thornburg ZR, Luthey-Schulten Z, Marrink SJ. Molecular dynamics simulation of an entire cell. Front Chem 2023; 11:1106495. [PMID: 36742032 PMCID: PMC9889929 DOI: 10.3389/fchem.2023.1106495] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 11/23/2022] [Accepted: 01/09/2023] [Indexed: 01/19/2023] Open
Abstract
The ultimate microscope, directed at a cell, would reveal the dynamics of all the cell's components with atomic resolution. In contrast to their real-world counterparts, computational microscopes are currently on the brink of meeting this challenge. In this perspective, we show how an integrative approach can be employed to model an entire cell, the minimal cell, JCVI-syn3A, at full complexity. This step opens the way to interrogate the cell's spatio-temporal evolution with molecular dynamics simulations, an approach that can be extended to other cell types in the near future.
Affiliation(s)
- Jan A. Stevens
- Molecular Dynamics Group, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, Netherlands
| | - Fabian Grünewald
- Molecular Dynamics Group, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, Netherlands
| | - P. A. Marco van Tilburg
- Molecular Dynamics Group, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, Netherlands
| | - Melanie König
- Molecular Dynamics Group, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, Netherlands
| | - Benjamin R. Gilbert
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Champaign, IL, United States
| | - Troy A. Brier
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Champaign, IL, United States
| | - Zane R. Thornburg
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Champaign, IL, United States
| | - Zaida Luthey-Schulten
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Champaign, IL, United States
| | - Siewert J. Marrink
- Molecular Dynamics Group, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, Netherlands
| |
|
14
|
Christofi E, Chazirakis A, Chrysostomou C, Nicolaou MA, Li W, Doxastakis M, Harmandaris VA. Deep convolutional neural networks for generating atomistic configurations of multi-component macromolecules from coarse-grained models. J Chem Phys 2022; 157:184903. [DOI: 10.1063/5.0110322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Despite the modern advances in the available computational resources, the length and time scales of the physical systems that can be studied in full atomic detail, via molecular simulations, are still limited. To overcome such limitations, coarse-grained (CG) models have been developed to reduce the dimensionality of the physical system under study. However, to study such systems at the atomic level, it is necessary to re-introduce the atomistic details into the CG description. Such an ill-posed mathematical problem is typically treated via numerical algorithms, which need to balance accuracy, efficiency, and general applicability. Here, we introduce an efficient and versatile method for backmapping multi-component CG macromolecules of arbitrary microstructures. By utilizing deep learning algorithms, we train a convolutional neural network to learn structural correlations between polymer configurations at the atomistic and their corresponding CG descriptions, obtained from atomistic simulations. The trained model is then utilized to get predictions of atomistic structures from input CG configurations. As an illustrative example, we apply the convolutional neural network to polybutadiene copolymers of various microstructures, in which each monomer microstructure (i.e., cis-1,4, trans-1,4, and vinyl-1,2) is represented as a different CG particle type. The proposed methodology is transferable over molecular weight and various microstructures. Moreover, starting from a specific single CG configuration with a given microstructure, we show that by modifying its chemistry (i.e., CG particle types), we are able to obtain a set of well equilibrated polymer configurations of different microstructures (chemistry) than the one of the original CG configuration.
Affiliation(s)
- Eleftherios Christofi
- Computation-based Science and Technology Research Center, The Cyprus Institute, Nicosia 2121, Cyprus
| | - Antonis Chazirakis
- Department of Mathematics and Applied Mathematics, University of Crete, Heraklion GR-71110, Greece
- Institute of Applied and Computational Mathematics, Foundation for Research and Technology–Hellas, GR-71110 Heraklion, Crete, Greece
| | - Charalambos Chrysostomou
- Computation-based Science and Technology Research Center, The Cyprus Institute, Nicosia 2121, Cyprus
| | - Mihalis A. Nicolaou
- Computation-based Science and Technology Research Center, The Cyprus Institute, Nicosia 2121, Cyprus
| | - Wei Li
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo 001-0021, Japan
| | - Manolis Doxastakis
- Department of Chemical and Biomolecular Engineering, University of Tennessee, Knoxville, Tennessee 37996, USA
| | - Vagelis A. Harmandaris
- Computation-based Science and Technology Research Center, The Cyprus Institute, Nicosia 2121, Cyprus
- Department of Mathematics and Applied Mathematics, University of Crete, Heraklion GR-71110, Greece
- Institute of Applied and Computational Mathematics, Foundation for Research and Technology–Hellas, GR-71110 Heraklion, Crete, Greece
| |
|
15
|
Jin J, Pak AJ, Durumeric AEP, Loose TD, Voth GA. Bottom-up Coarse-Graining: Principles and Perspectives. J Chem Theory Comput 2022; 18:5759-5791. [PMID: 36070494 PMCID: PMC9558379 DOI: 10.1021/acs.jctc.2c00643] [Citation(s) in RCA: 84] [Impact Index Per Article: 42.0] [Received: 06/20/2022] [Indexed: 01/14/2023]
Abstract
Large-scale computational molecular models provide scientists a means to investigate the effect of microscopic details on emergent mesoscopic behavior. Elucidating the relationship between variations on the molecular scale and macroscopic observable properties facilitates an understanding of the molecular interactions driving the properties of real world materials and complex systems (e.g., those found in biology, chemistry, and materials science). As a result, discovering an explicit, systematic connection between microscopic nature and emergent mesoscopic behavior is a fundamental goal for this type of investigation. The molecular forces critical to driving the behavior of complex heterogeneous systems are often unclear. More problematically, simulations of representative model systems are often prohibitively expensive from both spatial and temporal perspectives, impeding straightforward investigations over possible hypotheses characterizing molecular behavior. While the reduction in resolution of a study, such as moving from an atomistic simulation to that of the resolution of large coarse-grained (CG) groups of atoms, can partially ameliorate the cost of individual simulations, the relationship between the proposed microscopic details and this intermediate resolution is nontrivial and presents new obstacles to study. Small portions of these complex systems can be realistically simulated. Alone, these smaller simulations likely do not provide insight into collectively emergent behavior. However, by proposing that the driving forces in both smaller and larger systems (containing many related copies of the smaller system) have an explicit connection, systematic bottom-up CG techniques can be used to transfer CG hypotheses discovered using a smaller scale system to a larger system of primary interest. 
The proposed connection between different CG systems is prescribed by (i) the CG representation (mapping) and (ii) the functional form and parameters used to represent the CG energetics, which approximate potentials of mean force (PMFs). As a result, the design of CG methods that facilitate a variety of physically relevant representations, approximations, and force fields is critical to moving the frontier of systematic CG forward. Crucially, the proposed connection between the system used for parametrization and the system of interest is orthogonal to the optimization used to approximate the potential of mean force present in all systematic CG methods. The empirical efficacy of machine learning techniques on a variety of tasks provides strong motivation to consider these approaches for approximating the PMF and analyzing these approximations.
Affiliation(s)
- Jaehyeok Jin
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
| | - Alexander J. Pak
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
| | - Aleksander E. P. Durumeric
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
| | - Timothy D. Loose
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
| | - Gregory A. Voth
- Department of Chemistry, Chicago Center for Theoretical Chemistry, Institute for Biophysical Dynamics, and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, United States
| |
|
16
|
López CA, Zhang X, Aydin F, Shrestha R, Van QN, Stanley CB, Carpenter TS, Nguyen K, Patel LA, Chen D, Burns V, Hengartner NW, Reddy TJE, Bhatia H, Di Natale F, Tran TH, Chan AH, Simanshu DK, Nissley DV, Streitz FH, Stephen AG, Turbyville TJ, Lightstone FC, Gnanakaran S, Ingólfsson HI, Neale C. Asynchronous Reciprocal Coupling of Martini 2.2 Coarse-Grained and CHARMM36 All-Atom Simulations in an Automated Multiscale Framework. J Chem Theory Comput 2022; 18:5025-5045. [PMID: 35866871 DOI: 10.1021/acs.jctc.2c00168] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 11/28/2022]
Abstract
The appeal of multiscale modeling approaches is predicated on the promise of combinatorial synergy. However, this promise can only be realized when distinct scales are combined with reciprocal consistency. Here, we consider multiscale molecular dynamics (MD) simulations that combine the accuracy and macromolecular flexibility accessible to fixed-charge all-atom (AA) representations with the sampling speed accessible to reductive, coarse-grained (CG) representations. AA-to-CG conversions are relatively straightforward because deterministic routines with unique outcomes are achievable. Conversely, CG-to-AA conversions have many solutions due to a surge in the number of degrees of freedom. While automated tools for biomolecular CG-to-AA transformation exist, we find that one popular option, called Backward, is prone to stochastic failure and the AA models that it does generate frequently have compromised protein structure and incorrect stereochemistry. Although these shortcomings can likely be circumvented by human intervention in isolated instances, automated multiscale coupling requires reliable and robust scale conversion. Here, we detail an extension to Multiscale Machine-learned Modeling Infrastructure (MuMMI), including an improved CG-to-AA conversion tool called sinceCG. This tool is reliable (∼98% weakly correlated repeat success rate), automatable (no unrecoverable hangs), and yields AA models that generally preserve protein secondary structure and maintain correct stereochemistry. We describe how the MuMMI framework identifies CG system configurations of interest, converts them to AA representations, and simulates them at the AA scale while on-the-fly analyses provide feedback to update CG parameters. Application to systems containing the peripheral membrane protein RAS and proximal components of RAF kinase on complex eight-component lipid bilayers with ∼1.5 million atoms is discussed in the context of MuMMI.
Affiliation(s)
- Cesar A López
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Xiaohua Zhang
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Fikret Aydin
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Rebika Shrestha
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Que N Van
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Christopher B Stanley
- Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
| | - Timothy S Carpenter
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Kien Nguyen
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Lara A Patel
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - De Chen
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Violetta Burns
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Nicolas W Hengartner
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Tyler J E Reddy
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Harsh Bhatia
- Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Francesco Di Natale
- Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Timothy H Tran
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Albert H Chan
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Dhirendra K Simanshu
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Dwight V Nissley
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Frederick H Streitz
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Andrew G Stephen
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Thomas J Turbyville
- NCI RAS Initiative, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
| | - Felice C Lightstone
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Sandrasegaram Gnanakaran
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Helgi I Ingólfsson
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Chris Neale
- Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| |
|
17
|
Marrink SJ, Monticelli L, Melo MN, Alessandri R, Tieleman DP, Souza PCT. Two decades of Martini: Better beads, broader scope. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1620] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Affiliation(s)
- Siewert J. Marrink
- Groningen Biomolecular Sciences and Biotechnology Institute & Zernike Institute for Advanced Materials, University of Groningen, Groningen, The Netherlands
| | - Luca Monticelli
- Molecular Microbiology and Structural Biochemistry (MMSB ‐ UMR 5086), CNRS & University of Lyon, Lyon, France
| | - Manuel N. Melo
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
| | - Riccardo Alessandri
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois, USA
| | - D. Peter Tieleman
- Centre for Molecular Simulation and Department of Biological Sciences, University of Calgary, Alberta, Canada
| | - Paulo C. T. Souza
- Molecular Microbiology and Structural Biochemistry (MMSB ‐ UMR 5086), CNRS & University of Lyon, Lyon, France
| |
|